What’s new

Compared with the previous BD|CESGA platform, these are the main improvements:

  • Hadoop has been upgraded to Hadoop 3.
  • Spark 2.4 is now the default version.
  • HUE 4 is now available.
  • HDFS erasure coding: reduces storage overhead compared with the default 3x replication.
  • Impala is now available as an alternative to Hive for interactive SQL queries.
  • The HOME filesystem has been migrated from GlusterFS to the new NetApp storage system, which has greatly reduced its latency.
  • Improved reliability:
    • The HDFS NameNode is now in HA configuration.
    • The YARN ResourceManager is now in HA configuration.
  • Improved security:
    • SSL/TLS is now enabled for more secure communications.
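
To give a sense of the storage savings mentioned for HDFS erasure coding, the following minimal sketch compares the raw space needed under 3x replication with that needed under a Reed-Solomon layout of 6 data units and 3 parity units. The RS(6,3) layout is an assumption used for illustration; the policy actually enabled on the platform is not stated here, and the function names are hypothetical.

```python
def storage_factor_replication(replicas: int = 3) -> float:
    """Raw bytes stored per byte of user data under n-way replication."""
    return float(replicas)


def storage_factor_erasure_coding(data_units: int, parity_units: int) -> float:
    """Raw bytes stored per byte of user data under a (data, parity) layout."""
    return (data_units + parity_units) / data_units


def raw_size_tb(user_data_tb: float, factor: float) -> float:
    """Raw capacity consumed for a given amount of user data."""
    return user_data_tb * factor


if __name__ == "__main__":
    user_data_tb = 100.0  # illustrative dataset size, not a platform figure

    replication = storage_factor_replication(3)
    # Assumption: RS(6,3); the policy used on the platform may differ.
    erasure = storage_factor_erasure_coding(6, 3)

    print(f"3x replication: {raw_size_tb(user_data_tb, replication):.0f} TB raw")
    print(f"RS(6,3) coding: {raw_size_tb(user_data_tb, erasure):.0f} TB raw")
```

Under these assumptions, 100 TB of user data occupies 300 TB with 3x replication and 150 TB with RS(6,3), that is, a storage overhead of 50% instead of 200%.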