The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.

Launching Jupyter

  • Start the VPN
  • Connect to hadoop.cesga.es
  • Go to your working directory
  • Launch the Jupyter server
  • Type a password to protect the notebook
  • Point your browser to the address that the server prints after this message:
    The Jupyter Notebook is running at:
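
The directory and launch steps above can be sketched as a small shell helper, to be run on the login node once the VPN and SSH connection are up. This is only an illustration: the text does not say which command starts the server on this cluster, so plain `jupyter notebook` is an assumption, and the function name `launch_jupyter` and the `DRY_RUN` variable are hypothetical. The password prompt is site-specific and is not reproduced here.

```shell
# Sketch only: `jupyter notebook` as the launch command is an assumption;
# the cluster may provide its own wrapper script instead.

# launch_jupyter WORKDIR [PORT]   (hypothetical helper name)
launch_jupyter() {
    workdir=$1
    port=${2:-8888}   # 8888 is Jupyter's default port

    # Step: go to your working directory; stop if it does not exist
    cd "$workdir" || return 1

    # Step: launch the server without opening a browser on the remote host.
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "jupyter notebook --no-browser --port=$port"
    else
        jupyter notebook --no-browser --port="$port"
    fi
}
```

Calling `DRY_RUN=1 launch_jupyter "$HOME" 8888` prints the command that would be run, which is a cheap way to check the arguments before starting the real server.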


Jupyter Terminal

Jupyter Conda

Jupyter Setup for sparkR

Running sparkR from Jupyter requires some additional setup.

X11 support

Start an SSH session with X11 support:
ssh -X login.hdp.cesga.es
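
Once logged in, it can help to confirm that X11 forwarding is actually active before starting the notebook. When forwarding is enabled, ssh sets the `DISPLAY` environment variable in the remote session, so a minimal check (the function name `check_x11` is hypothetical) could look like this:

```shell
# check_x11: succeed if X11 forwarding appears to be active in this session.
# It only inspects DISPLAY; it does not try to open a window.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 forwarding active: DISPLAY=$DISPLAY"
    else
        echo "DISPLAY is not set; reconnect with ssh -X" >&2
        return 1
    fi
}
```

Run `check_x11` right after the SSH login; a non-zero exit status means the session was opened without X11 support.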

Initialize Spark Context

Inside the notebook initialize the Spark context:

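        # Make the SparkR package bundled with Spark visible to R
        .libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'), .libPaths()))
        library(SparkR)

        # Create the Spark context on YARN (client mode) and the SQL context
        sc <- sparkR.init(master="yarn-client")
        sqlContext <- sparkRSQL.init(sc)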