Reputation: 804
I have a head node for a Hadoop cluster. PySpark is installed on the HDFS cluster, i.e. I am able to use the pyspark shell as the hdfs user, but it is not installed for the headnode user, so I am not able to access files from HDFS and bring them into PySpark. How can I use the PySpark that is installed for hdfs from a Jupyter notebook? I installed PySpark for the headnode user, but I am still not able to access HDFS files. I am assuming that Jupyter is not picking up the Spark installation that hdfs uses. How do I enable it so that I can access HDFS files inside Jupyter?
Currently, when I try to access HDFS files inside Jupyter, it says 'Spark is not installed'. Roughly what I am attempting is sketched below.
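To make it concrete, this is the kind of notebook cell that fails for me (the app name and HDFS path are placeholders):

```python
# The kind of cell I run in the notebook (HDFS path is a placeholder)
from pyspark.sql import SparkSession  # this is where things go wrong

spark = (
    SparkSession.builder
    .appName("read-from-hdfs")
    .getOrCreate()
)

# Read a text file directly from HDFS into a DataFrame
df = spark.read.text("hdfs:///user/headnode/sample.txt")
df.show(5)
```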
I know this is broad; if I have under-emphasised or over-emphasised any point, let me know in the comments.
Upvotes: 0
Views: 95
Reputation: 77
Is the headnode a different Linux account, or a different Linux host?
If it is just a different account, compare the environment variables on both accounts: log in as hdfs and run "env | sort", then do the same on the headnode.
Check mainly for differences in PATH and in the Spark-related variables (in particular SPARK_HOME).
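If the only difference turns out to be SPARK_HOME/PATH, you can usually point the notebook at the cluster's existing Spark install instead of a separately pip-installed one. A minimal sketch, assuming the findspark package is installed and that the path below is whatever SPARK_HOME shows under the hdfs account (the path here is a placeholder):

```python
# Point this Python process at the cluster's existing Spark install.
# The path is an assumption: substitute the SPARK_HOME value you see
# in `env | sort` under the hdfs account.
import findspark
findspark.init("/usr/hdp/current/spark2-client")  # hypothetical SPARK_HOME

from pyspark.sql import SparkSession

# Run on YARN so the session uses the cluster's HDFS as its filesystem.
spark = (
    SparkSession.builder
    .master("yarn")
    .appName("jupyter-hdfs-check")
    .getOrCreate()
)

# Quick sanity check that HDFS is reachable (path is a placeholder).
spark.read.text("hdfs:///tmp/test.txt").show(5)
```

If findspark is not an option, the equivalent by hand is to export SPARK_HOME and add $SPARK_HOME/python plus the py4j zip under $SPARK_HOME/python/lib to PYTHONPATH before starting Jupyter.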
Upvotes: 0