Manoj Kumar

Reputation: 757

Reading Properties File in Pyspark

I want to read .ini files (my configuration/properties files) in my Spark 1.6.0 application. I'm using ConfigParser to read them.

import ConfigParser
import os
config = ConfigParser.ConfigParser()
config.read(os.path.join(os.path.dirname(__file__), 'config.ini'))

print 'config sections : ', config.sections()

It returns an empty list. I tried submitting the job in both client and cluster mode, and it fails either way. Please let me know if I'm making a mistake in how I read the file.

Upvotes: 1

Views: 5127

Answers (1)

Marcin

Reputation: 693

It is possible to read config files. You just need to either package your code into an egg or pass the config file to spark-submit, like this:

spark-submit --master yarn --deploy-mode cluster --py-files conf/config.ini my_pyspark_script.py

Or, if running from an egg file (which contains your Python modules and config.ini):

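spark-submit --master yarn --deploy-mode cluster --py-files my.egg my_pyspark_script.py

import ConfigParser
from pkg_resources import Requirement, resource_filename

# Assumes the egg's distribution name is "myapp" and it ships conf/config.ini
configFile = resource_filename(Requirement.parse("myapp"), "conf/config.ini")
config = ConfigParser.ConfigParser()
config.read(configFile)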
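
As a side note on the empty list in the question: ConfigParser's read() silently skips files it cannot open and returns the list of files it actually parsed. An empty sections() therefore usually means the path did not resolve on the machine where the driver runs. Below is a minimal sketch that fails loudly instead. The helper name load_config and its candidates argument are hypothetical, not part of the original code, and the import fallback is only there so the sketch runs on both Python 2 and Python 3.

```python
try:
    import configparser                   # Python 3 module name
except ImportError:
    import ConfigParser as configparser   # Python 2 module name, as in the question


def load_config(candidates):
    """Parse whichever candidate paths exist; raise if none could be read.

    `load_config` and `candidates` are hypothetical names used for illustration.
    """
    config = configparser.ConfigParser()
    # read() ignores missing or unreadable files and returns the parsed ones.
    parsed = config.read(candidates)
    if not parsed:
        raise IOError("no config file could be read from: %r" % (candidates,))
    return config, parsed
```

Calling it with a list such as the script directory path from the question plus a bare 'config.ini' (assuming the file ends up in the job's working directory) makes a wrong path show up as an error rather than an empty section list.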

Upvotes: 1
