Reputation: 73
I am new to Python and PySpark. How can I write the following Spark DataFrame read (Scala) in PySpark?
val df = spark.read.format("jdbc").options(
  Map(
    "url" -> "jdbc:someDB",
    "user" -> "root",
    "password" -> "password",
    "dbtable" -> "tableName",
    "driver" -> "someDriver")).load()
I tried writing it as below in PySpark, but I get a syntax error:
df = spark.read.format("jdbc").options(
map(lambda : ("url","jdbc:someDB"), ("user","root"), ("password","password"), ("dbtable","tableName"), ("driver","someDriver"))).load()
Thanks in Advance
Upvotes: 3
Views: 9558
Reputation: 2407
In PySpark, pass the options as keyword arguments:
df = spark.read\
    .format("jdbc")\
    .options(
        url="jdbc:someDB",
        user="root",
        password="password",
        dbtable="tableName",
        driver="someDriver",
    )\
    .load()
Sometimes it's handy to keep them in a dict and unpack them later using the splat operator (**):
options = {
    "url": "jdbc:someDB",
    "user": "root",
    "password": "password",
    "dbtable": "tableName",
    "driver": "someDriver",
}
df = spark.read\
    .format("jdbc")\
    .options(**options)\
    .load()
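To see why the two forms above are interchangeable, here is a minimal sketch that runs without Spark. The function show_options is a hypothetical stand-in that only returns the keyword arguments it receives; it is not part of PySpark. The Kafka option name is used purely as an example of a key that is not a valid Python identifier.

```python
def show_options(**kwargs):
    """Hypothetical stand-in for a method that accepts keyword options.

    It simply returns the keyword arguments it was called with.
    """
    return kwargs


options = {
    "url": "jdbc:someDB",
    "user": "root",
    "password": "password",
    "dbtable": "tableName",
    "driver": "someDriver",
}

# Spelling the keywords out and unpacking the dict give the same call.
explicit = show_options(
    url="jdbc:someDB",
    user="root",
    password="password",
    dbtable="tableName",
    driver="someDriver",
)
unpacked = show_options(**options)
print(explicit == unpacked)

# A key containing dots cannot be written as a literal keyword argument,
# but it can still be passed through dict unpacking.
dotted = show_options(**{"kafka.bootstrap.servers": "host:9092"})
print(dotted)
```

One practical consequence: option names that contain dots can be passed through a dict (or through option()), but not as literal keyword arguments.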
Regarding the code snippets from your question: you happened to mix up two different concepts of "map":
Map in Scala is a data structure, also known as an "associative array" or "dictionary", equivalent to Python's dict.
map in Python is a higher-order function that applies a function to each item of an iterable, e.g.:
In [1]: def square(x: int) -> int:
   ...:     return x**2
   ...:
In [2]: list(map(square, [1, 2, 3, 4, 5]))
Out[2]: [1, 4, 9, 16, 25]
In [3]: # or just use a lambda
In [4]: list(map(lambda x: x**2, [1, 2, 3, 4, 5]))
Out[4]: [1, 4, 9, 16, 25]
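If you already have the settings as key/value tuples, as in the attempt from the question, the closest Python counterpart to Scala's Map(...) is the dict constructor rather than map. A small sketch of the difference, using only built-ins (the variable names are illustrative):

```python
# Key/value pairs, similar to the tuples in the question.
pairs = [
    ("url", "jdbc:someDB"),
    ("user", "root"),
    ("password", "password"),
    ("dbtable", "tableName"),
    ("driver", "someDriver"),
]

# dict() builds a mapping from the pairs: the analogue of Scala's Map(...).
options = dict(pairs)
print(options["dbtable"])

# map() instead applies a function to every element and yields the results;
# here it extracts and upper-cases each key.
upper_keys = list(map(lambda kv: kv[0].upper(), pairs))
print(upper_keys)
```

The resulting options dict can then be unpacked with .options(**options).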
Upvotes: 20
Reputation: 11
To load a CSV file with multiple parameters, pass the arguments to load():
df = spark.read.load("examples/src/main/resources/people.csv",
                     format="csv", sep=":", inferSchema="true", header="true")
Here's the documentation for that.
Upvotes: 0
Reputation: 76
Try using option() instead:
df = spark.read \
    .format("jdbc") \
    .option("url","jdbc:someDB") \
    .option("user","root") \
    .option("password","password") \
    .option("dbtable","tableName") \
    .option("driver","someDriver") \
    .load()
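If the settings live in a dict, the same chain of option() calls can be produced with a loop. The sketch below is one possible way to do it: apply_options is a hypothetical helper, and FakeReader is a made-up stand-in that exists only so the example runs without a Spark session. It relies on option() returning the reader, as in the chained calls above.

```python
class FakeReader:
    """Made-up stand-in for a reader object; records each option() call."""

    def __init__(self):
        self.recorded = {}

    def option(self, key, value):
        self.recorded[key] = value
        return self  # returning the reader is what allows chaining


def apply_options(reader, options):
    """Hypothetical helper: call reader.option(key, value) for every item."""
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader


options = {
    "url": "jdbc:someDB",
    "user": "root",
    "password": "password",
    "dbtable": "tableName",
    "driver": "someDriver",
}

reader = apply_options(FakeReader(), options)
print(reader.recorded == options)

# With a real SparkSession named `spark` (assumed, not created here):
# df = apply_options(spark.read.format("jdbc"), options).load()
```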
Upvotes: 4