Reputation: 45
I want to run code that ingests data via a JDBC driver and saves it to a file path. The data is ingested successfully, but the saving part didn't work. I know data can be saved with code like this:
a.write.mode("overwrite").parquet("test/partition_test.parquet")
Is there any way to pass the file path as a parameter? I've tried setting it as below, but it didn't work.
My code:
def ingest(spark, db_url, tablename, username, password, destination,
           driver, save_format="parquet"):
    a = spark.read.format("jdbc") \
        .option("url", db_url) \
        .option("dbtable", tablename) \
        .option("user", username) \
        .option("password", password) \
        .option("path", destination) \
        .option("driver", driver) \
        .load()
    return a

ingest(spark, "jdbc:mysql://192.168.122.1:3306/users", "users", "root",
       "123456@h21", "/path", "com.mysql.jdbc.Driver", save_format="parquet")
Upvotes: 0
Views: 620
Reputation: 87069
You're mixing two things together in your code. What you need is done in two steps: first read the data over JDBC into a DataFrame, then write that DataFrame to the destination path.
So the code needs to be something like this:
def ingest(spark, db_url, tablename, username, password, destination,
           driver, save_format="parquet"):
    a = spark.read.format("jdbc") \
        .option("url", db_url) \
        .option("dbtable", tablename) \
        .option("user", username) \
        .option("password", password) \
        .option("driver", driver) \
        .load()
    a.write.format(save_format).save(destination)
    return a
This function returns the DataFrame, but if you only need to read and write the data, you can return None instead.
Upvotes: 1