kr_ash

Reputation: 21

Read a part of a MySQL table in spark using JDBC connector

I am trying to read a table from a MySQL database using a JDBC connector in PySpark. My script to read the table is:

query = "SELECT * FROM C WHERE hitId = 4235441"

readConfig = {
  "driver": driver,
  "url": url,
  "dbtable": tableName,
  "user": user,
  "password": password,
  "query_custom": query
}

saveLocation = mountPoint + "/" + tableName
print(saveLocation)

readDF = spark.read.format("jdbc").options(**readConfig).schema(tableSchema).load()
readDF.write.format("delta").option("mergeSchema", "true").mode("overwrite").save(saveLocation)

I am trying to read only the rows which have a hitId of 4235441.

The issue is that the whole table is still being read instead of only the rows matching the custom query. Can anyone point out what is wrong in my script, or suggest another method to achieve this?

I have been stuck for quite some time, so any help is highly appreciated.

Upvotes: 0

Views: 618

Answers (1)

User12345

Reputation: 5480

In readConfig you are passing the table name to the dbtable option, so Spark reads the entire table. Instead, pass the query itself as dbtable. Note that Spark sends dbtable into the FROM clause of the SQL it generates, so the subquery must be wrapped in parentheses and given an alias:

query = "(SELECT * FROM C WHERE hitId = 4235441) AS filtered"

readConfig = {
  "driver": driver,
  "url": url,
  "dbtable": query,
  "user": user,
  "password": password
}
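Alternatively, if you are on Spark 2.4 or later, the JDBC source supports a query option directly, so you do not need the subquery wrapping. A minimal sketch, assuming the same driver, url, user, and password variables from your script:

readConfig = {
    "driver": driver,
    "url": url,
    "user": user,
    "password": password,
    # "query" and "dbtable" are mutually exclusive; use one or the other
    "query": "SELECT * FROM C WHERE hitId = 4235441"
}

readDF = spark.read.format("jdbc").options(**readConfig).load()

Either way, the WHERE clause is executed on the MySQL side, so only the matching rows are transferred to Spark.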

Upvotes: 2
