alex

Reputation: 2193

Spark read database rows in chunks?

I am querying a quite large database table using the spark.read.jdbc method and am getting the following error:

com.mysql.cj.jdbc.exceptions.PacketTooBigException: Packet for query is too large (15,913,800 > 4,194,304)

which indicates that a single result packet exceeds the server's maximum allowed packet size (4,194,304 bytes here).
I don't have the option to alter the database settings, and I need to retrieve all of the data, so I would like to read it in chunks and end up with a DataFrame. How can I achieve this?

For example, in Python I can query a database with pandas and read the result in chunks (docs).
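Something like this pandas sketch is what I mean (the connection string, table name, and the process function are placeholders):

    # Minimal sketch of chunked reading with pandas; connection details,
    # table name, and process() are hypothetical.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("mysql+pymysql://user:password@dbhost:3306/mydb")

    # chunksize makes read_sql return an iterator of DataFrames,
    # each holding at most 10,000 rows.
    for chunk in pd.read_sql("SELECT * FROM my_table", engine, chunksize=10_000):
        process(chunk)  # handle each chunk instead of the full result at once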

Upvotes: 0

Views: 3137

Answers (1)

Alex Ott

Reputation: 87154

If you look at the documentation, you will find the fetchsize option, which you can pass to spark.read.jdbc...
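For example (a minimal sketch; the URL, table name, and credentials below are placeholders for your own connection details):

    # Sketch of passing fetchsize via the JDBC data source options;
    # host, database, table, and credentials are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-chunked-read").getOrCreate()

    df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:mysql://dbhost:3306/mydb")
        .option("dbtable", "my_table")
        .option("user", "user")
        .option("password", "password")
        .option("fetchsize", "1000")  # rows fetched per round trip, instead of the whole result at once
        .load()
    )

This controls how many rows the JDBC driver fetches per round trip, so the result arrives in smaller batches rather than one oversized packet.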

Upvotes: 1
