Reputation: 343
Can anyone tell me which way of inserting into Oracle performs better: write.format('jdbc') or cx_Oracle?
In my project I came across a case where write.format('jdbc') is used for INSERT and cx_Oracle for UPDATE, so I'm thinking of switching to doing both INSERT and UPDATE on the same cx_Oracle connection. What do you think?
Upvotes: 1
Views: 634
Reputation: 2345
I have worked on a similar use case. Here are some takeaways from my last project.
cx_Oracle is very slow compared to write.format('jdbc'). I was inserting 1M records and there was a drastic difference between the two approaches. Even cx_Oracle with executemany didn't help much. I strongly recommend using Spark JDBC.
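For reference, a minimal sketch of what the Spark JDBC insert can look like. The host, service name, credentials, table name and the helper names oracle_jdbc_options and append_to_oracle are placeholders I made up for illustration; it also assumes the Oracle JDBC driver jar is on the Spark classpath. batchsize is a standard option of Spark's JDBC writer and is usually worth tuning for bulk inserts.

```python
def oracle_jdbc_options(host, port, service, user, password, table, batchsize=10000):
    """Build the option map for Spark's JDBC data source.

    All argument values are placeholders supplied by the caller.
    """
    return {
        "url": f"jdbc:oracle:thin:@//{host}:{port}/{service}",
        "driver": "oracle.jdbc.OracleDriver",
        "dbtable": table,
        "user": user,
        "password": password,
        # Rows sent per JDBC batch; Spark's default is 1000.
        "batchsize": str(batchsize),
    }


def append_to_oracle(df, options):
    """Append the rows of a Spark DataFrame to the target table over JDBC."""
    df.write.format("jdbc").options(**options).mode("append").save()
```

Usage would be something like append_to_oracle(df, oracle_jdbc_options("dbhost", 1521, "ORCLPDB1", "scott", "secret", "SALES.ORDERS")), with your own connection details.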
Even for updates, I ended up doing a delete (SQL query) followed by an insert (using PySpark), because I couldn't achieve an update in Spark and the alternative was very slow.
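Roughly, the delete-then-insert pattern can be sketched like this. It is not the exact code from my project: build_delete and delete_then_insert are illustrative names, connection is assumed to be an already opened cx_Oracle connection, and jdbc_options is assumed to be a dict holding url, user, password and driver for the Spark writer. Note that the delete is committed before the insert starts, so the two steps are not one transaction.

```python
def build_delete(table, key_column):
    """Return a DELETE statement with one positional bind variable.

    The table and column names are interpolated into the SQL text,
    so they must come from trusted configuration, not user input.
    """
    return f"DELETE FROM {table} WHERE {key_column} = :1"


def delete_then_insert(df, keys, table, key_column, connection, jdbc_options):
    """Emulate an UPDATE: delete the affected keys, then append the new rows.

    `connection` is an open cx_Oracle connection (assumed);
    `df` is the Spark DataFrame holding the replacement rows.
    """
    rows = [(key,) for key in keys]
    if rows:
        cursor = connection.cursor()
        try:
            # One round trip per batch of keys instead of one per row.
            cursor.executemany(build_delete(table, key_column), rows)
            connection.commit()
        finally:
            cursor.close()
    (
        df.write.format("jdbc")
        .options(**jdbc_options)
        .option("dbtable", table)
        .mode("append")
        .save()
    )
```

If the set of keys is large, it may be cheaper to delete with a single SQL statement (for example by date range or batch id) instead of key by key.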
Spark also writes to the database in parallel, one JDBC connection per DataFrame partition.
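If you need to control how many connections hit Oracle at once, the numPartitions option of the JDBC source caps it; when the DataFrame has more partitions than that, Spark coalesces down to the limit before writing. A small sketch, where with_parallelism and write_in_parallel are made-up helper names and options is assumed to be the usual dict of url, dbtable, user and password:

```python
def with_parallelism(options, max_connections):
    """Return a copy of the JDBC options with a cap on concurrent connections."""
    capped = dict(options)
    capped["numPartitions"] = str(max_connections)
    return capped


def write_in_parallel(df, options, max_connections=8):
    """Append `df` over JDBC using at most `max_connections` connections."""
    (
        df.write.format("jdbc")
        .options(**with_parallelism(options, max_connections))
        .mode("append")
        .save()
    )
```

The value 8 is only an example; a sensible number depends on what the database can handle.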
Even for reads, use the Spark JDBC reader, because Spark optimizes the job and pushes projections and filters down to the database.
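A minimal sketch of such a read, assuming spark is an existing SparkSession and options is a dict with url, dbtable, user and password (read_projected is just an illustrative name). With the default settings Spark prunes the columns and pushes simple filters into the query it sends to Oracle, so only the needed data travels over the wire; more complex expressions may still be evaluated on the Spark side.

```python
def read_projected(spark, options, columns, condition):
    """Read a table over JDBC, keeping only some columns and rows.

    `columns` is a list of column names; `condition` is a SQL filter
    string such as "AMOUNT > 100".
    """
    return (
        spark.read.format("jdbc")
        .options(**options)
        .load()
        .select(*columns)
        .filter(condition)
    )
```

You can check what was pushed down by calling explain() on the returned DataFrame and looking at the pushed filters in the scan node.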
Upvotes: 1