Reputation: 9
I am loading some data from an Oracle DB and trying to run a count action on it, but I get no result and eventually a session timeout. If I perform other actions such as first() or show(), I do get results.
I tried launching the Scala shell with different options, as below:
spark2-shell --jars /oracle/12c/product/12.1.0/client/jdbc/lib/ojdbc6.jar --executor-memory 10G --num-executors 10 --driver-memory 2G --master yarn --deploy-mode client
spark2-shell --jars /oracle/12c/product/12.1.0/client/jdbc/lib/ojdbc6.jar
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//MY_IP:MY_PORT/MY_SERVICE_NAME")
  .option("dbtable", "MY_TABLE")
  .option("query", "select 'a','6b',count(*) from MY_TABLE_NAME partition for (to_date('07-MAY-2019','DD-MON-YYYY')) where COL='SOMETHING'")
  .option("driver", "oracle.jdbc.driver.OracleDriver")
  .option("user", "MY_USERNAME")
  .option("password", "MYPASSWORD")
  .load()
df.count()
This does not return any result. I expect count() to give the total number of records fetched from the table.
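For reference, a quick way to see how Spark is executing this read (assuming the same df as above; these calls are only for diagnosis):

df.rdd.getNumPartitions  // a plain JDBC read like this uses a single partition unless partitioning options are set
df.explain(true)         // prints the query plans, including the JDBC relation and any pushed-down filters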
Upvotes: 0
Views: 139
Reputation: 11
There is no such option as query, so your code fetches all the data from the external table, and it does so using only a single partition. The correct option is dbtable, and the query, as per the linked answer, should be provided as a subquery:
.option("dbtable", "(select 'a','6b',count(*) from MY_TABLE_NAME partition
for (to_date('07-MAY-2019','DD-MON-YYYY')) where COL='SOMETHING') as t")
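For completeness, a minimal sketch of the full read with the query passed as a subquery through dbtable, using the same hypothetical connection details as in the question:

val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//MY_IP:MY_PORT/MY_SERVICE_NAME")
  // the whole aggregation runs on the Oracle side; Spark just reads the result
  .option("dbtable", "(select 'a','6b',count(*) from MY_TABLE_NAME partition for (to_date('07-MAY-2019','DD-MON-YYYY')) where COL='SOMETHING') t")
  .option("driver", "oracle.jdbc.driver.OracleDriver")
  .option("user", "MY_USERNAME")
  .option("password", "MYPASSWORD")
  .load()

df.show()   // one row containing the literals and the count computed by Oracle
df.count()  // returns 1, since the aggregate result is a single row

Note that with this form the record count itself is in the third column of the returned row, not in df.count(), because the counting happens in Oracle rather than in Spark.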
Upvotes: 1