Reputation: 91
I'm querying a Spark SQL database table using Spark 3.x in Scala 2.12, following examples I found on the Internet.
The environment: Spark SQL's own database, running on CentOS 7. The table I'm querying (example) has the following column:
create table example (tutorial_title VARCHAR(22) NOT NULL);
var example = spark.read.format("jdbc")
.option("url", "jdbc:hive2://localhost:10000/test2")
.option("dbtable", "example")
.option("user", "username")
.option("password", "123456")
.option("fetchsize", "10")
.option("driver", "org.apache.hive.jdbc.HiveDriver")
.load()
This gives me the following output:
+--------------+
|tutorial_title|
+--------------+
|tutorial_title|
|tutorial_title|
|tutorial_title|
+--------------+
i.e. it repeats the column name for each row instead of returning the data. My table has 3 rows; I've tried changing the number of rows in the DB, and the output changes accordingly.
If I use ./bin/spark-sql and select from the table, it shows me the actual records. But ./bin/spark-shell gives the column names as results/records.
In short: spark-sql and the beeline client return the correct records, but Spark's read.format("jdbc") in spark-shell returns the wrong records shown above.
Upvotes: 6
Views: 2631
Reputation: 91
Hey, can you try giving the fully qualified table name, like this:
var example = spark.read.format("jdbc")
.option("url", "jdbc:hive2://localhost:10000/test2")
.option("dbtable", "test2.example")
.option("user", "username")
.option("password", "123456")
.option("fetchsize", "10")
.option("driver", "org.apache.hive.jdbc.HiveDriver")
.load()
Even when you are fetching with a query, instead of using
select * from table_name
you should use
select * from db_name.table_name
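If you prefer to pass the query directly, Spark 2.4+ also accepts a query option in place of dbtable, and the same fully-qualified-name advice applies inside the query text. A minimal sketch, assuming the same hypothetical connection details (URL, user, password) as above:

```scala
// Sketch only: reuses the hypothetical HiveServer2 connection from the question.
// Spark 2.4+ supports "query" as an alternative to "dbtable"; the query text
// itself should use the fully qualified db_name.table_name form.
val example = spark.read.format("jdbc")
  .option("url", "jdbc:hive2://localhost:10000/test2")
  .option("query", "select * from test2.example")
  .option("user", "username")
  .option("password", "123456")
  .option("driver", "org.apache.hive.jdbc.HiveDriver")
  .load()

example.show()
```

Note that "dbtable" and "query" are mutually exclusive; specify only one of them per read.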
Upvotes: 1