Bharat Bhushan

Reputation: 91

Spark read via JDBC returns the column names as every row

I'm querying a Spark database table using Spark 3.x SQL in Scala 2.12, following examples found on the Internet.

The database is Spark SQL's own, running on CentOS 7. The table (example) I'm querying has the following columns:

create table example ( tutorial_title VARCHAR(22) NOT NULL );

val example = spark.read.format("jdbc")
  .option("url", "jdbc:hive2://localhost:10000/test2")
  .option("dbtable", "example")
  .option("user", "username")
  .option("password", "123456")
  .option("fetchsize", "10")
  .option("driver", "org.apache.hive.jdbc.HiveDriver")
  .load()

This gives me the following output:

+--------------+
|tutorial_title|
+--------------+
|tutorial_title|
|tutorial_title|
|tutorial_title|
+--------------+

i.e. it repeats the column name for each row without returning the actual data. My table has 3 rows; if I change the number of rows in the database, the output changes accordingly.

If I use ./bin/spark-sql and select from the table, it shows me the actual records. But ./bin/spark-shell gives the column names as the results/records.

The spark-sql and beeline clients return the correct records, but Spark's read.format("jdbc") in spark-shell returns the wrong records shown above.

Upvotes: 6

Views: 2631

Answers (1)

Shakir Shakeel

Reputation: 91

Hey, can you try qualifying the table name with the database, like this?

val example = spark.read.format("jdbc")
  .option("url", "jdbc:hive2://localhost:10000/test2")
  .option("dbtable", "test2.example")
  .option("user", "username")
  .option("password", "123456")
  .option("fetchsize", "10")
  .option("driver", "org.apache.hive.jdbc.HiveDriver")
  .load()

Even when you are fetching with a query, instead of

select * from table_name

you should use

select * from db_name.table_name
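For reference, Spark 2.4+ also accepts a query option in place of dbtable, and the same database qualification applies there. A sketch reusing the connection settings from the question (URL, credentials, and driver are the question's own values, not verified here); note that query and dbtable are mutually exclusive:

```scala
// Sketch: same connection as above, but passing a database-qualified
// query via the "query" option (available since Spark 2.4).
// Do not set "dbtable" at the same time -- the two options conflict.
val example = spark.read.format("jdbc")
  .option("url", "jdbc:hive2://localhost:10000/test2")
  .option("query", "select * from test2.example")
  .option("user", "username")
  .option("password", "123456")
  .option("driver", "org.apache.hive.jdbc.HiveDriver")
  .load()
```

Spark wraps the query in a subselect when it builds the JDBC statement, so keeping the table reference fully qualified avoids relying on the connection's current database.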

Upvotes: 1
