Reputation: 11
Hello, I created a Spark HDInsight cluster on Azure and I'm trying to read Hive tables with PySpark, but the problem is that it only shows me the default database.
Does anyone have an idea?
Upvotes: 1
Views: 1458
Reputation: 91
If you are using HDInsight 4.0, Spark and Hive no longer share metadata.
By default you will not see Hive tables from PySpark; this is a problem I described in this post: How to save/update a table in Hive so it is readable from Spark.
But anyway, here are things you can try:
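For example, a minimal PySpark sketch that makes the session use the Hive metastore catalog instead of Spark's own (metastore.catalog.default is the HDInsight 4.0 setting; passing it through the spark.hadoop. prefix, and the app name, are assumptions here):

from pyspark.sql import SparkSession

# Sketch: point Spark at the Hive metastore catalog (HDInsight 4.0
# keeps the Spark and Hive catalogs separate by default).
# Assumption: metastore.catalog.default can be passed through Spark's
# Hadoop config; it can also be set cluster-wide in Ambari.
spark = (SparkSession.builder
    .appName("read-hive-tables")  # hypothetical app name
    .config("spark.hadoop.metastore.catalog.default", "hive")
    .enableHiveSupport()
    .getOrCreate())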
These changes define the Hive metastore catalog as the default. You should be able to see Hive databases and tables now, but depending on the table structure, you may not see the table data properly.
Upvotes: 1
Reputation: 534
You are missing the Hive server details in your SparkSession. If you haven't added any, it will create and use the default database to run Spark SQL.
If you've added configuration for spark.sql.warehouse.dir and spark.hadoop.hive.metastore.uris in the Spark defaults conf file, then add enableHiveSupport() while creating the SparkSession.
Otherwise, add the configuration details while creating the SparkSession:
.config("spark.sql.warehouse.dir","/user/hive/warehouse")
.config("hive.metastore.uris","thrift://localhost:9083")
.enableHiveSupport()
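A quick sanity check once the session is up (a sketch; db_name and table_name are placeholders):

spark.catalog.listDatabases()  # should now list more than just 'default'
spark.table("db_name.table_name").show()  # placeholders, not real names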
Upvotes: 0
Reputation: 4234
If you have created tables in other databases, try show tables from database_name. Replace database_name with the actual name.
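As a sketch in PySpark (sales_db is a hypothetical database name):

spark.sql("show databases").show()  # list every database Spark can see
spark.sql("show tables from sales_db").show()  # sales_db is a placeholder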
Upvotes: 0