Zhijun Liu

Reputation: 61

Spark SQL result different from Hive SQL result

I ran the SQL statement select cityroaddis from trip_db.tripTable where tripid='a0001' and day>'2020-09-09' in both the Hive shell and the Spark shell, but got completely different results.

The two results:

Hive:  cityroaddis = 0.0
Spark: cityroaddis = null


Has anybody had such a problem before?

Upvotes: 2

Views: 1990

Answers (1)

Zhijun Liu

Reputation: 61

The problem was solved after I added these two configurations.

spark.sql.hive.convertMetastoreOrc=false
spark.sql.hive.convertMetastoreParquet=false
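For example, if you start Spark from the command line, the two settings can be passed as --conf flags (a sketch assuming a standard spark-shell invocation; the same keys work with spark-submit or in spark-defaults.conf):

```shell
# Disable Spark's built-in Parquet/ORC readers for Hive metastore tables,
# so Spark falls back to the Hive SerDe and its results match Hive's.
spark-shell \
  --conf spark.sql.hive.convertMetastoreOrc=false \
  --conf spark.sql.hive.convertMetastoreParquet=false
```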

spark.sql.hive.convertMetastoreParquet: when reading from and writing to Hive metastore Parquet tables, Spark SQL tries to use its own Parquet support instead of the Hive SerDe for better performance. This behavior is turned on by default.

spark.sql.hive.convertMetastoreOrc: the analogous setting for ORC, controlling whether Spark uses its built-in ORC reader/writer instead of the Hive SerDe when accessing Hive metastore ORC tables. Setting both to false forces Spark to read the tables through the Hive SerDe, which is why its results then agree with Hive's.

Upvotes: 4
