Abhishek Gupta

Reputation: 1193

Write a spark DataFrame to a table

I am trying to understand the spark DataFrame API method called saveAsTable.

I have the following question: will the table created by saveAsTable be a Hive table, i.e. can it be queried from Hive?

(I am new to big data processing, so pardon me if question is not phrased properly)
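For context, a minimal sketch of what I am trying (the session setup, input path, and table name below are placeholders, not my actual code):

```scala
import org.apache.spark.sql.SparkSession

// Hive support is needed so the table is registered in the Hive metastore
val spark = SparkSession.builder()
  .appName("saveAsTable-example")
  .enableHiveSupport()
  .getOrCreate()

val df = spark.read.parquet("/data/people")   // hypothetical input path

// Persist the DataFrame as a managed table in the metastore
df.write.mode("overwrite").saveAsTable("people")
```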

Upvotes: 2

Views: 372

Answers (2)

MihaiGhita

Reputation: 167

Yes, you can. The table can be partitioned by a column, but you cannot use bucketing (Spark's bucketing scheme is incompatible with Hive's).
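A quick sketch of the difference (the column names here are assumed, not from your data):

```scala
// Partitioning: Hive can still read this table, because partitioned
// directory layouts are understood by both engines
df.write
  .partitionBy("country")
  .saveAsTable("people_by_country")

// Bucketing: Spark uses its own bucketing scheme, which differs from
// Hive's, so Hive will not read this as a properly bucketed table
df.write
  .bucketBy(8, "id")
  .sortBy("id")
  .saveAsTable("people_bucketed")
```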

Upvotes: 0

Vijay_Shinde

Reputation: 1352

Yes. The newly created table will be a Hive table and can be queried from the Hive CLI, but only if the DataFrame is created from a single, non-partitioned input HDFS path.

Below is the documentation comment from the DataFrameWriter.scala class. Documentation link

When the DataFrame is created from a non-partitioned HadoopFsRelation with a single input path, and the data source provider can be mapped to an existing Hive builtin SerDe (i.e. ORC and Parquet), the table is persisted in a Hive compatible format, which means other systems like Hive will be able to read this table. Otherwise, the table is persisted in a Spark SQL specific format.
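To illustrate the two cases described above (the paths and table names are hypothetical):

```scala
// Parquet maps to a Hive built-in SerDe, so a table written from a
// single non-partitioned Parquet path stays Hive-compatible
val events = spark.read.parquet("/data/events")   // single input path
events.write.format("parquet").saveAsTable("events")

// JSON has no Hive built-in SerDe, so this table is persisted in the
// Spark SQL-specific format and other systems like Hive cannot read it
val js = spark.read.json("/data/events_json")     // hypothetical path
js.write.format("json").saveAsTable("events_json")
```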

Upvotes: 0
