Reputation: 1178
I am trying to store a DataFrame in an external Hive table. When I run the following:
recordDF.write.option("path", "hdfs://quickstart.cloudera:8020/user/cloudera/hadoop/hive/warehouse/VerizonProduct").saveAsTable("productstoreHTable")
At the HDFS location where the table was supposed to be, I get this instead:
-rw-r--r-- 3 cloudera cloudera 0 2016-12-25 18:58 hadoop/hive/warehouse/VerizonProduct/_SUCCESS
-rw-r--r-- 3 cloudera cloudera 482 2016-12-25 18:58 hadoop/hive/warehouse/VerizonProduct/part-r-00000-0acdcc6d-893b-4e9d-b1d6-50bf02bea96a.snappy.parquet
-rw-r--r-- 3 cloudera cloudera 482 2016-12-25 18:58 hadoop/hive/warehouse/VerizonProduct/part-r-00001-0acdcc6d-893b-4e9d-b1d6-50bf02bea96a.snappy.parquet
-rw-r--r-- 3 cloudera cloudera 482 2016-12-25 18:58 hadoop/hive/warehouse/VerizonProduct/part-r-00002-0acdcc6d-893b-4e9d-b1d6-50bf02bea96a.snappy.parquet
-rw-r--r-- 3 cloudera cloudera 482 2016-12-25 18:58 hadoop/hive/warehouse/VerizonProduct/part-r-00003-0acdcc6d-893b-4e9d-b1d6-50bf02bea96a.snappy.parquet
How do I store it in uncompressed text format?
Thanks
Upvotes: 0
Views: 2838
Reputation: 619
Try .option("fileFormat", "textfile"). See "Specifying storage format for Hive tables" in the Spark SQL documentation.
Upvotes: 0
Reputation: 1528
The solution with format csv threw the warning "Couldn't find corresponding Hive SerDe for data source provider csv.", and the table was not created in the desired way. One alternative is to create an external table first:
sqlContext.sql("CREATE EXTERNAL TABLE test(col1 int,col2 string) STORED AS TEXTFILE LOCATION '/path/in/hdfs'")
and then write the DataFrame to the same location:
dataFrame.write.format("com.databricks.spark.csv").option("header", "true").save("/path/in/hdfs")
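One caveat with this approach: a Hive TEXTFILE table uses Ctrl-A (\001) as its default field delimiter, so comma-separated files will probably not be split into columns unless the DDL names the delimiter. A header row written into the files would also show up as a data row. Below is a minimal sketch of building such a DDL string, assuming comma-delimited output. The helper name external_text_table_ddl and its parameters are hypothetical and not part of Spark or Hive.

```python
def external_text_table_ddl(table, columns, location, delimiter=","):
    """Build a CREATE EXTERNAL TABLE statement for delimited text files.

    columns is a list of (name, hive_type) pairs. This helper is
    illustrative only; it does no quoting or validation of its inputs.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


ddl = external_text_table_ddl(
    "test", [("col1", "int"), ("col2", "string")], "/path/in/hdfs"
)
print(ddl)

# With a live Spark session the statement would then be used like this
# (left as comments so the sketch runs without Spark):
#   sqlContext.sql(ddl)
#   dataFrame.write.format("com.databricks.spark.csv") \
#       .option("header", "false").save("/path/in/hdfs")
```

Writing with header set to "false" keeps the column names out of the data files, so Hive does not read them back as a row.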
Upvotes: 1
Reputation: 1712
You can add a format option:
recordDF.write.option("path", "...").format("text").saveAsTable("...")
or
recordDF.write.option("path", "...").format("csv").saveAsTable("...")
Upvotes: 1