Reputation: 946
I am new to pyspark, to spark in general, and to AWS.
I tried saving a table using:
# Save distinct domains dataframe into SQL table
distinct_domains.write.saveAsTable('distinct_domains', mode='ignore', compression='lz4', header=True)
I thought I was saving a SQL table, but apparently this is a Hive table (which I just found out exists).
I read in one post that it goes to the location s3://my_bucket_name/warehouse, and in yet another post that it goes to hdfs://user/hive/warehouse.
I can't find this table anywhere. Please help.
Upvotes: 0
Views: 42
Reputation: 894
You could try one of the approaches below.
1) Pass an explicit path when saving (assuming df is your DataFrame):
df.write.partitionBy('col1') \
    .saveAsTable('test_table', format='parquet', mode='overwrite',
                 path='s3a://bucket/foo')
2) You can create a temporary view using
myDf.createOrReplaceTempView("tempTable")
Then, using the SQLContext, you can create a Hive table from that view:
sqlContext.sql("create table table_name as select * from tempTable")
Upvotes: 1