Reputation: 185

How to save a pandas DataFrame in Spark to Amazon S3?

I would like to save a pandas DataFrame to an S3 bucket. I tried the line below, which was given in an existing answer, but it just gives me an error: AttributeError: 'DataFrame' object has no attribute 'write'.

df.write.format("com.databricks.spark.csv").save("s3n://id:pw@bucket")

Any idea? Thank you in advance.

Upvotes: 0

Answers (2)

Rajat Mishra

Reputation: 3780

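One way is to convert the pandas DataFrame into a Spark DataFrame; you can then use the spark-csv package to save the file. In the line below, df refers to the Spark DataFrame produced by that conversion, not the original pandas one.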

df.write.format("com.databricks.spark.csv").save("s3n://id:pw@bucket")

See this answer, which provides a similar solution.
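
The conversion step described above could be sketched roughly as follows. This is only an illustration under assumptions: the helper name save_pandas_to_s3 is made up, spark is assumed to be an existing SparkSession (Spark 2.0 or later, where CSV writing is built in), and the S3 path and credentials are assumed to be configured elsewhere.

```python
def save_pandas_to_s3(pandas_df, spark, path):
    """Convert a pandas DataFrame to a Spark DataFrame and write it as CSV.

    Assumptions (not from the original answer):
    - `spark` is an existing SparkSession (Spark 2.0+).
    - `path` is a hypothetical location such as "s3a://my-bucket/output/".
    """
    # A pandas DataFrame has no `.write` attribute; only Spark DataFrames do,
    # which is why the original call raised AttributeError.
    spark_df = spark.createDataFrame(pandas_df)

    # On Spark 2.0+ the CSV writer is built in, so the external
    # com.databricks.spark.csv format name is not needed here.
    spark_df.write.csv(path, header=True)
    return spark_df
```

It would be called as save_pandas_to_s3(df, spark, "s3a://my-bucket/output/"), where the bucket name is a placeholder. On Spark 1.x, the equivalent would go through sqlContext.createDataFrame(df) and the com.databricks.spark.csv format shown above.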

Upvotes: 1

maxymoo

Reputation: 36555

Are you using Spark version 1.3 or earlier? If so, you just call save directly on the DataFrame, i.e.

df.save(path="s3n://id:pw@bucket")

Upvotes: 0
