Balkrishan Aggarwal

Reputation: 613

Various file/data formats supported in Spark

I came across the below code somewhere.

sqlContext.read.format("com.databricks.spark.csv")

It looks like com.databricks.spark.csv is a file format supported by Databricks. Which file/data formats are supported natively by Apache Spark (prior to 2.0.0)?

Upvotes: 2

Views: 3690

Answers (1)

Premkumar S

Reputation: 41

Spark can read any format supported by the Hadoop ecosystem. The following formats work well with Spark:

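1. text file

2. sequence file

3. json

4. avro (lightweight, with fast serialization/deserialization; in Spark 1.x it is read through the external spark-avro package rather than a built-in source)

5. parquet (column-oriented, with a better compression ratio)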
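
As a rough sketch of how each format in the list might be read with the Spark 1.x API (1.4 or later, where sqlContext.read is available): the file paths, the app name, and the key/value types of the sequence file are placeholders I made up for illustration, and the Avro line assumes the external spark-avro package is on the classpath.

```scala
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object FormatsDemo {
  def main(args: Array[String]): Unit = {
    // Local setup for illustration only; app name and master are assumptions.
    val conf = new SparkConf().setAppName("formats-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // 1. Text file: each line becomes one String element of the RDD.
    val lines = sc.textFile("data/input.txt") // hypothetical path

    // 2. Sequence file: key/value classes must match what was written.
    //    Text/IntWritable are assumed here, not taken from the question.
    val pairs = sc.sequenceFile("data/input.seq", classOf[Text], classOf[IntWritable])

    // 3. JSON: built-in DataFrame source, one JSON object per line.
    val jsonDf = sqlContext.read.json("data/input.json")

    // 4. Avro: not built in for Spark 1.x; requires the spark-avro package.
    val avroDf = sqlContext.read
      .format("com.databricks.spark.avro")
      .load("data/input.avro")

    // 5. Parquet: built-in DataFrame source (also the default format).
    val parquetDf = sqlContext.read.parquet("data/input.parquet")

    sc.stop()
  }
}
```

The first two go through SparkContext and return RDDs, while the last three go through the DataFrame reader. The com.databricks.spark.csv string from the question plugs into the same format(...).load(...) pattern as the Avro line, which is why it needs an extra package before 2.0.0.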

Upvotes: 2
