Sahand Pourjavad

Reputation: 29

How to load all CSV files in a folder with PySpark

I have a folder that contains:

Sales_December.csv
Sales_January.csv
Sales_February.csv
etc.

How can I make PySpark read all of them into one DataFrame?

Upvotes: 0

Views: 211

Answers (1)

Gprj

Reputation: 32

  • Create an empty list.
  • Read the CSV files one by one and append each DataFrame to the list.
  • Use reduce(DataFrame.unionAll, <list>), with reduce imported from functools, to combine them into a single DataFrame.
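
The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper name load_csv_folder and its parameters are made up for this example, the folder is assumed to be on the local filesystem (glob does not see HDFS or S3 paths), and every file is assumed to have a header row and the same column layout.

```python
import glob
import os
from functools import reduce


def load_csv_folder(spark, folder, pattern="*.csv"):
    """Read every file matching `pattern` in `folder` and union them.

    `spark` is an existing SparkSession. The function name and the
    `pattern` parameter are hypothetical, chosen for this sketch.
    """
    # Sort so the files are always read in the same order.
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {folder!r}")

    frames = []  # step 1: start with an empty list
    for path in paths:
        # Step 2: read one file at a time and collect the DataFrames.
        # header=True assumes each CSV starts with a header row.
        frames.append(spark.read.csv(path, header=True))

    # Step 3: fold the list into one DataFrame. unionAll matches columns
    # by position, so all files need the same column order.
    return reduce(lambda left, right: left.unionAll(right), frames)
```

With a SparkSession named spark, this would be called as load_csv_folder(spark, "path/to/sales"), where the path is a placeholder. If all files really do share one schema, spark.read.csv also accepts a directory path or a list of paths, which may make the manual union unnecessary.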

Upvotes: 1
