ryan1992

Reputation: 19

Taking Multiple Parquet Files and converting them to CSV Outputs

New to Python here

I have a folder with multiple parquet files, as shown below (there are close to twenty). I would like to convert each of them to a separate csv file saved on my desktop. Any guidance on standard code I could use to do this? Assume that the structure within them is the same.

Thanks so much.

File1.parquet
File2.parquet
File3.parquet

to

File1.csv
File2.csv
File3.csv

Upvotes: 1

Views: 2461

Answers (1)

Mohcine Chekroune

Reputation: 31

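import glob
import os

from fastparquet import ParquetFile

path = '.'  # folder containing the .parquet files
prqtfiles = glob.glob(os.path.join(path, '*.parquet'))
for p in prqtfiles:
    # Read each Parquet file into a pandas DataFrame
    pr = ParquetFile(p)
    df = pr.to_pandas()
    # Write File1.parquet as File1.csv next to the source file
    df.to_csv(os.path.splitext(p)[0] + '.csv', index=False)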
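
A variant of the same loop, sketched with pandas.read_parquet and pathlib, that writes the CSVs into a separate output folder such as the desktop mentioned in the question. The function name convert_folder and the Desktop location (Path.home() / "Desktop") are assumptions for illustration, and pd.read_parquet needs either pyarrow or fastparquet installed.

```python
from pathlib import Path

import pandas as pd


def convert_folder(src_dir, out_dir):
    """Convert every .parquet file in src_dir to a .csv file in out_dir."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for parquet_path in sorted(src_dir.glob("*.parquet")):
        # read_parquet uses whichever engine is installed (pyarrow or fastparquet)
        df = pd.read_parquet(parquet_path)
        # File1.parquet becomes File1.csv in the output folder
        csv_path = out_dir / (parquet_path.stem + ".csv")
        df.to_csv(csv_path, index=False)
        written.append(csv_path)
    return written


# Example call (assumed paths, adjust to your machine):
# convert_folder(".", Path.home() / "Desktop")
```

The function returns the list of CSV paths it wrote, so it is easy to check that all of the roughly twenty files were converted.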

Upvotes: 3
