Gonza

Reputation: 177

Create a DataFrame by reading files from many paths

Thanks for your time.

I need to read files from several paths, which are organized by month and day (/mm/dd/*.json).

I've been trying to loop over the day directories, but after the loop the DataFrame only contains the last directory read:

for i_dia in range(1, 9):
  df_json = spark.read.json('/mnt/datalake/'+Year+'/'+ Month +'/'+ str(0) + str(i_dia) +'/'+ '*', mode="PERMISSIVE",multiLine = "true")
  return df_json
 
display(df_json)

How should I read these files correctly? I want to load all of them into a single large DataFrame.

Thank you very much in advance.

Regards

Upvotes: 0

Views: 138

Answers (1)

Y U

Reputation: 104

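import glob
import pandas as pd

df_json = pd.DataFrame()
for i_dia in range(1, 9):
    # Year and Month come from the question; pandas does not expand wildcards, so glob lists each day's files.
    for path in glob.glob('/mnt/datalake/' + Year + '/' + Month + '/0' + str(i_dia) + '/*'):
        df_json = pd.concat([df_json, pd.read_json(path)])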
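
If the result should stay a Spark DataFrame, as in the question, note that the original loop reassigns df_json on every iteration, so only the last day's read survives. One possible fix is to build the list of day paths first and hand the whole list to a single spark.read.json call, which accepts a list of paths. This is only a sketch: day_paths and read_days are hypothetical helper names, the /mnt/datalake root and the zero-padded day folders are taken from the question, and spark is assumed to be an existing SparkSession.

```python
def day_paths(year, month, first_day=1, last_day=8, root="/mnt/datalake"):
    """Build one wildcard path per day, e.g. /mnt/datalake/2021/05/01/*.

    Hypothetical helper: the layout is assumed from the question, and
    days are zero-padded to two digits.
    """
    return [
        f"{root}/{year}/{month}/{day:02d}/*"
        for day in range(first_day, last_day + 1)
    ]


def read_days(spark, year, month):
    """Read all day directories into a single Spark DataFrame.

    `spark` is assumed to be an existing SparkSession. Passing a list of
    paths to DataFrameReader.json loads them all in one read, so nothing
    is overwritten between days.
    """
    return spark.read.json(
        day_paths(year, month),
        mode="PERMISSIVE",
        multiLine=True,
    )
```

With the question's variables this would be called as df_json = read_days(spark, Year, Month), followed by display(df_json). The default range covers days 01 to 08, matching range(1, 9) in the question.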

Upvotes: 2
