Reputation: 177
Thanks for your time.
I need to read files from several paths, which are organized by month and day (/mm/dd/*.json).
I've been trying to loop over the day folders, but I always end up with only the last folder that was read:
for i_dia in range(1, 9):
    df_json = spark.read.json('/mnt/datalake/' + Year + '/' + Month + '/' + str(0) + str(i_dia) + '/' + '*', mode="PERMISSIVE", multiLine="true")
return df_json
display(df_json)
What is the correct way to read them? I want to read all the files into a single big DataFrame.
Thank you very much in advance.
Regards
Upvotes: 0
Views: 138
Reputation: 104
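import glob
import pandas as pd
df_json = pd.DataFrame()
for i_dia in range(1, 9):
    # Year and Month are the strings from the question; pandas needs one concrete file path per call
    for path in glob.glob('/mnt/datalake/' + Year + '/' + Month + '/' + str(0) + str(i_dia) + '/*.json'):
        df_json = pd.concat([df_json, pd.read_json(path)])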
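Since the question uses Spark rather than pandas, another option might be to pass all the day folders to a single `spark.read.json` call, which accepts a list of paths. The loop in the question keeps only the last folder because `df_json` is reassigned on every iteration. This is a minimal sketch, assuming `spark` is the active SparkSession; the helper names `day_paths` and `read_days` and the values "2021" and "05" are made up for illustration.

```python
def day_paths(base, year, month, first_day, last_day):
    """Build one glob pattern per day folder, e.g. /mnt/datalake/2021/05/01/*."""
    return [
        f"{base}/{year}/{month}/{day:02d}/*"
        for day in range(first_day, last_day + 1)
    ]


def read_days(spark, base, year, month, first_day, last_day):
    """Read every day folder into a single Spark DataFrame.

    `spark` is assumed to be an existing SparkSession (hypothetical caller setup).
    """
    paths = day_paths(base, year, month, first_day, last_day)
    # DataFrameReader.json accepts a list of paths, so no loop or union is needed.
    return spark.read.json(paths, mode="PERMISSIVE", multiLine=True)


# Example with placeholder values for the year and month:
print(day_paths("/mnt/datalake", "2021", "05", 1, 3))
# → ['/mnt/datalake/2021/05/01/*', '/mnt/datalake/2021/05/02/*', '/mnt/datalake/2021/05/03/*']
```

Called as `df_json = read_days(spark, '/mnt/datalake', Year, Month, 1, 8)`, this covers the same days as `range(1, 9)`. The `{day:02d}` format also keeps working for day 10 and later, where `str(0) + str(i_dia)` would produce `010`.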
Upvotes: 2