Create a DataFrame by reading files from many paths

Question

Thanks for your time.

I need to read files from several paths, which are organized by month and day (/mm/dd/*.json).

I've been trying to loop over the day directories, but I always end up with only the data from the last directory read:

for i_dia in range(1, 9):
  df_json = spark.read.json('/mnt/datalake/'+Year+'/'+ Month +'/'+ str(0) + str(i_dia) +'/'+ '*', mode="PERMISSIVE",multiLine = "true")
  return df_json
 
display(df_json)

What is the correct way to do this? I want to read all of the files into a single large DataFrame.

Thank you very much in advance.

Regards


Answers (1)
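
The loop in the question reassigns df_json on every pass, so only the last day is left when it finishes (and return is only valid inside a function). One possible approach, as a minimal sketch rather than the definitive fix: build the list of day paths first and hand the whole list to spark.read.json in a single call, since that method accepts a list of paths as well as a single string. The helper names build_day_paths and read_days are hypothetical, and "2020" and "05" are placeholder values standing in for the Year and Month strings in the question.

```python
def build_day_paths(base, year, month, days):
    """Return one glob pattern per day, e.g. /mnt/datalake/2020/05/01/*."""
    # {day:02d} zero-pads the day, replacing str(0) + str(i_dia)
    return [f"{base}/{year}/{month}/{day:02d}/*" for day in days]


def read_days(spark, paths):
    """Read every path in one call so the result is a single DataFrame."""
    return spark.read.json(paths, mode="PERMISSIVE", multiLine=True)


# Placeholder year and month; range(1, 9) covers days 01 to 08 as in the question.
paths = build_day_paths("/mnt/datalake", "2020", "05", range(1, 9))

# In a Databricks notebook, where `spark` is already defined:
# df_json = read_days(spark, paths)
# display(df_json)
```

If you would rather keep the loop, the alternative is to collect one DataFrame per day in a list and combine them afterwards with unionByName, but a single read over a list of paths is usually simpler. Note that the read is likely to fail if one of the day directories does not exist, so you may need to filter the list first.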
