Reputation: 1996
I use the code below to read all the CSV files in a directory, but I am struggling to pick up the files in its subdirectories as well. I won't always know what the subdirectories are called, so I cannot list them explicitly.
Can anyone advise, please?
df = my_spark.read.format("csv").option("header", "true").load(yesterday+"/*.csv")
Upvotes: 1
Views: 2220
Reputation: 2165
Add wildcards after the directory whose subdirectories you want to read. Each * stands for one path component, so the pattern below reaches one level below "path":
"path/*/*"
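As a rough sketch of how that could look with the reader from the question: the helper names below are made up for illustration, `spark` stands for an existing SparkSession (`my_spark` in the question), and `base_path` plays the role of `yesterday`. The second helper is an alternative to wildcards, assuming Spark 3.0 or later, where the `recursiveFileLookup` and `pathGlobFilter` reader options are available.

```python
# Sketch only: the function names are hypothetical, not part of PySpark.
# `spark` is an existing SparkSession; `base_path` is the directory to scan.

def read_csv_one_level_down(spark, base_path):
    """Read CSV files sitting exactly one subdirectory below base_path."""
    # The first * matches any subdirectory name, *.csv matches the files in it.
    pattern = base_path.rstrip("/") + "/*/*.csv"
    return spark.read.format("csv").option("header", "true").load(pattern)


def read_csv_any_depth(spark, base_path):
    """Read CSV files at any depth below base_path (assumes Spark 3.0+)."""
    return (
        spark.read.format("csv")
        .option("header", "true")
        .option("recursiveFileLookup", "true")  # descend into all subdirectories
        .option("pathGlobFilter", "*.csv")      # keep only files named *.csv
        .load(base_path)
    )
```

Something like `df = read_csv_any_depth(my_spark, yesterday)` would then replace the original call. Note that `recursiveFileLookup` turns off partition discovery, so if the subdirectories are named like `key=value` and you need those values as columns, the wildcard version is probably the better fit.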
Upvotes: 2
Reputation: 1996
Thanks to Joby, whose comment pointed me in the right direction:
Can you try giving the wildcards this way and see: "path//" – Joby 23 hours ago
Upvotes: 0