Reputation: 1
I would like to create a DataFrame from many small files located in the same directory. I plan to use read.csv from pyspark.sql. I've learned that in the RDD world, the textFile function is designed for reading a small number of large files, whereas the wholeTextFiles function is designed for reading a large number of small files (e.g. see this thread). Does read.csv use textFile or wholeTextFiles under the hood?
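To make the distinction concrete, here is a minimal sketch of the two RDD APIs I'm referring to (the directory path is a placeholder):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# textFile: one record per line; partitioning follows the underlying file splits
lines = sc.textFile('/path/to/dir')        # RDD[str]

# wholeTextFiles: one record per file, as (filename, content) pairs
files = sc.wholeTextFiles('/path/to/dir')  # RDD[(str, str)]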
Upvotes: 0
Views: 720
Reputation: 41987
Yes, that's possible; just give the path to the parent directory where the files are located:

df = spark.read.csv('/path/to/parent/directory')

And you should get all the files read into one DataFrame. If the files don't have the same number of fields per line, the number of columns is taken from the file that has the maximum number of fields in a line.
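A slightly fuller sketch, assuming the files share a header row (the path is a placeholder; header and inferSchema are standard spark.read.csv options):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read every CSV file under the parent directory into a single DataFrame
df = spark.read.csv(
    '/path/to/parent/directory',  # placeholder path
    header=True,                  # treat the first line of each file as a header
    inferSchema=True,             # sample the data to guess column types
)
df.show()

Spark paths also accept glob patterns, so something like spark.read.csv('/path/to/parent/directory/*.csv') should work if the directory contains non-CSV files as well.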
Upvotes: 1