Neeraj Hanumante

Reputation: 1684

How to quantify the reading progress of large CSV files read through pd.read_csv with chunks?

Analogy/Example

Let's say I have a list:

test_list = [2, 5, 3, 6]
number_of_elements = len(test_list)

Then enumerate can be used with number_of_elements to track the progress of a loop as follows:

for j, element in enumerate(test_list, start=1):
    # do something with element
    print('completed {} out of {}'.format(j, number_of_elements))

Question

Large CSV files can be read in chunks as shown below (reference answer):

chunksize = 10 ** 6
for chunk in pd.read_csv(filename, chunksize=chunksize):
    process(chunk)

How to track the progress of this loop?

Attempt

file_chunks = pd.read_csv(file_name, chunksize=100000)
number_of_chunks = len(file_chunks)
for j, chunk in enumerate(pd.read_csv(file_name, chunksize=100000)):
    print(j, number_of_chunks)

This raises the following error:

TypeError: object of type 'TextFileReader' has no len()

Upvotes: 0

Views: 194

Answers (1)

Myccha

Reputation: 1018

You almost have it; the only problem is that there is no easy way for len to know how many chunks the file will yield before it has actually been read.

If you did:

file_chunks = pd.read_csv(file_name, chunksize=100000)

for i, chunk in enumerate(file_chunks):
    print(i)

That would work.
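If you do want a total to report progress against, one option is a cheap first pass over the file to count its rows and derive the number of chunks from that. This is a minimal sketch, assuming the CSV has a single header line, file_name points at your file, and process is your per-chunk function:

import pandas as pd

chunksize = 100000

# Count data rows in one cheap pass (subtract 1 for the header line).
with open(file_name) as f:
    number_of_rows = sum(1 for _ in f) - 1

# Ceiling division to get the number of chunks.
number_of_chunks = -(-number_of_rows // chunksize)

for j, chunk in enumerate(pd.read_csv(file_name, chunksize=chunksize), start=1):
    process(chunk)
    print('completed {} out of {}'.format(j, number_of_chunks))

The extra pass only scans the file line by line without parsing it, so it is usually much faster than the actual read_csv work.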

Also, this is a great use case for Dask (a Python library that mirrors much of the pandas API for files that are too big for memory).
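A minimal sketch of that idea, assuming process accepts and returns a pandas DataFrame and file_name points at the CSV; Dask's built-in ProgressBar takes care of the progress reporting:

import dask.dataframe as dd
from dask.diagnostics import ProgressBar

# Dask reads the CSV lazily in partitions (conceptually like pandas chunks).
ddf = dd.read_csv(file_name)

# Apply the per-chunk processing to each partition and show progress while computing.
with ProgressBar():
    result = ddf.map_partitions(process).compute()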

Upvotes: 1
