Pandas to_csv index=False not working when writing incremental chunks

Question

I'm converting a fixed-width file to CSV. Because the file is too large to read at once, I'm reading it in chunks of 100000 rows and appending each chunk to the CSV. This works, but the output includes an index column even though I set index=False.

How can I write the CSV file without the index?

import pandas as pd

infile = filename
outfile = outfilename
cols = [(0,10), (12,19), (22,29), (34,41), (44,52), (54,64), (72,80), (82,106), (116,144), (145,152), (161,169), (171,181)]

for chunk in pd.read_fwf(infile, colspecs=cols, index=False, chunksize=100000):
    chunk.to_csv(outfile, mode='a')

Ami Tavory · Accepted Answer

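The to_csv method has a header parameter that controls whether the column header is written. When appending, you probably want the header only on the first write, not on the later ones. Note also that index is a to_csv parameter, so index=False has to be passed to to_csv rather than to read_fwf.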

So, you could do something like this:

for i, chunk in enumerate(pd.read_fwf(...)):
    first = i == 0  # True only for the first chunk
    chunk.to_csv(outfile, header=first, index=False, mode='a')
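For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The helper name fwf_to_csv, the sample file, its column layout, and the column names are all made up for illustration and are not from the question. It also assumes you want the first chunk to overwrite any existing output file (mode "w") so that re-running does not append to old data; drop that if you really do want to append.

```python
import os
import tempfile

import pandas as pd


def fwf_to_csv(infile, outfile, colspecs, names, chunksize=100000):
    """Convert a fixed-width file to CSV chunk by chunk (hypothetical helper)."""
    reader = pd.read_fwf(
        infile, colspecs=colspecs, names=names, header=None, chunksize=chunksize
    )
    for i, chunk in enumerate(reader):
        chunk.to_csv(
            outfile,
            mode="w" if i == 0 else "a",  # assumption: start a fresh file on the first chunk
            header=(i == 0),              # write the column header only once
            index=False,                  # index is a to_csv option, not a read_fwf one
        )


# Tiny made-up input: a 3-character name, two spaces, a 1-digit value.
tmpdir = tempfile.mkdtemp()
infile = os.path.join(tmpdir, "sample.txt")
outfile = os.path.join(tmpdir, "sample.csv")
with open(infile, "w") as f:
    f.write("aaa  1\nbbb  2\nccc  3\n")

# chunksize=2 forces two chunks so the header/index handling is exercised.
fwf_to_csv(
    infile, outfile, colspecs=[(0, 3), (5, 6)], names=["name", "value"], chunksize=2
)

with open(outfile) as f:
    result = f.read()
print(result)
```

The resulting file should contain a single header line followed by the data rows, with no leading index column.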
