Pandas extract comment lines

Question

I have a data file that starts with a few comment lines, followed by the actual data.

#param1 : val1
#param2 : val2
#param3 : val3
12
2
1
33
12
0
12
...

I can read the data with pandas.read_csv(filename, comment='#', header=None). However, I also want to read the comment lines separately in order to extract the parameter values. So far I have only come across ways of skipping or removing the comment lines. How can I also extract them separately?

Elliot · Accepted Answer

You can't really do this within the call to read_csv itself. If you're just processing a header, you can open the file, extract the commented lines and process them, then read in the data in a separate call.

from itertools import takewhile
import pandas

with open(filename, 'r') as fobj:
    # takewhile returns an iterator over the leading lines
    # that start with the comment string; it stops at the
    # first line that does not
    headiter = takewhile(lambda s: s.startswith('#'), fobj)
    # you may want to process the headers differently,
    # but here we just convert it to a list
    header = list(headiter)
# read the data in a separate call, skipping the comment lines
df = pandas.read_csv(filename, comment='#', header=None)
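
To get from the header list to the actual parameter values, one option is to strip the leading '#' and split each line on the first ':'. This is only a minimal sketch, assuming every comment line has the "#name : value" form shown in the question; parse_header is a hypothetical helper name, not part of pandas, and the sample lines stand in for the list read from the file.

```python
def parse_header(lines):
    """Turn lines like '#param1 : val1' into {'param1': 'val1'}."""
    params = {}
    for line in lines:
        # drop the leading '#' and the surrounding whitespace/newline
        text = line.lstrip('#').strip()
        # split on the first ':' only, so values may contain colons
        key, sep, value = text.partition(':')
        if sep:  # skip comment lines that have no ':' separator
            params[key.strip()] = value.strip()
    return params


# sample lines in the format shown in the question (assumed input)
header = ['#param1 : val1\n', '#param2 : val2\n', '#param3 : val3\n']
params = parse_header(header)
```

The values come back as strings, so convert them with int() or float() if the parameters are numeric.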
