Python: Pandas, dealing with spaced column names

Question

If I have multiple text files that I need to parse that look like so, but can vary in terms of column names, and the length of the hashtags above:

How would I go about turning this into a pandas dataframe? I've tried using pd.read_table('file.txt', delim_whitespace = True, skiprows = 14), but it has all sorts of problems. My issues are...

All the text, asterisks, and pounds at the top needs to be ignored, but I can't just use skip rows because the size of all the junk up top can vary in length in another file.

The columns "stat (+/-)" and "syst (+/-)" are seen as 4 columns because of the whitespace.

The one pound sign is included in the column names, and I don't want that. I can't just assign the column names manually because they vary from text file to text file.

Any help is much obliged, I'm just not really sure where to go from after I read the file using pandas.

Zeugma · Accepted Answer

This is the way I'm mentioning in the comment: it uses a file object to skip the custom dirty data you need to skip at the beginning. You land the file offset at the appropriate location in the file where read_fwf simply does the job:

with open(rawfile, 'r') as data_file:
    while(data_file.read(1)=='#'):
        last_pound_pos = data_file.tell()
        data_file.readline()
    data_file.seek(last_pound_pos)
    df = pd.read_fwf(data_file)

df
Out[88]: 
   i      mult  stat (+/-)  syst (+/-)        Q2         x       x.1       Php
0  0  0.322541    0.018731    0.026681  1.250269  0.037525  0.148981  0.104192
1  1  0.667686    0.023593    0.033163  1.250269  0.037525  0.150414  0.211203
2  2  0.766044    0.022712    0.037836  1.250269  0.037525  0.149641  0.316589
3  3  0.668402    0.024219    0.031938  1.250269  0.037525  0.148027  0.415451
4  4  0.423496    0.020548    0.018001  1.250269  0.037525  0.154227  0.557743
5  5  0.237175    0.023561    0.007481  1.250269  0.037525  0.159904  0.750544

Python: Pandas, dealing with spaced column names

Answers (2)

Related Questions