Reading a variable column multiple header csv file in python

Question

I have a CSV file similar to this example below, and I am having trouble trying to read it into python. This file comes from an x-ray reader. It creates mini-tables each time you use it, and groups them into one CSV.

The problem is that each mini-table can come from 2 different methods, and have different number of headers.

ID Method a1 a2 a3 a4 a6 a7 a8 a9
01 Soil   2  3  4  5  3  6  4  8
02 Soil   6  9  2  7  4  3  4  4
03 Soil   5  2  4  9  1  1  2  3

ID Method a1 a5 a6 a7 
01 Datas   5  4  7  8
02 Datas   4  6  3  8
03 Datas   6  3  2  8

ID Method a1 a5 a6 a7 a8 a9 a10 a11
01 Soil   5  4  7  8  2  1   3   4
02 Soil   4  6  3  8  7  2   2   1

ID Method a1 a5 a6 a7 a8 a9
01 Datas   5  4  7  8  2  1
02 Datas   4  6  3  8  7  2
03 Datas   6  3  2  8  9  9
03 Datas   5  3  2  2  7  7

My goal is to split this messy CSV file into two new CSVs depending on the "method" column, containing only one header. Filling with 0s where necessary. Something similar to this:

ID Method a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11
01 Soil   2  3  4  5  0  3  6  4  8  0   0
02 Soil   6  9  2  7  0  4  3  4  4  0   0
03 Soil   5  2  4  9  0  1  1  2  3  0   0
01 Soil   5  0  0  0  4  7  8  2  1  3   4
02 Soil   4  0  0  0  6  3  8  7  2  2   1

Any help or ideas on how to approach this problem would be very welcome! I am truly lost on how to solve this.

The original is similar to this:

CSV file GDrive

Thanks in advance!

phi · Accepted Answer

We can process the file line by line

import pandas as pd

data = []
with open('pxrf.csv', 'r') as f:
    for s in f.readlines():
        s = s.strip()
        if not s:  # empty line
            header = None
            continue 
        if s.startswith('File #'):  # Header line
            header = s.split(',')
        else:  # Content line
            data.append(dict(zip(header, s.split(','))))
df = pd.DataFrame(data).fillna('0')

You may have to change data types of some columns.

Reading a variable column multiple header csv file in python

Answers (2)

Related Questions