Get values from specific columns in different csv files, concat. and create column based on filename in python

Question

I have multiple (more than 100) files like this:

filename: 00.csv

residue, vwd, total  
AAA,0.00, -9.45  
BBB, 0.45, -1.45  
CCC, 0.44, -3    
DDD, 0.1, -10

filename: 01.csv

residue, vwd, total  
AAA, 2, -0.56  
BBB, -4, -9.32  
CCC, 2.54, -10  
DDD, 3, -6.4

...

I would like to create a matrix in a new CSV file where the first column is "residue" and every other column is named after a file (without the extension) and holds that file's values from the "total" column. It would look like this:

residue, 00, 01, ...      
AAA, -9.45, -0.56, ...  
BBB, -1.45, -9.32, ...  
CCC, -3, -10,...  
DDD,  -10, -6.4, ...

. . .

Thanks in advance!

user17242583 · Accepted Answer

This will work:

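import os
import pandas as pd

files = ['00.csv', '01.csv']

dfs = []
for file in files:
    df = pd.read_csv(file)
    # Header names carry stray spaces (e.g. ' total'), so strip them
    df.columns = df.columns.str.strip()
    # Keep residue and total, renaming total to the file name without extension
    df = df[['residue', 'total']].rename({'total': os.path.splitext(file)[0]}, axis=1)
    dfs.append(df)

# Merge the per-file frames one by one on the shared residue column
df = dfs[0]
for sub_df in dfs[1:]:
    df = df.merge(sub_df, on='residue')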

Output:

>>> df
  residue     00     01
0     AAA  -9.45  -0.56
1     BBB  -1.45  -9.32
2     CCC  -3.00 -10.00
3     DDD -10.00  -6.40
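
With more than 100 files, listing the names by hand gets tedious. Below is a minimal sketch of one possible variation, assuming all files can be matched by a single glob pattern and each residue appears at most once per file. It collects the files with glob and lines up the "total" columns with pd.concat instead of repeated merges. The function name build_total_matrix is hypothetical, not part of pandas.

```python
import glob
import os

import pandas as pd


def build_total_matrix(pattern):
    """Combine the 'total' column of every CSV matching `pattern` (hypothetical helper)."""
    columns = []
    for path in sorted(glob.glob(pattern)):
        # skipinitialspace drops the blank that follows each comma in the sample files
        df = pd.read_csv(path, skipinitialspace=True)
        # Remove any remaining whitespace around the header names
        df.columns = df.columns.str.strip()
        # Column name = file name without directory or extension, e.g. '00'
        name = os.path.splitext(os.path.basename(path))[0]
        # Assumption: residue values are unique within a file
        columns.append(df.set_index("residue")["total"].rename(name))
    # Align every Series on the residue index, giving one column per file
    return pd.concat(columns, axis=1).reset_index()
```

Something like build_total_matrix("*.csv").to_csv("out/matrix.csv", index=False) would then write the result to a new file. The output path here is only an example; it should not match the input pattern, or the result will be picked up as an input on the next run. Unlike merge, which by default keeps only residues present in every file, concat keeps a residue that is missing from some files and fills the gaps with NaN.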
