ohmmega
ohmmega

Reputation: 55

Restructuring CSV into Pandas DataFrame

I've got a CSV with a rather messy format:

t, 01_x, 01_y, 02_x, 02_y
0,     0,    1,    ,
1,     1,    1,   0,    0  

Thereby "01_" and "02_" are numbers of entities (1, 2), which can vary from file to file and there might be additional columns too (but at least the same for all entities). Note also that entity 2 enters the scene at t=1 (no entries at t=0).

I already import the CSV into a pandas dataframe, but don't see the way to transform the stuff into the following form:

t, entity, x, y
0,      1, 0, 1
1,      1, 1, 1
1       2, 0, 0

Is there a simple (pythonic) way to transform that?

Thanks! René

Upvotes: 1

Views: 104

Answers (1)

ALollz
ALollz

Reputation: 59519

This is wide_to_long, but we need to first swap the order of your column names around the '_'

df.columns = ['_'.join(x.split('_')[::-1]) for x in df.columns]
#Index(['t', 'x_01', 'y_01', 'x_02', 'y_02'], dtype='object')

(pd.wide_to_long(df, i='t', j='entity', stubnames=['x', 'y'], sep='_')
   .dropna()
   .reset_index())

   t  entity    x    y
0  0       1  0.0  1.0
1  1       1  1.0  1.0
2  1       2  0.0  0.0

Upvotes: 1

Related Questions