Reputation: 55
I've got a CSV with a rather messy format:
t, 01_x, 01_y, 02_x, 02_y
0, 0, 1, ,
1, 1, 1, 0, 0
Thereby "01_" and "02_" are numbers of entities (1, 2), which can vary from file to file and there might be additional columns too (but at least the same for all entities). Note also that entity 2 enters the scene at t=1 (no entries at t=0).
I already import the CSV into a pandas dataframe, but don't see the way to transform the stuff into the following form:
t, entity, x, y
0, 1, 0, 1
1, 1, 1, 1
1 2, 0, 0
Is there a simple (pythonic) way to transform that?
Thanks! René
Upvotes: 1
Views: 104
Reputation: 59519
This is wide_to_long
, but we need to first swap the order of your column names around the '_'
df.columns = ['_'.join(x.split('_')[::-1]) for x in df.columns]
#Index(['t', 'x_01', 'y_01', 'x_02', 'y_02'], dtype='object')
(pd.wide_to_long(df, i='t', j='entity', stubnames=['x', 'y'], sep='_')
.dropna()
.reset_index())
t entity x y
0 0 1 0.0 1.0
1 1 1 1.0 1.0
2 1 2 0.0 0.0
Upvotes: 1