Reputation: 209
I have a dataframe with the following structure:
id timestamp_str y x
2569 Jan.2018 4 9
2569 Feb.2018 6 5
2569 Mar.2018 3 9
2569 Apr.2018 2 10
2569 May.2018 6 9
3679 Jan.2018 14 52
3679 Feb.2018 62 4
3679 Mar.2018 5 89
3679 Apr.2018 9 4
3679 May.2018 32 9
.
.
.
and I would like to have this data into the following format:
id parameter Jan.2018 Feb.2018 Mar.2018 Apr.2018 May.2018
2569 y 4 6 3 2 6
2569 x 9 5 9 10 9
3679 y 14 62 5 9 32
3679 x 52 4 89 4 9
But so far I was not able to change the format, I tried something like this
df = (df.set_index(['id',df.groupby(['id']).cumcount().add(1)])
.unstack()
.sort_index(axis=1, level=[1, 0], ascending=[True, False]))
df.columns = df.columns.map(lambda x: f'{x[0]}{x[1]}')
df = df.reset_index()
Thanks for any hints,
cheers!
Upvotes: 1
Views: 36
Reputation: 863226
If need solutions without convert to datetimes:
Use DataFrame.stack
with Series.unstack
for reshape, set original ordering of datetimes in DataFrame.reindex
:
df = (df.set_index(['id_x','timestamp_str'])
.stack()
.unstack(1)
.reindex(df['timestamp_str'].unique(),axis=1)
.rename_axis(index=['id','parameter'], columns=None)
.reset_index())
print (df)
id parameter Jan.2018 Feb.2018 Mar.2018 Apr.2018 May.2018
0 2569 y 4 6 3 2 6
1 2569 x 9 5 9 10 9
2 3679 y 14 62 5 9 32
3 3679 x 52 4 89 4 9
Or like mentioned @sammywemmy DataFrame.melt
with DataFrame.pivot
:
df = (df.melt(['id_x', 'timestamp_str'])
.pivot(['id_x', 'variable'], 'timestamp_str', 'value')
.reindex(df['timestamp_str'].unique(),axis=1)
.rename_axis(index=['id','parameter'], columns=None)
.reset_index())
print (df)
id parameter Jan.2018 Feb.2018 Mar.2018 Apr.2018 May.2018
0 2569 x 9 5 9 10 9
1 2569 y 4 6 3 2 6
2 3679 x 52 4 89 4 9
3 3679 y 14 62 5 9 32
Solutions with convert to datetimes - first convert column timestamp_str
to datetimes, apply solution without reindex
and add rename
for original format of datetimes
:
df['timestamp_str'] = pd.to_datetime(df['timestamp_str'], format='%b.%Y')
df = (df.set_index(['id_x','timestamp_str'])
.stack()
.unstack(1)
.rename_axis(index=['id','parameter'], columns=None)
.rename(columns= lambda x: x.strftime('%b.%Y'))
.reset_index())
df['timestamp_str'] = pd.to_datetime(df['timestamp_str'], format='%b.%Y')
df = (df.melt(['id_x', 'timestamp_str'])
.pivot(['id_x', 'variable'], 'timestamp_str', 'value')
.rename_axis(index=['id','parameter'], columns=None)
.rename(columns= lambda x: x.strftime('%b.%Y'))
.reset_index())
Upvotes: 1