Reputation: 59
I have a pandas dataframe that looks like this:
Year A B C D
1999 1 3 5 7
2000 11 13 17 19
2001 23 29 31 37
And I want it to look like this:
Year Type Value
1999 A 1
1999 B 3
1999 C 5
1999 D 7
2000 A 11
2000 B 13
Etc. Is there a way to do this and if so, how?
Upvotes: 1
Reputation: 863651
First use set_index, then stack and rename_axis, and finally reset_index:
df = df.set_index('Year').stack().rename_axis(('Year','Type')).reset_index(name='Value')
print (df)
Year Type Value
0 1999 A 1
1 1999 B 3
2 1999 C 5
3 1999 D 7
4 2000 A 11
5 2000 B 13
6 2000 C 17
7 2000 D 19
8 2001 A 23
9 2001 B 29
10 2001 C 31
11 2001 D 37
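For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question, and the names `wide` and `long_df` are mine rather than from the post above.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
wide = pd.DataFrame({
    'Year': [1999, 2000, 2001],
    'A': [1, 11, 23],
    'B': [3, 13, 29],
    'C': [5, 17, 31],
    'D': [7, 19, 37],
})

long_df = (
    wide.set_index('Year')               # Year becomes the row index
        .stack()                         # columns A..D become an inner index level
        .rename_axis(['Year', 'Type'])   # name both index levels
        .reset_index(name='Value')       # turn the levels back into columns
)
print(long_df)
```

Working on a copy named `wide` instead of reassigning `df` keeps the original wide frame around, which helps when trying the other approaches on the same data.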
Or use melt, but the rows then come out in a different order:
df = df.melt('Year', var_name='Type', value_name='Value')
print (df)
Year Type Value
0 1999 A 1
1 2000 A 11
2 2001 A 23
3 1999 B 3
4 2000 B 13
5 2001 B 29
6 1999 C 5
7 2000 C 17
8 2001 C 31
9 1999 D 7
10 2000 D 19
11 2001 D 37
... so sorting is necessary:
df = (df.melt('Year', var_name='Type', value_name='Value')
        .sort_values(['Year','Type'])
        .reset_index(drop=True))
print (df)
Year Type Value
0 1999 A 1
1 1999 B 3
2 1999 C 5
3 1999 D 7
4 2000 A 11
5 2000 B 13
6 2000 C 17
7 2000 D 19
8 2001 A 23
9 2001 B 29
10 2001 C 31
11 2001 D 37
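As a rough sanity check, the sorted melt result can be compared with the stack result on the sample data. This is only a sketch: the frame is rebuilt from the question, and `wide`, `melted`, `stacked` and `same` are illustrative names I chose.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
wide = pd.DataFrame({
    'Year': [1999, 2000, 2001],
    'A': [1, 11, 23],
    'B': [3, 13, 29],
    'C': [5, 17, 31],
    'D': [7, 19, 37],
})

# melt stacks one column after another, so sort to group the rows by year
melted = (
    wide.melt(id_vars='Year', var_name='Type', value_name='Value')
        .sort_values(['Year', 'Type'])
        .reset_index(drop=True)
)

# the set_index/stack route, for comparison
stacked = (
    wide.set_index('Year')
        .stack()
        .rename_axis(['Year', 'Type'])
        .reset_index(name='Value')
)

# compare the raw row values of the two results
same = melted.values.tolist() == stacked.values.tolist()
print(same)
```

Note that sorting on Type orders the rows alphabetically within each year, which matches the original column order here only because the columns are already A, B, C, D.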
Numpy solution:
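import numpy as np
import pandas as pd

a = np.repeat(df['Year'], len(df.columns.difference(['Year'])))
b = np.tile(df.columns.difference(['Year']), len(df.index))
# drop the Year column by keyword, then flatten the remaining values row by row
c = df.drop(columns='Year').values.ravel()
df = pd.DataFrame(np.column_stack([a,b,c]), columns=['Year','Type','Value'])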
print (df)
Year Type Value
0 1999 A 1
1 1999 B 3
2 1999 C 5
3 1999 D 7
4 2000 A 11
5 2000 B 13
6 2000 C 17
7 2000 D 19
8 2001 A 23
9 2001 B 29
10 2001 C 31
11 2001 D 37
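One thing to be aware of with np.column_stack: mixing the integer and string arrays produces a single object array, so all three columns of the result end up with object dtype. A possible variant, sketched below under my own naming (`wide`, `value_cols`, `long_df`), builds the frame from a dict so each column keeps its own dtype. It also takes the value columns with `columns.drop`, which keeps their original order instead of sorting them.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
wide = pd.DataFrame({
    'Year': [1999, 2000, 2001],
    'A': [1, 11, 23],
    'B': [3, 13, 29],
    'C': [5, 17, 31],
    'D': [7, 19, 37],
})

value_cols = wide.columns.drop('Year')      # value columns in their original order
n_rows, n_cols = len(wide), len(value_cols)

long_df = pd.DataFrame({
    'Year': np.repeat(wide['Year'].to_numpy(), n_cols),   # each year once per value column
    'Type': np.tile(value_cols.to_numpy(), n_rows),       # column names cycled once per row
    'Value': wide[value_cols].to_numpy().ravel(),         # values flattened row by row
})
print(long_df.dtypes)
```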
Upvotes: 3
Reputation: 323386
You can also rebuild the DataFrame directly from its pieces:
pd.DataFrame({'Year':df.Year.repeat((df.shape[1]-1)),'Type':list(df)[1:]*len(df),'Value':np.concatenate(df.iloc[:,1:].values)})
Out[95]:
Type Value Year
0 A 1 1999
0 B 3 1999
0 C 5 1999
0 D 7 1999
1 A 11 2000
1 B 13 2000
1 C 17 2000
1 D 19 2000
2 A 23 2001
2 B 29 2001
2 C 31 2001
2 D 37 2001
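The repeated index labels (0, 0, 0, 0, 1, ...) in the output come from `Year.repeat`, which keeps the original row labels. If a fresh 0..11 index is wanted, one option is to append `reset_index(drop=True)`, as in this sketch. It assumes Year is the first column, as in the question; `wide`, `n_value_cols` and `rebuilt` are names I made up for the example.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
wide = pd.DataFrame({
    'Year': [1999, 2000, 2001],
    'A': [1, 11, 23],
    'B': [3, 13, 29],
    'C': [5, 17, 31],
    'D': [7, 19, 37],
})

n_value_cols = wide.shape[1] - 1   # every column except Year

rebuilt = pd.DataFrame({
    'Year': wide['Year'].repeat(n_value_cols),               # keeps labels 0,0,0,0,1,...
    'Type': list(wide)[1:] * len(wide),                      # column names after Year, cycled
    'Value': np.concatenate(wide.iloc[:, 1:].to_numpy()),    # rows joined end to end
}).reset_index(drop=True)                                    # replace the repeated labels
print(rebuilt)
```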
Upvotes: 3