Jonnie Marbles

Reputation: 59

How do I turn column headings into a column?

I have a pandas dataframe that looks like this:

Year  A  B  C  D   
1999  1  3  5  7
2000  11 13 17 19
2001  23 29 31 37

And I want it to look like this:

Year  Type  Value
1999  A     1
1999  B     3
1999  C     5
1999  D     7
2000  A     11
2000  B     13

And so on. Is there a way to do this and, if so, how?

Upvotes: 1

Views: 38

Answers (2)

jezrael

Reputation: 863651

Use set_index first, then stack and rename_axis, and finally reset_index:

df = df.set_index('Year').stack().rename_axis(('Year','Type')).reset_index(name='Value')
print (df)
    Year Type  Value
0   1999    A      1
1   1999    B      3
2   1999    C      5
3   1999    D      7
4   2000    A     11
5   2000    B     13
6   2000    C     17
7   2000    D     19
8   2001    A     23
9   2001    B     29
10  2001    C     31
11  2001    D     37
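
To run this end to end, here is a minimal sketch that first rebuilds the frame from the question. The name long_df is only for illustration, and a list is passed to rename_axis in place of the tuple:

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    'Year': [1999, 2000, 2001],
    'A': [1, 11, 23],
    'B': [3, 13, 29],
    'C': [5, 17, 31],
    'D': [7, 19, 37],
})

long_df = (
    df.set_index('Year')               # keep Year out of the reshaping
      .stack()                         # column labels become an inner index level
      .rename_axis(['Year', 'Type'])   # name both index levels
      .reset_index(name='Value')       # turn the levels back into columns
)
print(long_df)
```

Because stack works row by row, the result is already ordered by Year and then by the original column order.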

Or use melt, although the rows then come out in a different order:

df = df.melt('Year', var_name='Type', value_name='Value')
print (df)
    Year Type  Value
0   1999    A      1
1   2000    A     11
2   2001    A     23
3   1999    B      3
4   2000    B     13
5   2001    B     29
6   1999    C      5
7   2000    C     17
8   2001    C     31
9   1999    D      7
10  2000    D     19
11  2001    D     37

... so sorting is necessary:

df = (df.melt('Year', var_name='Type', value_name='Value')
       .sort_values(['Year','Type'])
       .reset_index(drop=True))
print (df)
    Year Type  Value
0   1999    A      1
1   1999    B      3
2   1999    C      5
3   1999    D      7
4   2000    A     11
5   2000    B     13
6   2000    C     17
7   2000    D     19
8   2001    A     23
9   2001    B     29
10  2001    C     31
11  2001    D     37
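
One caveat: sort_values(['Year','Type']) orders Type alphabetically, which matches the original column order here only because A to D are already sorted. If the columns are not alphabetical, a possible workaround is an ordered Categorical. This is a sketch with a hypothetical two-column frame, and value_cols and long_df are illustrative names:

```python
import pandas as pd

# Hypothetical frame whose value columns are NOT in alphabetical order.
df = pd.DataFrame({'Year': [1999, 2000], 'B': [3, 13], 'A': [1, 11]})

# Value columns in the order they appear in the frame.
value_cols = [c for c in df.columns if c != 'Year']

long_df = df.melt('Year', var_name='Type', value_name='Value')

# An ordered categorical sorts by category order rather than alphabetically.
long_df['Type'] = pd.Categorical(long_df['Type'], categories=value_cols, ordered=True)

long_df = long_df.sort_values(['Year', 'Type']).reset_index(drop=True)
print(long_df)
```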

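NumPy solution:

import numpy as np
import pandas as pd

cols = df.columns.drop('Year')              # value columns, in their original order
a = np.repeat(df['Year'], len(cols))        # each year repeated once per value column
b = np.tile(cols, len(df.index))            # column labels cycled once per row
c = df.drop(columns='Year').values.ravel()  # values flattened row by row

df = pd.DataFrame(np.column_stack([a,b,c]), columns=['Year','Type','Value'])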
print (df)
    Year Type Value
0   1999    A     1
1   1999    B     3
2   1999    C     5
3   1999    D     7
4   2000    A    11
5   2000    B    13
6   2000    C    17
7   2000    D    19
8   2001    A    23
9   2001    B    29
10  2001    C    31
11  2001    D    37
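
Note that np.column_stack has to hold numbers and strings in one array, so it falls back to object dtype and the numeric columns lose their integer type. A minimal sketch of restoring them with astype, assuming Year and Value should both be int64 (stacked and long_df are illustrative names):

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    'Year': [1999, 2000, 2001],
    'A': [1, 11, 23],
    'B': [3, 13, 29],
    'C': [5, 17, 31],
    'D': [7, 19, 37],
})

cols = df.columns.drop('Year')

# Mixing integers and strings forces a single object-dtype array.
stacked = np.column_stack([
    np.repeat(df['Year'].to_numpy(), len(cols)),
    np.tile(cols.to_numpy(), len(df)),
    df[cols].to_numpy().ravel(),
])

long_df = pd.DataFrame(stacked, columns=['Year', 'Type', 'Value'])

# Convert the numeric columns back to integers.
long_df = long_df.astype({'Year': 'int64', 'Value': 'int64'})
print(long_df.dtypes)
```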

Upvotes: 3

BENY

Reputation: 323386

You can recreate your DataFrame directly:

pd.DataFrame({'Year':df.Year.repeat((df.shape[1]-1)),'Type':list(df)[1:]*len(df),'Value':np.concatenate(df.iloc[:,1:].values)})

Out[95]: 
  Type  Value  Year
0    A      1  1999
0    B      3  1999
0    C      5  1999
0    D      7  1999
1    A     11  2000
1    B     13  2000
1    C     17  2000
1    D     19  2000
2    A     23  2001
2    B     29  2001
2    C     31  2001
2    D     37  2001
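
The index in the output above repeats (0, 0, 0, 0, 1, ...) because Series.repeat carries the original row labels along. If a clean 0..n-1 index is wanted, one option is to build every column from plain arrays. This is a sketch that, like the one-liner, assumes Year is the first column; value_cols and long_df are illustrative names:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    'Year': [1999, 2000, 2001],
    'A': [1, 11, 23],
    'B': [3, 13, 29],
    'C': [5, 17, 31],
    'D': [7, 19, 37],
})

value_cols = list(df.columns[1:])  # assumes 'Year' is the first column

long_df = pd.DataFrame({
    # Plain NumPy arrays carry no index, so pandas assigns a fresh RangeIndex.
    'Year': np.repeat(df['Year'].to_numpy(), len(value_cols)),
    'Type': value_cols * len(df),
    'Value': df[value_cols].to_numpy().ravel(),
})
print(long_df)
```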

Upvotes: 3
