Reputation: 380
I have the below table in excel (xls):
Quarter Q1 Q1 Q1
Year 2013 2014 2015
Company A A A
PA 500 300 963
NY 200 150 1500
What I would like to create is the below table by using Python Pandas:
Quarter Year State Company Amount
Q1 2013 PA A 500
Q1 2014 PA A 300
Q1 2015 PA A 963
Q1 2013 NY A 200
Q1 2014 NY A 150
Q1 2015 NY A 1500
Edit: I have used the below code per @jezrael suggestion:
import pandas as p
df = p.read_excel('test.xls')
df = df.set_index('0')
df = df.T.stack().reset_index(name='idx')
df.columns = ['Quarter', 'Year', 'Company', 'State', 'Amount']
df = df[['Quarter', 'Year', 'State', 'Company', 'Amount']]
df = df.sort_values(by=['State'], ascending=['False']).reset_index(drop=True)
print(df)
I kept getting the below error:
File "pandas\hashtable.pyx", line 676, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12216)
KeyError: '0'
I have also tried with 0, the same error with KeyError: 0
Upvotes: 1
Views: 138
Reputation: 862731
Very close what you want:
df = pd.read_excel('test.xls', header=None, index_col=0)
print df
# 1 2 3
#0
#Quarter Q1 Q1 Q1
#Year 2013 2014 2015
#Company A A A
#PA 500 300 963
#NY 200 150 1500
#set indexes
df = df.T.set_index(['Quarter', 'Year', 'Company'])
#stack and reset index
df = df.stack().reset_index(name='idx')
#rename and reorder columns
df.columns = ['Quarter', 'Year', 'Company', 'State', 'Amount']
df = df[['Quarter', 'Year', 'State', 'Company', 'Amount']]
#sort by column State
df = df.sort_values(by=['State'], ascending=['False']).reset_index(drop=True)
print df
# Quarter Year State Company Amount
#0 Q1 2013 NY A 200
#1 Q1 2014 NY A 150
#2 Q1 2015 NY A 1500
#3 Q1 2013 PA A 500
#4 Q1 2014 PA A 300
#5 Q1 2015 PA A 963
Upvotes: 1