Reputation: 199
I have a data frame that every row contains an office location object with several attribute like Global Region
,Primary Function
, and several energy consumption data as numerical values followed. The names of all columns is like below:
['Global Region',
'Primary Function',
'Subsidiaries',
'T&D Loss Rate Category',
'Type',
'Ref',
'Acquisition date',
'Disposal date',
'Corporate Admin Approver',
'Data Providers',
'Initiative administrator',
'Initiative approver',
'Initiative user',
'Invoice owner',
'Apr to Jun 2012',
'Jul to Sep 2012',
'Oct to Dec 2012',
'Jan to Mar 2013',
'Apr to Jun 2013',
'Jul to Sep 2013',
'Oct to Dec 2013',
'Jan to Mar 2014',
'Apr to Jun 2014',
'Jul to Sep 2014',
'Oct to Dec 2014',
'Jan to Mar 2015',
'Apr to Jun 2015',
'Jul to Sep 2015',
'Oct to Dec 2015',
'Jan to Mar 2016']
How can I sort different locations and view the data based on different attributes, e.g. based on primary function
or global region
. I can see the average energy consumption or rank energy intensity of all locations whose primary functio
n is R&D.
I thought of multi index, but I didn't know how to do it. I tried this:
test = xls.parse('Sheet1',index_col=['Lenovo Global Region','Primary Function', 'Subsidiaries', 'Type','Acquisition date','Disposal date','Country'])
It didn't work, error said I could only use numbers not string, so I tried this:
test = xls.parse('Sheet1',index_col=0,1,3,4,5,7,9,10)
Still didn't work. Anyone has good suggestions?
Upvotes: 3
Views: 3185
Reputation: 863256
You can use read_excel
with parameter index_col
which contains list
of positions of necessary columns:
Sample:
df = pd.read_excel('test.xlsx', sheetname='Sheet1', index_col=[0,1,3])
print (df)
Subsidiaries Type Ref
Global Region Primary Function T&D Loss Rate Category
1 1 c a s 10
2 2 c b d 20
3 3 d c d 30
So if add []
it can works:
test = xls.parse('Sheet1',index_col=[0,1,3,4,5,7,9,10])
Upvotes: 2