Reputation: 1999
I created a multiIndex DataFrame by:
df.set_index(['Field1', 'Field2'], inplace=True)
If this is not a MultiIndex DataFrame, please tell me how to make one.
I want to:
How do I go about doing this?
ADDITIONAL INFO
I have a multiIndex dataFrame that looks like this:
Continent      Sector  Count
Asia           1       4
               2       1
Australia      1       1
Europe         1       1
               2       3
               3       2
North America  1       1
               5       1
South America  5       1
How can I return this as a Series with the index [Continent, Sector]?
Upvotes: 6
Views: 6785
Reputation: 863216
I think you need groupby with the aggregate size:
import pandas as pd

df = pd.DataFrame({'Field1':[1,1,1],
                   'Field2':[4,4,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})
df.set_index(['Field1', 'Field2'], inplace=True)
print (df)
               C  D  E  F
Field1 Field2
1      4       7  1  5  7
       4       8  3  3  4
       6       9  5  6  3
print (df.index)
MultiIndex(levels=[[1], [4, 6]],
           labels=[[0, 0, 0], [0, 0, 1]],
           names=['Field1', 'Field2'])
print (df.groupby(level=[0,1]).size())
Field1  Field2
1       4         2
        6         1
dtype: int64
print (df.groupby(level=['Field1', 'Field2']).size())
Field1  Field2
1       4         2
        6         1
dtype: int64
print (df.groupby(level=['Field1', 'Field2']).count())
               C  D  E  F
Field1 Field2
1      4       2  2  2  2
       6       1  1  1  1
See also: What is the difference between size and count in pandas?
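On the difference between size and count: as far as I know, size counts the rows in each group (NaN included) and returns a Series, while count counts the non-missing values per column and returns a DataFrame. The sample above has no missing values, so both give the same numbers. A minimal sketch, assuming one NaN is added to column C (that NaN is my addition for illustration, not part of the original data):

```python
import numpy as np
import pandas as pd

# Reduced version of the sample frame, with one missing value in column C.
# The NaN is an illustrative assumption, not part of the original data.
df = pd.DataFrame({'Field1': [1, 1, 1],
                   'Field2': [4, 4, 6],
                   'C': [7, np.nan, 9],
                   'D': [1, 3, 5]})
df = df.set_index(['Field1', 'Field2'])

grouped = df.groupby(level=['Field1', 'Field2'])

sizes = grouped.size()    # Series: rows per group, NaN rows included
counts = grouped.count()  # DataFrame: non-NaN values per column

print(sizes)
print(counts)
```

For the group (1, 4), size still reports both rows, while count for column C drops the missing value.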
EDIT by comment:
df.set_index(['Continent', 'Sector'], inplace=True)
print (df)
                      Count
Continent     Sector
Asia          1           4
              2           1
Australia     1           1
Europe        1           1
              2           3
              3           2
North America 1           1
              5           1
South America 5           1
print (df['Count'])
Continent      Sector
Asia           1         4
               2         1
Australia      1         1
Europe         1         1
               2         3
               3         2
North America  1         1
               5         1
South America  5         1
Name: Count, dtype: int64
Or:
print (df.squeeze())
Continent      Sector
Asia           1         4
               2         1
Australia      1         1
Europe         1         1
               2         3
               3         2
North America  1         1
               5         1
South America  5         1
Name: Count, dtype: int64
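One caveat with squeeze: it only collapses axes of length 1, so it returns a Series here because the frame has exactly one column. A small sketch of the difference, assuming a hypothetical second column named Other (not in the original data):

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([('Asia', 1), ('Asia', 2)],
                                names=['Continent', 'Sector'])
one_col = pd.DataFrame({'Count': [4, 1]}, index=idx)

# 'Other' is a hypothetical extra column, added only for illustration.
two_col = one_col.assign(Other=[10, 20])

s1 = one_col.squeeze()   # single column: collapses to a Series
s2 = two_col.squeeze()   # two columns: nothing to collapse, stays a DataFrame
s3 = two_col['Count']    # explicit column selection gives a Series either way

print(type(s1).__name__, type(s2).__name__, type(s3).__name__)
```

Selecting the column by name is the safer choice when the number of columns is not known in advance.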
All together with set_index:
print (df)
       Continent  Sector  Count
0           Asia       1      4
1           Asia       2      1
2      Australia       1      1
3         Europe       1      1
4         Europe       2      3
5         Europe       3      2
6  North America       1      1
7  North America       5      1
8  South America       5      1
print (df.set_index(['Continent', 'Sector'])['Count'])
Continent      Sector
Asia           1         4
               2         1
Australia      1         1
Europe         1         1
               2         3
               3         2
North America  1         1
               5         1
South America  5         1
Name: Count, dtype: int64
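To check that the result really is a Series with a [Continent, Sector] MultiIndex, and to show how it can be queried, here is a self-contained sketch that rebuilds the sample data from the question. The variable names s, europe and flat are mine:

```python
import pandas as pd

# Sample data copied from the flat frame shown above.
df = pd.DataFrame({
    'Continent': ['Asia', 'Asia', 'Australia', 'Europe', 'Europe', 'Europe',
                  'North America', 'North America', 'South America'],
    'Sector': [1, 2, 1, 1, 2, 3, 1, 5, 5],
    'Count': [4, 1, 1, 1, 3, 2, 1, 1, 1],
})

s = df.set_index(['Continent', 'Sector'])['Count']

print(isinstance(s, pd.Series))            # the result is a Series
print(isinstance(s.index, pd.MultiIndex))  # indexed by both levels

one_value = s.loc[('Europe', 2)]  # a single (Continent, Sector) pair
europe = s.loc['Europe']          # all sectors of one continent
flat = s.reset_index()            # back to three ordinary columns

print(one_value)
print(europe)
print(flat.columns.tolist())
```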
Upvotes: 5