Reputation: 139
After executing a df.size() call as seen below (df = DataFrame) in pandas, I get a new column of values beside the one labeled No. However, I'm not sure how to manipulate this new column, because I don't know its label/key.
For example, I want to express the values generated (in the new column) as a fraction of the sum of all these values in a new column. How can I do so?
JuncNo = pd.read_csv(filename)
JuncNo_group = JuncNo.groupby('No.')
JuncSize = JuncNo_group.size()
JuncSize.head(n=6)
No.
1     122
2    2136
3     561
4      91
5      10
6       3
dtype: int64
Upvotes: 1
Views: 73
Reputation: 863361
The result of size() is a Series whose values have no column label. You have to set the name of the new Series and reset the index:
JuncSize = JuncSize.groupby('No').size()
JuncSize.name = 'size'
JuncSize = JuncSize.reset_index()
print(JuncSize)
But if you need to add a new column with the same number of rows as the original DataFrame, you can use:
JuncSize['size'] = JuncSize.groupby('No')['No'].transform('size')
Example:
print(JuncSize)
  No Code
0  D   B2
1  B   B2
2  B   B3
3  B   B3
4  G   B3
5  B   B3
JuncSize['size'] = JuncSize.groupby('No')['No'].transform('size')
print(JuncSize)
  No Code  size
0  D   B2     1
1  B   B2     4
2  B   B3     4
3  B   B3     4
4  G   B3     1
5  B   B3     4
JuncSize = JuncSize.groupby('No').size()
print(JuncSize)
No
B    4
D    1
G    1
JuncSize.name = 'size'
print(JuncSize)
No
B    4
D    1
G    1
Name: size, dtype: int64
JuncSize = JuncSize.reset_index()
print(JuncSize)
  No  size
0  B     4
1  D     1
2  G     1
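To express each size as a fraction of the total, as asked in the question, divide the named column by its sum. Here is a minimal sketch, assuming a small made-up DataFrame in place of the CSV from the question; the sample values and the column name 'fraction' are only for illustration:

```python
import pandas as pd

# Hypothetical data standing in for pd.read_csv(filename) from the question
JuncNo = pd.DataFrame({'No.': [1, 1, 2, 2, 2, 3]})

# Count rows per group; reset_index(name=...) names the count column
# and turns the group keys back into an ordinary column in one step
JuncSize = JuncNo.groupby('No.').size().reset_index(name='size')

# Each group's size as a fraction of the sum of all sizes
JuncSize['fraction'] = JuncSize['size'] / JuncSize['size'].sum()

print(JuncSize)
```

If only the fractions are needed, JuncNo['No.'].value_counts(normalize=True) should give them directly as a Series, ordered by frequency rather than by key.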
Upvotes: 1