Happytreat

Reputation: 139

Manipulate a new column after 'df.size()' function?

After calling size() on a groupby object, as shown below (JuncNo is a pandas DataFrame), I get a new column of counts beside the one labeled No.. However, I'm not sure how to manipulate this new column, because I don't know its label/key.

For example, I want to add another column that expresses each of these counts as a fraction of the sum of all the counts. How can I do so?

JuncNo = pd.read_csv(filename)
JuncNo_group = JuncNo.groupby('No.')
JuncSize = JuncNo_group.size()
JuncSize.head(n=6)
No.
1   122
2  2136 
3   561
4    91
5    10
6     3
dtype: int64

Upvotes: 1

Views: 73

Answers (1)

jezrael

Reputation: 863361

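The result of size() is a Series, not a DataFrame, so the counts have no column label. You have to set the name of the new Series and then reset the index: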

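# Count the rows in each 'No' group; the result is a Series indexed by 'No'
JuncSize = JuncSize.groupby('No').size()
# Name the Series so the counts get a column label after reset_index()
JuncSize.name = 'size'
JuncSize = JuncSize.reset_index()
print(JuncSize)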

But if you need to add a new column with the same number of rows as the original DataFrame, you can use transform:

JuncSize['size'] = JuncSize.groupby('No')['No'].transform('size')

Example:

print(JuncSize)
  No Code
0  D   B2
1  B   B2
2  B   B3
3  B   B3
4  G   B3
5  B   B3

JuncSize['size'] = JuncSize.groupby('No')['No'].transform('size')
print(JuncSize)
  No Code size
0  D   B2    1
1  B   B2    4
2  B   B3    4
3  B   B3    4
4  G   B3    1
5  B   B3    4

JuncSize = JuncSize.groupby('No').size()
print(JuncSize)
No
B    4
D    1
G    1

JuncSize.name = 'size'
print(JuncSize)
No
B    4
D    1
G    1
Name: size, dtype: int64

JuncSize = JuncSize.reset_index()
print(JuncSize)
  No  size
0  B     4
1  D     1
2  G     1
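
Once the counts have a label, the fraction asked about in the question could be computed by dividing the column by its own sum. The following is only a minimal sketch: the sample data and the names junc, junc_size and fraction are made up for illustration, and it uses reset_index(name=...) as a one-step alternative to setting the Series name first.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's CSV;
# only the 'No.' column matters for this sketch.
junc = pd.DataFrame({'No.': [1, 2, 2, 2, 3, 3]})

# size() returns a Series indexed by 'No.'; reset_index(name=...) turns it
# into a DataFrame whose count column is labelled 'size'.
junc_size = junc.groupby('No.').size().reset_index(name='size')

# Express each count as a fraction of the total of all counts.
junc_size['fraction'] = junc_size['size'] / junc_size['size'].sum()

print(junc_size)
```

The fractions sum to 1 by construction, so the new column can be read directly as each group's share of all rows.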

Upvotes: 1
