AgentArachnid

Reputation: 43

Value counts by group in pandas

I'm new to pandas and want to get the number of instances for each person and feed it into another DataFrame as a column. I removed the NaN values from the DataFrame before grouping by the user column.

I've tried this, but it doesn't seem to work:

DF["NumInstances"] = userGrp["user"].value_counts()

I've looked around the internet but can't find a solution. Please help.

Edit: Sample Data and Expected Outcome

[{"user" : "4",
"Instance": "21"},
 {"user" : "4",
"Instance": "6"},
{"user" : "5",
"Instance" : "546453"}]

Expected outcome:

DataFrame =

[{"user":"4",
 "NumInstances" : "2"},
 {"user":"5",
 "NumInstances" : "1"}]

So basically, it counts how many instances occur for each user across the data entries.

Upvotes: 2

Views: 353

Answers (3)

Mayank Porwal

Reputation: 34056

Based on your sample input, you can do this:

In [2535]: df = pd.DataFrame([{"user" : "4", 
      ...: "Instance": "21"}, 
      ...:  {"user" : "4", 
      ...: "Instance": "6"}, 
      ...: {"user" : "5", 
      ...: "Instance" : "546453"}])  

In [2539]: df.groupby('user', as_index=False).count()
Out[2539]: 
  user  Instance
0    4         2
1    5         1
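Note that the count column above is still called "Instance", while the expected outcome in the question names it "NumInstances". A minimal sketch of one way to get that name, assuming the same sample data, is to chain a rename onto the same groupby call:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame([
    {"user": "4", "Instance": "21"},
    {"user": "4", "Instance": "6"},
    {"user": "5", "Instance": "546453"},
])

# Count the non-null Instance values per user, then rename the
# count column to the name used in the expected outcome.
counts = (
    df.groupby("user", as_index=False)
      .count()
      .rename(columns={"Instance": "NumInstances"})
)
print(counts)
```

The result has one row per user with the columns "user" and "NumInstances".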

Upvotes: 2

Flo

Reputation: 986

I used the following solution, which creates a new DataFrame containing two columns named "user" and "NumInstances":

df_counts = df.groupby(['user']).size().reset_index(name='NumInstances')

Hope it helps.
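Since the question asks about feeding the counts into another DataFrame as a column, here is a rough sketch of how that could look with a merge. The second DataFrame (`users`) and its contents are hypothetical, made up only for illustration; the sample data is from the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame([
    {"user": "4", "Instance": "21"},
    {"user": "4", "Instance": "6"},
    {"user": "5", "Instance": "546453"},
])

# One row per user with the number of rows in each group
df_counts = df.groupby(["user"]).size().reset_index(name="NumInstances")

# Hypothetical target DataFrame (not in the original question);
# user "6" has no rows in df.
users = pd.DataFrame({"user": ["4", "5", "6"]})

# A left merge keeps every row of `users`; users without any
# instances get NaN, which is replaced with 0 here.
merged = users.merge(df_counts, on="user", how="left")
merged["NumInstances"] = merged["NumInstances"].fillna(0).astype(int)
print(merged)
```

A left merge is used so that users missing from the counts are kept rather than dropped. Whether 0 is the right fill value depends on your data.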

Upvotes: 0

Filippo Sebastio

Reputation: 1112

If DF is the name of your DataFrame and "user" is the name of the column you want to group by, then try:

count = DF.groupby("user").count()

print(count)
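One thing to keep in mind: count() counts only non-null values in each column, while size() counts rows. The question says the NaN values were already removed, so both should agree there, but the small sketch below (with a missing value added to the sample data purely for illustration) shows where they differ:

```python
import pandas as pd

# Sample data from the question, with one Instance deliberately
# set to None to illustrate the difference (an assumption, not
# part of the original data).
DF = pd.DataFrame({
    "user": ["4", "4", "5"],
    "Instance": ["21", None, "546453"],
})

# count() skips missing values in the Instance column
by_count = DF.groupby("user")["Instance"].count()

# size() counts every row in the group, missing values included
by_size = DF.groupby("user").size()

print(by_count)
print(by_size)
```

For user "4", count() reports 1 while size() reports 2.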

Upvotes: 0
