shashank

Reputation: 410

Check if group contains element before forming it in Pandas Python

Below is a sample of my dataset:

name       status
google    Active
Facebook  Active
Tex       Active
Tex       WUP
Yout      Active

I am trying to create two DataFrames based on the count of each name (count = 1 and count > 1).

Code written:

# Single-occurrence DataFrame
df_single=pd.concat(g for _, g in df.groupby("name") if len(g) == 1)
# Multi-occurrence DataFrame
df_multi=pd.concat(g for _, g in df.groupby("name") if len(g) > 1)

The problem arises when I have data like this:

name       status
google    Active
Facebook  Active
Tex       Active

With this data, the following line fails:

df_multi=pd.concat(g for _, g in df.groupby("name") if len(g) > 1)

It fails with an error saying there is no data to concatenate. Can I check whether any group exists before calling concat?

Upvotes: 1

Views: 249

Answers (1)

jezrael

Reputation: 863216

I suggest another solution: use GroupBy.transform to get a Series of group sizes with the same length as the original DataFrame, which makes it possible to filter with boolean indexing:

s = df.groupby("name")['name'].transform('size')
print (s)
0    1
1    1
2    2
3    2
4    1
Name: name, dtype: int64

df_single = df[s == 1]
df_multi = df[s > 1]
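
To see why this avoids the error, here is a minimal sketch that runs the same approach on the second sample from the question, where every name occurs only once. The DataFrame construction is my assumption of how the data is loaded:

```python
import pandas as pd

# Second sample from the question: every name occurs exactly once.
df = pd.DataFrame({
    "name": ["google", "Facebook", "Tex"],
    "status": ["Active", "Active", "Active"],
})

# Size of each row's group, aligned to the original index.
s = df.groupby("name")["name"].transform("size")

df_single = df[s == 1]
df_multi = df[s > 1]  # no group has more than one row

# Boolean indexing returns an empty DataFrame instead of raising.
print(df_multi.empty)
```

Since no concat call is involved, there is nothing to fail when no group matches; df_multi simply has zero rows and the original columns.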

If you only want to filter by duplicates, it is simpler to create a boolean mask with Series.duplicated:

m = df['name'].duplicated(keep=False)
print (m)
0    False
1    False
2     True
3     True
4    False
Name: name, dtype: bool

df_single = df[~m]
df_multi = df[m]

print (df_single)
       name  status
0    google  Active
1  Facebook  Active
4      Yout  Active

print (df_multi)

  name  status
2  Tex  Active
3  Tex     WUP
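
If you would rather keep the original concat approach, one possible sketch is to collect the matching groups into a list first and only call concat when the list is not empty. The helper name concat_matching_groups and its parameters are hypothetical, introduced only for illustration:

```python
import pandas as pd

def concat_matching_groups(df, key, predicate):
    # Hypothetical helper: keep the groups for which predicate(group) is True.
    groups = [g for _, g in df.groupby(key) if predicate(g)]
    if not groups:
        # pd.concat raises ValueError on an empty list, so return an
        # empty frame with the same columns instead.
        return df.iloc[0:0]
    return pd.concat(groups)

# Second sample from the question: no name is repeated.
df = pd.DataFrame({
    "name": ["google", "Facebook", "Tex"],
    "status": ["Active", "Active", "Active"],
})

df_single = concat_matching_groups(df, "name", lambda g: len(g) == 1)
df_multi = concat_matching_groups(df, "name", lambda g: len(g) > 1)

print(df_multi.empty)
```

Note that groupby sorts by the key, so the rows of the concatenated result follow the sorted group order rather than the original row order. The boolean mask solutions above keep the original order.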

Upvotes: 2
