Reputation: 410
Below is a sample of my dataset:
name status
google Active
Facebook Active
Tex Active
Tex WUP
Yout Active
I am trying to create two DataFrames based on how many times each name occurs (exactly once, or more than once).
Code written:
# Single-occurrence DataFrame
df_single=pd.concat(g for _, g in df.groupby("name") if len(g) == 1)
# Multi-occurrence DataFrame
df_multi=pd.concat(g for _, g in df.groupby("name") if len(g) > 1)
The problem appears when I have data like this, where no name is repeated:
name status
google Active
Facebook Active
Tex Active
df_multi=pd.concat(g for _, g in df.groupby("name") if len(g) > 1)
This line fails, saying there is no data to concat, because no group has more than one row. Can I check whether any such group exists before calling concat?
Upvotes: 1
Views: 249
Reputation: 863216
I suggest another solution: use GroupBy.transform to get a Series with the same size as the original DataFrame, which makes filtering by boolean indexing possible:
s = df.groupby("name")['name'].transform('size')
print (s)
0 1
1 1
2 2
3 2
4 1
Name: name, dtype: int64
df_single = df[s == 1]
df_multi = df[s > 1]
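To check how this behaves on the problem input from the question (no repeated names), here is a minimal self-contained sketch. The DataFrame construction is an assumption based on the sample shown in the question; the point is that the boolean filter simply returns an empty DataFrame instead of raising.

```python
import pandas as pd

# Sample data assumed from the question: every name occurs exactly once
df = pd.DataFrame({
    "name": ["google", "Facebook", "Tex"],
    "status": ["Active", "Active", "Active"],
})

# Per-row group size, aligned with the original index
s = df.groupby("name")["name"].transform("size")

df_single = df[s == 1]  # all rows, since each name occurs once
df_multi = df[s > 1]    # empty DataFrame, no exception is raised

print(df_multi.empty)
```

Because nothing is concatenated, there is no need to test for the existence of a group first; df_multi.empty can be checked afterwards if the empty case needs special handling.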
If you only want to filter by duplicates, it is simpler to create a boolean mask with Series.duplicated:
m = df['name'].duplicated(keep=False)
print (m)
0 False
1 False
2 True
3 True
4 False
Name: name, dtype: bool
df_single = df[~m]
df_multi = df[m]
print (df_single)
name status
0 google Active
1 Facebook Active
4 Yout Active
print (df_multi)
name status
2 Tex Active
3 Tex WUP
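If you would rather keep the original concat approach, the literal question (checking whether any group exists before concat) could be handled roughly as below. This is only a sketch: concat_groups is a hypothetical helper name, not a pandas function, and the sample data is assumed from the question.

```python
import pandas as pd

def concat_groups(df, key, keep):
    """Hypothetical helper: concatenate the groups of df (grouped by key)
    for which keep(group) is True; return an empty frame if none match."""
    parts = [g for _, g in df.groupby(key) if keep(g)]
    if not parts:
        # Nothing to concatenate: return zero rows with the same columns
        return df.iloc[0:0]
    return pd.concat(parts)

# Sample data assumed from the question: no repeated names
df = pd.DataFrame({
    "name": ["google", "Facebook", "Tex"],
    "status": ["Active", "Active", "Active"],
})

df_single = concat_groups(df, "name", lambda g: len(g) == 1)
df_multi = concat_groups(df, "name", lambda g: len(g) > 1)

print(df_multi.empty)
```

Note that concatenating groups returns rows in group-key order rather than the original row order, which is one more reason the boolean-mask solutions above are usually simpler.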
Upvotes: 2