Reputation: 679
I don't know how to create a dataframe from another dataframe based on a groupby condition. For example, I have a dataframe for which, if I apply:
flights_df.groupby(by='DepHour')['Cancelled'].value_counts()
I obtain something like this:
DepHour  Cancelled
0.0      0             20361
         1                 7
1.0      0              5857
         1                 4
2.0      0              1850
         1                 1
3.0      0               833
4.0      0              3389
         1                 1
5.0      0            148143
         1                24
As can be seen, there are no cancelled flights for DepHour == 3.0.
Using the same dataframe that produced this output, I want to create another dataframe containing only the rows whose DepHour has no cancelled flights. In this case, the result would be a dataframe containing only the rows where DepHour == 3.0.
I know that I can use a mask, but that only filters on Cancelled == 0 row by row (i.e. the non-cancelled rows of every other DepHour are included as well).
Thanks, and sorry for my bad English!
Upvotes: 0
Views: 135
Reputation: 1614
There might be a cleaner way (probably without using groupby twice), but this should work:
# Keep only the DepHour groups in which every flight has Cancelled == 0, then count again
flights_df.groupby('DepHour') \
    .filter(lambda x: (x['Cancelled'] == 0).all()) \
    .groupby('DepHour')['Cancelled'].value_counts()
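If the goal is the filtered dataframe itself rather than the counts, one possible way to avoid the second groupby is a boolean mask built with transform. The snippet below is only a sketch: the toy flights_df is made up for illustration, and group_max and no_cancel_df are hypothetical names. It also assumes Cancelled only ever holds 0 or 1.

```python
import pandas as pd

# Hypothetical toy data standing in for the real flights_df
flights_df = pd.DataFrame({
    "DepHour":   [0.0, 0.0, 0.0, 3.0, 3.0, 4.0, 4.0],
    "Cancelled": [0,   1,   0,   0,   0,   0,   1],
})

# For every row, the largest Cancelled value within its DepHour group.
# Assuming Cancelled is 0/1, a group maximum of 0 means no flight
# in that hour was cancelled.
group_max = flights_df.groupby("DepHour")["Cancelled"].transform("max")

# Keep all columns of the rows that belong to cancellation-free hours
no_cancel_df = flights_df[group_max == 0]
print(no_cancel_df)
```

With this toy data only the DepHour == 3.0 rows remain. Because transform returns a Series aligned with the original index, it can be used directly as a row mask, which is usually faster than filter with a Python lambda on large frames.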
Upvotes: 1