I have a data frame with 3000 companies covering five years.
Id Company Year Value
0 1111111 2016 NaN
1 1111111 2015 3871.0
2 3333333 2016 3989.0
3 3333333 2015 3648.0
4 4444444 2016 5456.0
5 4444444 2015 NaN
6 2222222 2016 NaN
7 2222222 2015 10.0
8 5555555 2016 1515.0
9 5555555 2015 2654.0
I would like to select only the companies that have no NaN value in any year, so that the selection has data for every period and therefore the same number of companies per period.
What is the easiest way to do this?
The result should be:
Id Company Year Value
2 3333333 2016 3989.0
3 3333333 2015 3648.0
8 5555555 2016 1515.0
9 5555555 2015 2654.0
Thanks
Upvotes: 0
Views: 1103
groupby.count() returns the number of non-null values, so if you group by company, the count for a company with complete data equals the number of years. Assuming there are no duplicate company/year rows, you can do this:
df.loc[df.groupby('Company')['Value'].transform('count') == df['Year'].nunique(), :]
Out[259]:
Id Company Year Value
2 2 3333333 2016 3989.0
3 3 3333333 2015 3648.0
8 8 5555555 2016 1515.0
9 9 5555555 2015 2654.0
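For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the count-based selection. The variable names (n_years, non_null, complete, complete_alt) are my own, and the groupby.filter variant is an alternative I am adding for comparison, not part of the approach above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (two years per company here).
df = pd.DataFrame({
    "Id": range(10),
    "Company": [1111111, 1111111, 3333333, 3333333, 4444444,
                4444444, 2222222, 2222222, 5555555, 5555555],
    "Year": [2016, 2015] * 5,
    "Value": [np.nan, 3871.0, 3989.0, 3648.0, 5456.0,
              np.nan, np.nan, 10.0, 1515.0, 2654.0],
})

# Number of distinct periods in the data (assumption: every period that
# matters appears at least once in the Year column).
n_years = df["Year"].nunique()

# Per-row count of non-null Values within that row's company.
non_null = df.groupby("Company")["Value"].transform("count")

# Keep only companies whose non-null count covers every period.
complete = df.loc[non_null == n_years]

# Alternative: keep groups that contain no NaN at all. Unlike the count
# check, this does not notice a company whose row for a year is missing.
complete_alt = df.groupby("Company").filter(lambda g: g["Value"].notna().all())

print(complete)
```

The count check also drops a company that has no row at all for some year, while the filter variant only looks at the rows that exist. Which one fits depends on whether missing years show up as NaN rows or as absent rows in the real data.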
Upvotes: 1