Reputation: 608
Is it possible to find count of nulls in last 5 records of a column of a group object in pandas and save in a new column. Lets say we have a group (00936) in a dataframe with 10 records as shown below. I don't want to count the total nulls in the column 'Obse' but only the nulls in last 5 months. And this count (i.e. 4.0) i want to fill in a new column.
code months Obse
2903 00936 2019-02 4.0
2904 00936 2019-03 2.0
2905 00936 2019-04 1.0
2906 00936 2019-05 NaN
2907 00936 2019-06 NaN
2908 00936 2019-07 2.0
2909 00936 2019-08 NaN
2910 00936 2019-09 NaN
2911 00936 2019-10 NaN
I am trying to use tail function but it won't work because it returns less records than length of group.
df['count_nulls'] = df.groupby('code').tail(5).Obse.transform(lambda group: group.isnull().sum())
How can I achieve this. below is expected output for above input dataframe
code months Obse count_nulls
2903 00936 2019-02 4.0 4.0
2904 00936 2019-03 NaN 4.0
2905 00936 2019-04 1.0 4.0
2906 00936 2019-05 NaN 4.0
2907 00936 2019-06 NaN 4.0
2908 00936 2019-07 2.0 4.0
2909 00936 2019-08 NaN 4.0
2910 00936 2019-09 NaN 4.0
2911 00936 2019-10 NaN 4.0
Upvotes: 2
Views: 1202
Reputation: 323266
How about
df['New']=df.code.map((5-df.groupby('code').tail(5).groupby('code')['Obse'].count()))
df
Out[152]:
code months Obse New
2903 936 2019-02 4.0 4
2904 936 2019-03 2.0 4
2905 936 2019-04 1.0 4
2906 936 2019-05 NaN 4
2907 936 2019-06 NaN 4
2908 936 2019-07 2.0 4
2909 936 2019-08 NaN 4
2910 936 2019-09 NaN 4
2911 936 2019-10 NaN 4
Upvotes: 3
Reputation: 150745
How about moving tail
inside the lambda function:
df['count_nulls'] = (df.groupby(['code'])['Obse']
.transform(lambda x: x.tail(5).isna().sum())
)
Output:
code months Obse count_nulls
2903 936 2019-02 4.0 4.0
2904 936 2019-03 2.0 4.0
2905 936 2019-04 1.0 4.0
2906 936 2019-05 NaN 4.0
2907 936 2019-06 NaN 4.0
2908 936 2019-07 2.0 4.0
2909 936 2019-08 NaN 4.0
2910 936 2019-09 NaN 4.0
2911 936 2019-10 NaN 4.0
Upvotes: 5