Reputation: 585
I have a dataframe with the 'Materiais' column that indicates the code of the product, the 'Values' Column, that corresponds to the product values, and 'Month' columns that corresponds to it months.
Year Month Materiais Values
0 2018 M1 52745 NaN
1 2018 M2 52745 NaN
2 2018 M3 52745 NaN
3 2018 M4 52745 NaN
4 2018 M5 52745 NaN
5 2018 M6 52745 NaN
6 2018 M7 52745 NaN
7 2018 M1 58859 NaN
8 2018 M2 58859 NaN
9 2018 M3 58859 NaN
10 2018 M4 58859 NaN
11 2018 M5 58859 300
12 2018 M6 58859 NaN
13 2018 M7 58859 NaN
14 2018 M1 57673 NaN
15 2018 M2 57673 100
16 2018 M3 57673 NaN
17 2018 M4 57673 150
1-) I would like to have in this dataframe, only products that had at least a value in one month.
So my idea is to group all the similar products codes and check if there is at least one value != NaN
.
In order to group, I am using this function:
df = df_demand.groupby(['Materiais'], sort=False, as_index=False)
2-) Filter the months that there is at least one value != NaN'
, using a similar approach:
df = df_demanda.groupby(['Month'], sort=False, as_index=False)
Upvotes: 1
Views: 680
Reputation: 5460
First, you want to get an aggregate that indicates whether that given product has a non-null value (nan's aren't "truthy"):
materiais_any_non_null = df.groupby('Materiais')['Values'].transform('any')
Then just mask your df
with that:
df[materiais_any_non_null]
Result:
Year Month Materiais Values
7 2018 M1 58859 NaN
8 2018 M2 58859 NaN
9 2018 M3 58859 NaN
10 2018 M4 58859 NaN
11 2018 M5 58859 300.0
12 2018 M6 58859 NaN
13 2018 M7 58859 NaN
14 2018 M1 57673 NaN
15 2018 M2 57673 100.0
16 2018 M3 57673 NaN
17 2018 M4 57673 150.0
Upvotes: 1
Reputation: 208
newdf = df.dropna(subset=['Values'])
Upvotes: 0