Reputation: 498
Let's say I have data like this:
import pandas as pd
df = pd.DataFrame({'category': ["blue", "red", "blue", "blue", "green"], 'val1': [5, 3, 2, 2, 5], 'val2': [1, 3, 2, 2, 5], 'val3': [2, 1, 1, 4, 3]})
print(df)
category val1 val2 val3
0 blue 5 1 2
1 red 3 3 1
2 blue 2 2 1
3 blue 2 2 4
4 green 5 5 3
How do I remove values that meet a certain condition (or replace them with, for example, NaN) without removing the entire row or shifting the columns?
Let's say my condition is that I want to remove all values below 3 from the above data. The result would have to look like this:
  category val1 val2 val3
0     blue    5
1      red    3    3
2     blue
3     blue              4
4    green    5    5    3
Upvotes: 1
Views: 688
Reputation: 61900
Use mask:
df.iloc[:, 1:] = df.iloc[:, 1:].mask(df.iloc[:, 1:] < 3)
print(df)
Output
category val1 val2 val3
0 blue 5.0 NaN NaN
1 red 3.0 3.0 NaN
2 blue NaN NaN NaN
3 blue NaN NaN 4.0
4 green 5.0 5.0 3.0
If you want to set a particular value, for example 0, do:
df.iloc[:, 1:] = df.iloc[:, 1:].mask(df.iloc[:, 1:] < 3, 0)
print(df)
Output
category val1 val2 val3
0 blue 5 0 0
1 red 3 3 0
2 blue 0 0 0
3 blue 0 0 4
4 green 5 5 3
If you just need a few columns, you could do:
df[['val1', 'val2', 'val3']] = df[['val1', 'val2', 'val3']].mask(df[['val1', 'val2', 'val3']] < 3)
print(df)
Output
category val1 val2 val3
0 blue 5.0 NaN NaN
1 red 3.0 3.0 NaN
2 blue NaN NaN NaN
3 blue NaN NaN 4.0
4 green 5.0 5.0 3.0
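If you don't want to hard-code positions or column names, a minimal sketch along the same lines is to pick the numeric columns by dtype. This assumes every numeric column should be filtered; the name num_cols is just an illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'category': ["blue", "red", "blue", "blue", "green"],
    'val1': [5, 3, 2, 2, 5],
    'val2': [1, 3, 2, 2, 5],
    'val3': [2, 1, 1, 4, 3],
})

# Select the numeric columns by dtype instead of by position or name
# (assumption: all numeric columns should be filtered).
num_cols = df.select_dtypes(include='number').columns

# mask() replaces values where the condition is True, with NaN by default.
# Assigning through df[cols] replaces those columns, so they become float
# in order to hold NaN.
df[num_cols] = df[num_cols].mask(df[num_cols] < 3)
print(df)
```

The category column is never touched because it is not part of num_cols.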
Upvotes: 4
Reputation: 1875
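You can use applymap (deprecated since pandas 2.1 in favor of DataFrame.map) or transform on the columns containing integers. Apply the boolean frame to those columns only, because indexing the whole df with it treats the missing category column as False and blanks it too, then join category back:
vals = df.iloc[:, 1:]
df[['category']].join(vals[vals.transform(lambda x: x >= 3)].fillna(''))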
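If the blanks are only needed for display, one possible alternative (a sketch, not the only way) is to keep NaN in the data and blank it out when rendering, using the na_rep parameter of to_string. The names num_cols, masked and text are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'category': ["blue", "red", "blue", "blue", "green"],
    'val1': [5, 3, 2, 2, 5],
    'val2': [1, 3, 2, 2, 5],
    'val3': [2, 1, 1, 4, 3],
})

num_cols = ['val1', 'val2', 'val3']

# Work on a copy so the original frame stays intact.
masked = df.copy()

# where() keeps values where the condition is True and puts NaN elsewhere.
masked[num_cols] = df[num_cols].where(df[num_cols] >= 3)

# Render NaN as an empty string only in the printed text;
# the columns themselves stay numeric.
text = masked.to_string(na_rep='')
print(text)
```

This way the value columns remain numeric, so later arithmetic or aggregation still works, whereas filling with '' turns them into object columns.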
Upvotes: 0
Reputation: 1749
One approach is to create a boolean mask of the values to keep, i.e. those that don't meet the removal criterion (below 3).
mask = df[['val1','val2','val3']] >= 3
You can then create a new DataFrame that holds only the kept values; everything else becomes NaN.
updated_df = df[['val1','val2','val3']][mask]
Then add the unaffected columns back in.
updated_df['category'] = df['category']
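The steps above can also be sketched in one self-contained piece. This variant uses join so that category stays the first column; that ordering is my assumption about what is wanted, and value_cols is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'category': ["blue", "red", "blue", "blue", "green"],
    'val1': [5, 3, 2, 2, 5],
    'val2': [1, 3, 2, 2, 5],
    'val3': [2, 1, 1, 4, 3],
})

value_cols = ['val1', 'val2', 'val3']

# True where a value should be kept (3 or more), False where it is removed.
mask = df[value_cols] >= 3

# where() keeps the True cells and sets the rest to NaN; joining onto the
# category column keeps the original column order and leaves df unchanged.
updated_df = df[['category']].join(df[value_cols].where(mask))
print(updated_df)
```

Since join aligns on the index, the rows line up with the original frame even if the index is not a simple 0..n range.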
Upvotes: 0