Reputation: 922
I'm somewhat new to pandas.
I have a data frame, words_df, with two columns. The first column ([[0]]) is a list of words; the second holds values that I want to use in processes further downstream.
It looks something like this:
a:State-word occurrences
0 FIRE 1535
1 BRR 1189
2 GREEN 521
3 ORANGE 504
4 PURPLE 503
5 BLUE 482
6 VIOLET 480
7 YELLOW 445
8 INDIGO 434
9 BLACK 392
10 WHITE 381
11 PINK 322
...
I have a second list of words that I want to filter against.
If a word is in my filter_list, remove the row from my words_df.
So far I've managed:
filter_list = words_df[[0]].isin(filter_list)
which I then try to apply with:
words_df[~filter_list]
It sort of works, but mostly doesn't. It looks like this on the other end:
a:State-word occurrences
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 PURPLE NaN
5 NaN NaN
6 NaN NaN
7 YELLOW NaN
8 INDIGO NaN
9 NaN NaN
10 NaN NaN
11 NaN NaN
I'd like it to look like this:
a:State-word occurrences
1 PURPLE 503
2 YELLOW 445
3 INDIGO 434
What am I doing wrong?
Upvotes: 1
Views: 591
Reputation: 21878
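You were close to the answer. Selecting the column with double brackets returns a DataFrame, so isin gives back a boolean DataFrame, and indexing with that masks individual cells (leaving NaN) instead of dropping rows. Select the column as a Series with single brackets instead: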
from pandas import DataFrame

# Test data
df = DataFrame({'a:State-word': ['FIRE', 'BRR', 'GREEN', 'ORANGE', 'PURPLE', 'BLUE', 'VIOLET'],
                'occurrences': [1535, 1189, 521, 504, 503, 482, 480]})
filter_list = ['FIRE', 'BRR', 'GREEN', 'ORANGE', 'BLUE']
df
# a:State-word occurrences
# 0 FIRE 1535
# 1 BRR 1189
# 2 GREEN 521
# 3 ORANGE 504
# 4 PURPLE 503
# 5 BLUE 482
# 6 VIOLET 480
df[~df['a:State-word'].isin(filter_list)]
# a:State-word occurrences
# 4 PURPLE 503
# 6 VIOLET 480
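To see why the original attempt produced NaN values, here is a minimal sketch that puts the two kinds of mask side by side. The sample data and the names frame_mask, row_mask, masked, kept and kept_renumbered are made up for illustration, and the column is assumed to be labelled 'a:State-word' rather than 0. The last step, reset_index(drop=True), is an optional extra for renumbering the remaining rows from 0.

```python
import pandas as pd

# Hypothetical sample data mirroring the frame in the question
words_df = pd.DataFrame({
    "a:State-word": ["FIRE", "BRR", "GREEN", "ORANGE", "PURPLE", "BLUE", "VIOLET"],
    "occurrences": [1535, 1189, 521, 504, 503, 482, 480],
})
filter_list = ["FIRE", "BRR", "GREEN", "ORANGE", "BLUE"]

# Double brackets select a one-column DataFrame, so isin returns a boolean DataFrame
frame_mask = words_df[["a:State-word"]].isin(filter_list)
# Indexing with a boolean DataFrame masks cell by cell: the shape is kept,
# and every cell without a True in the mask becomes NaN
masked = words_df[~frame_mask]

# Single brackets select a Series, so isin returns a boolean Series
row_mask = words_df["a:State-word"].isin(filter_list)
# Indexing with a boolean Series keeps or drops whole rows
kept = words_df[~row_mask]

# Optional: renumber the surviving rows starting from 0
kept_renumbered = kept.reset_index(drop=True)

print(masked)
print(kept)
print(kept_renumbered)
```

With the Series mask, the surviving rows keep their original index labels. If a fresh index is not needed, the reset_index step can simply be left out.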
Upvotes: 2