Counting words in one DataFrame column based on weights in another DataFrame

Question

I have a dataframe idf, shown below.

       feature_name idf_weights
2488    kralendijk  11.221923
3059    night       0
1383    ebebf       0

I have another dataframe, df:

     message                   Number of Words in each message  
0   night kralendijk ebebf          3

For each message in df, I want to look up the idf weight of every word in idf and store, in a new column, the number of words whose weight is greater than 0.

The expected output looks like this:

    message                   Number of Words in each message   Number of words with idf_score>0
0   night kralendijk ebebf                 3                     1

Here is what I've tried so far, but it gives the total number of words instead of the number of words with idf_weights > 0:

words_weights = dict(idf[['feature_name', 'idf_weights']].values)
df['> zero'] = df['message'].apply(lambda x: count([words_weights.get(word, 11.221923) for word in x.split()]))

Actual output:

     message                   Number of Words in each message   Number of words with idf_score>0
0   night kralendijk ebebf                 3                     3

Thank you.

mozway · Accepted Answer

Try using a list comprehension:

# set up a dictionary for easy feature->weight indexing
d = idf.set_index('feature_name')['idf_weights'].to_dict()
# {'kralendijk': 11.221923, 'night': 0.0, 'ebebf': 0.0}

df['> zero'] = [sum(d.get(w, 0)>0 for w in x.split()) for x in df['message']]

## OR, slightly faster alternative
# df['> zero'] = [sum(1 for w in x.split() if d.get(w, 0)>0) for x in df['message']]

output:

                  message  Number of Words in each message  > zero
0  night kralendijk ebebf                                3       1
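
For reference, here is a self-contained sketch that rebuilds the sample data from the question and applies the list comprehension above. The attempt in the question appears to build a list with one entry per word, so counting that list gives the word count no matter what the weights are; summing a `> 0` test per word counts only the matching words. The second column, `> zero (explode)`, is an alternative that is not part of the original answer: it is an assumption that you might prefer a pandas-only pipeline, and the column name is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
idf = pd.DataFrame(
    {
        "feature_name": ["kralendijk", "night", "ebebf"],
        "idf_weights": [11.221923, 0.0, 0.0],
    },
    index=[2488, 3059, 1383],
)
df = pd.DataFrame(
    {
        "message": ["night kralendijk ebebf"],
        "Number of Words in each message": [3],
    }
)

# Dictionary for feature -> weight lookup
d = idf.set_index("feature_name")["idf_weights"].to_dict()

# List comprehension from the answer: each comparison is True/False,
# and sum() counts the True values. Unknown words default to weight 0.
df["> zero"] = [sum(d.get(w, 0) > 0 for w in x.split()) for x in df["message"]]

# Alternative sketch (not from the answer; column name is illustrative):
# one row per word, map each word to its weight, then count per message.
words = df["message"].str.split().explode()
df["> zero (explode)"] = words.map(d).fillna(0).gt(0).groupby(level=0).sum()

print(df)
```

Both columns should agree; on short messages the list comprehension is likely the simpler choice, while the explode variant keeps everything in pandas operations.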
