dj2560

Reputation: 75

Update values in new column

I want to use the RAKE package to extract keyphrases from comments (df['CUSTOMER_RECOMMENDATIONS_TRANS']) and store them in a new column (df['keyphrase_RAKE']), one list of keyphrases per comment. I'm getting the error "ValueError: Length of values does not match the length of index". I know the reason behind the error but don't know how to fix it. What can be done?

rake_object.run returns a list of keyphrases, which I store in keywords.

This is the code:

import pandas as pd
import RAKE
import operator

# RAKE setup with the stopword list file
stop_dir = "SmartStoplist.txt"
rake_object = RAKE.Rake(stop_dir)

# Load the comments to run RAKE on
df = pd.read_excel('my.xlsx')

for i in df['CUSTOMER_RECOMMENDATIONS_TRANS']:
    keywords = rake_object.run(i)
    df['keyphrase_RAKE'] = keywords

Upvotes: 0

Views: 61

Answers (1)

Shijith

Reputation: 4882

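You can use pandas.Series.apply and avoid the for loop. In the loop, each iteration assigns one comment's keyphrase list to the whole column, so pandas raises the ValueError whenever that list's length differs from the number of rows. apply instead calls rake_object.run once per comment and stores each result in its own row: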

df['keyphrase_RAKE'] = df['CUSTOMER_RECOMMENDATIONS_TRANS'].apply(rake_object.run)
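To see the row-by-row behaviour without needing RAKE or the Excel file, here is a minimal sketch. The extract_keyphrases function is a hypothetical stand-in for rake_object.run, and the sample comments are made up; it assumes the real call returns a list of (phrase, score) pairs, which is worth checking against your RAKE version.

```python
import pandas as pd


def extract_keyphrases(text):
    # Hypothetical stand-in for rake_object.run (an assumption, not RAKE itself):
    # splits on commas and returns a list of (phrase, word_count) pairs.
    phrases = [p.strip() for p in str(text).split(",") if p.strip()]
    return [(p, float(len(p.split()))) for p in phrases]


# Made-up sample data in place of pd.read_excel('my.xlsx')
df = pd.DataFrame({
    "CUSTOMER_RECOMMENDATIONS_TRANS": [
        "faster delivery, better packaging",
        "lower prices",
        "more payment options, friendlier support, longer opening hours",
    ]
})

# apply calls the function once per row and stores each returned list
# in the matching row, so the new column always has len(df) entries.
df["keyphrase_RAKE"] = df["CUSTOMER_RECOMMENDATIONS_TRANS"].apply(extract_keyphrases)

# Optional: keep only the phrases and drop the scores.
df["keyphrase_only"] = df["keyphrase_RAKE"].apply(
    lambda pairs: [phrase for phrase, _ in pairs]
)

print(df[["CUSTOMER_RECOMMENDATIONS_TRANS", "keyphrase_only"]])
```

Each row ends up holding a list of a different length, which is fine because the column stores one Python object per row. If some comments are empty cells, they may need to be filled or skipped before calling the real rake_object.run.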

Upvotes: 2
