Reputation: 75
I want to run a package (RAKE) to extract keyphrases from comments (df['CUSTOMER_RECOMMENDATIONS_TRANS']) and create a new column (df['keyphrase_RAKE']) that stores the keyphrases for each comment. I'm getting an error saying "ValueError: Length of values does not match the length of index". I understand why the error occurs but don't know how to fix it. What can be done?
keywords is a list of keyphrases.
import RAKE
import operator
import pandas as pd

# RAKE setup with the stopword list file
stop_dir = "SmartStoplist.txt"
rake_object = RAKE.Rake(stop_dir)

# Load the comments to run RAKE on
df = pd.read_excel('my.xlsx')

for i in df['CUSTOMER_RECOMMENDATIONS_TRANS']:
    keywords = rake_object.run(i)
    # This assignment raises the ValueError
    df['keyphrase_RAKE'] = keywords
Upvotes: 0
Views: 61
Reputation: 4882
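You can use pandas' apply (Series.apply here, since a single column is selected) and avoid the for loop. It calls rake_object.run once per comment and stores each returned list in the matching row, so the lengths always agree: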
df['keyphrase_RAKE'] = df['CUSTOMER_RECOMMENDATIONS_TRANS'].apply(rake_object.run)
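If you want to try the pattern without the stoplist file, here is a minimal sketch. The extract_keyphrases function is a hypothetical stand-in for rake_object.run (it is not RAKE's algorithm), and the sample comments are made up. It also assumes some cells may be empty, in which case it returns an empty list instead of passing a non-string value on.

```python
import pandas as pd


def extract_keyphrases(text):
    """Hypothetical stand-in for rake_object.run: one list of (phrase, score) per comment."""
    # Empty Excel cells arrive as NaN/None, so skip anything that is not a string.
    if not isinstance(text, str):
        return []
    words = [w.strip(".,!?").lower() for w in text.split()]
    # Toy scoring (word length) just to produce some output; not RAKE's scoring.
    return [(w, float(len(w))) for w in words if len(w) > 4]


# Made-up sample data in place of pd.read_excel('my.xlsx')
df = pd.DataFrame(
    {"CUSTOMER_RECOMMENDATIONS_TRANS": ["Faster delivery please", None, "Better support"]}
)

# apply calls the function once per row, so the new column always has len(df) entries.
df["keyphrase_RAKE"] = df["CUSTOMER_RECOMMENDATIONS_TRANS"].apply(extract_keyphrases)
print(df)
```

With the real package you would pass rake_object.run (or a small wrapper around it that handles empty cells) in place of extract_keyphrases.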
Upvotes: 2