Reputation: 23
I have a dataframe with four columns: [keys, summary, description, summary_description]. I am working with summary_description, trying to apply a regex and store the result in a [New_column]. I wrote a loop, but it does not work and I get an error, and I am not sure what the problem is. If anyone could help, I would really appreciate it.
import pandas as pd
import re
dataf= pd.read_excel(r'C:\Users\malotaibi\Desktop\Last update\result.xlsx')
dataf
dataf.head(5)
dataf['New_Column'][i] = re.sub('[^A-Za-z0-9]+', ' ', dataf['Summary_Description'][i])
print (dataf['New_column'][i])
Error:
KeyError: 'New_Column'
Upvotes: 1
Views: 59
Reputation: 1680
You can do it like this:
dataf['New_Column'] = dataf['Summary_Description'].str.replace('[^A-Za-z0-9]+', ' ', regex=True)  # regex=True is required in pandas 2.0+, where the default is a literal match
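A minimal sketch of the same idea on made-up data, since the original spreadsheet is not available. The sample rows are hypothetical; only the column names come from the question. Since pandas 2.0, `str.replace` treats the pattern as a literal string unless `regex=True` is passed, so the flag is set explicitly here.

```python
import pandas as pd

# Hypothetical sample rows standing in for result.xlsx
dataf = pd.DataFrame({
    "Summary_Description": [
        "Login fails!! error #42",
        "Update: user-profile page",
    ]
})

# Replace every run of non-alphanumeric characters with a single space.
# Assigning a whole Series creates New_Column, so no loop is needed.
dataf["New_Column"] = dataf["Summary_Description"].str.replace(
    r"[^A-Za-z0-9]+", " ", regex=True
)

print(dataf["New_Column"].tolist())
```

Because the assignment targets the whole column at once, the KeyError from the question cannot occur here.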
Upvotes: 1
Reputation: 1305
You tried to assign to the 'New_Column' key before creating it, which is why pandas raises the KeyError. Create the column first:
import pandas as pd
import re
dataf= pd.read_excel(r'C:\Users\malotaibi\Desktop\Last update\result.xlsx')
dataf
dataf.head(5)
dataf['New_Column'] = ''  # this creates the New_Column entry and sets all its values to an empty string
You can now loop through this and set each value to what you want. I assume you're going for something like:
for i in range(len(dataf['Summary_Description'])):
    # .loc writes into dataf itself; chained indexing like dataf['New_Column'][i] may write to a copy
    dataf.loc[i, 'New_Column'] = re.sub('[^A-Za-z0-9]+', ' ', dataf['Summary_Description'][i])
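To illustrate why the column has to exist first: looking up `dataf['New_Column']` on a frame that has no such column raises `KeyError`, whereas assigning a whole Series to `dataf['New_Column']` creates it. The sketch below uses hypothetical sample rows, and the helper name `clean` is made up for illustration. It uses `Series.apply` with `re.sub` as one possible alternative to the explicit loop.

```python
import re
import pandas as pd

# Hypothetical sample rows standing in for result.xlsx
dataf = pd.DataFrame({
    "Summary_Description": ["Crash on start-up", "Add CSV/Excel export"]
})

# Reproduce the error: the lookup dataf["New_Column"] fails before any
# assignment happens, because the column does not exist yet.
missing = None
try:
    dataf["New_Column"][0] = "x"
except KeyError as err:
    missing = err.args[0]
    print("KeyError:", missing)


def clean(text):
    """Collapse each run of non-alphanumeric characters into one space."""
    return re.sub(r"[^A-Za-z0-9]+", " ", text)


# Assigning a full Series creates the column in one step.
dataf["New_Column"] = dataf["Summary_Description"].apply(clean)
print(dataf)
```

This assumes every cell in Summary_Description is a string; empty cells read from Excel come back as NaN and would need handling first.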
Upvotes: 0