TypeError: expected string or bytes-like object Regular expression removing special characters

Question

I have a dataframe train with a content column. Each row of the content column holds a list of words.

content
[sure, tune, …, watch, donald, trump, “,”, late, ’ , night]
[abc, xyz, “,”,late, ’, night]

Code to remove special characters

import re
train['content'] = train['content'].map(lambda x: re.sub(r'\W+', '', x))

Error

TypeError: expected string or bytes-like object

Expected output

content
[sure, tune,  watch, donald, trump, late,   night]
[abc, xyz,late, night]

Notice all the special characters like ..., “, ” and ’ are gone and we are left only with words.

ztepler · Accepted Answer

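You are applying a regular expression to a list object, but re.sub expects a string (or bytes-like object). Each row of content is a list, so passing it to re.sub directly raises the TypeError.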
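A minimal sketch of the failure, outside pandas. The row below is a made-up stand-in for one entry of the content column, and the exact wording of the message may vary slightly between Python versions:

```python
import re

# Hypothetical stand-in for a single row of train['content'].
row = ["abc", "xyz", "late", "’", "night"]

# Passing the whole list to re.sub fails, because it only accepts
# a string or a bytes-like object.
try:
    re.sub(r"\W+", "", row)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Passing a single string from the list works as expected.
cleaned_item = re.sub(r"\W+", "", row[0])
print(cleaned_item)
```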

If your goal is to use this regex on every item of the list, you can apply re.sub to each item in the list:

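import re

def replace_func(item):
    # Strip every run of non-word characters from a single string.
    # An item made up only of special characters becomes an empty string.
    return re.sub(r'\W+', '', item)

# Each row is a list, so apply replace_func to every item in that list.
train['content'] = train['content'].map(lambda x: [replace_func(item) for item in x])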
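Note that an item consisting only of special characters turns into an empty string and stays in the list. To match the expected output in the question, where those items disappear entirely, the empty strings can be filtered out afterwards. A minimal sketch, assuming every row is a list of strings; clean_tokens is a hypothetical helper name and the sample rows are illustrative, not the asker's real data:

```python
import re

def clean_tokens(tokens):
    """Strip non-word characters from each token and drop tokens left empty."""
    stripped = (re.sub(r"\W+", "", token) for token in tokens)
    return [token for token in stripped if token]

# Illustrative rows modelled on the question's example.
rows = [
    ["sure", "tune", "…", "watch", "donald", "trump", "“", "”", "late", "’", "night"],
    ["abc", "xyz", "“", "”", "late", "’", "night"],
]

cleaned = [clean_tokens(row) for row in rows]
for row in cleaned:
    print(row)
```

On the dataframe this would be applied as train['content'] = train['content'].map(clean_tokens).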
