SAPNONEXPERT

Reputation: 59

Why does Python tell me **TypeError: unhashable type: 'list'** when I have a dataframe?

I have the following dataframe and a second, similar one that I want to compare. I think I am mixing up data types somewhere:

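import nltk
import pandas as pd

# stop_words is a collection of words to drop; it is defined elsewhere.
df1 = pd.read_csv("csv", delimiter=';', header=None, skiprows=1, names=['1', '2'])
df1['1'] = df1['1'].str.replace(r'[^\w\s]+', '', regex=True)  # strip punctuation
df1['1'] = df1['1'].str.replace(r'\d+', '', regex=True)  # strip digits
df = df1['1'].apply(nltk.word_tokenize)  # each row becomes a list of tokens
df = df.apply(lambda x: [item.lower() for item in x if item.lower() not in stop_words])
df = set(df)  # the error is raised here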

TypeError: unhashable type: 'list'

Upvotes: 0

Views: 135

Answers (1)

mcsoini

Reputation: 6642

In your second-to-last line you generate a Series of lists, and in the last line you convert that Series to a set. That fails because the elements of a set must be hashable, and lists are not (they are mutable), which is exactly what the TypeError says. Tuples, in contrast, are hashable as long as their elements are. Assuming that the rest of your code works (I have no way of checking), try

df = df.apply(lambda x: tuple(item.lower() for item in x if item.lower() not in stop_words))
df = set(df)
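
To see the difference without the CSV or nltk, here is a minimal sketch in plain Python. The sample tokens and the `stop_words` set are made up for illustration, and a plain list of rows stands in for the Series, since `set()` iterates over both in the same way.

```python
# Hypothetical data: each row is a list of tokens, as after tokenizing.
stop_words = {"the", "a"}
rows = [["The", "Cat"], ["A", "Dog"], ["the", "cat"]]

# Lists as elements: building the set fails because lists are unhashable.
as_lists = [[t.lower() for t in row if t.lower() not in stop_words] for row in rows]
try:
    set(as_lists)
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# Tuples as elements: hashable, so the set is built and duplicate rows collapse.
as_tuples = [tuple(t.lower() for t in row if t.lower() not in stop_words) for row in rows]
unique_rows = set(as_tuples)

print(list_error)
print(sorted(unique_rows))  # [('cat',), ('dog',)]
```

Note that the set also removes duplicate rows and does not keep the original row order, which may or may not be what you want when comparing the two dataframes.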

Upvotes: 1
