Reputation: 1683
I am having trouble combining the values from 10 columns into one set. I wanted to use a set because each column has repeated values, and I am looking to get a list of all of the values (medical codes) without repeating any of them. I was able to make an initial set out of the first column, but when I try to add other columns I get an "unhashable type" error.
Here is my code:
data_sorted = data.fillna(0).sort_values(['PAT_ID', 'VISIT_NO'])
set_ICD1 = set(data_sorted['ICD_1'].unique())
print(len(set_ICD1))
set_ICD = set_ICD1.add(data_sorted['ICD_2'])
print(len(set_ICD))
Here is the error I get with this:
11586 # (not part of the error this is the length of the initial set)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-e3966ec54661> in <module>()
1 set_ICD1 = set(data_sorted['ICD_1'].unique())
2 print(len(set_ICD1))
----> 3 set_ICD = set_ICD1.add(data_sorted['ICD_2'].unique())
4
5 print(len(set_ICD))
TypeError: unhashable type: 'numpy.ndarray'
Any advice or tips how to fix this would be greatly appreciated!
Upvotes: 1
Views: 1318
Reputation: 152607
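If you want to add multiple elements to a set at once, you need to use the update method instead of add. add treats its argument as a single element, which is why passing a whole array fails with "unhashable type". Both methods modify the set in place and return None, so don't assign the result to a new variable: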
set_ICD1.update(data_sorted['ICD_2'])
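To make the difference concrete, here is a minimal sketch using plain Python containers. The codes and variable names are made up for illustration and are not taken from the question's data.

```python
# Hypothetical example values, not real data from the question.
codes = {"A01", "B02"}
new_values = ["B02", "C03", "C03"]

# add() tries to insert the whole list as ONE element. Lists (like NumPy
# arrays) are unhashable, so this raises TypeError.
try:
    codes.add(new_values)
except TypeError as exc:
    error_message = str(exc)

# update() iterates over its argument and inserts each item separately.
# It changes the set in place and returns None.
result = codes.update(new_values)

print(error_message)
print(result)
print(sorted(codes))
```

Because update returns None, writing something like `set_ICD = set_ICD1.update(...)` would leave `set_ICD` as None; keep using the original set instead.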
In case it's a NumPy array, you should probably use ravel() (which flattens it if it's n-dimensional) and tolist() (for performance):
set_ICD1.update(data_sorted['ICD_2'].ravel().tolist())
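For the original goal of collecting codes from several columns, the same idea can be applied in a loop. This is only a sketch under assumptions: the small DataFrame, the three column names, and the use of dropna() (instead of the question's fillna(0), which would put 0 into the set) are illustrative choices, not the asker's actual setup.

```python
import pandas as pd

# Hypothetical stand-in for the question's data; only three columns shown.
data = pd.DataFrame({
    "ICD_1": ["A01", "B02", None],
    "ICD_2": ["B02", "C03", "C03"],
    "ICD_3": [None, "A01", "D04"],
})

icd_columns = ["ICD_1", "ICD_2", "ICD_3"]

all_codes = set()
for column in icd_columns:
    # Drop missing values, reduce to unique values, then convert to a
    # plain Python list before updating the set in place.
    all_codes.update(data[column].dropna().unique().tolist())

print(len(all_codes))
print(sorted(all_codes))
```

With the real data, `icd_columns` would list all 10 column names, and the final set holds each code exactly once.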
Upvotes: 3