Reputation: 363
I have a dataframe that looks like this:
id points
a [c,v,b,n]
b []
c [x,a]
....
and a dictionary (I also have it as a dataframe):
{'a': ['j','c'],
'b': ['p','r','q'],
'c': ['n','k','l','x','a'],
....}
For each id, I want to check whether the items in the dataframe's points are contained in the dictionary's list for the same key, and then remove the items from the dataframe's points that have no match in the dictionary. Expected output:
id points
a [c]
b []
c [x,a]
I tried this:
for key, point in my_dict.items():
    if df['points'].str.contains(point).any():
but I get TypeError: unhashable type: 'list'
I tried converting the dataframe to a dictionary, but then the search takes too long because I need more for loops. Any suggestions for code or data structure improvements?
Edit
Another representation of the data:
id points
a [c,v,b,n]
b []
c [x,a]
....
and
points
j,c
p,r,q
n,k,l,x,a
Upvotes: 1
Views: 215
Reputation: 394179
You can call apply, convert each of your dict values into a set, and convert the intersection with the row's points back to a list:
In [15]:
d={'a': ['j','c'],
'b': ['p','r','q'],
'c': ['n','k','l','x','a']}
d
Out[15]:
{'a': ['j', 'c'], 'b': ['p', 'r', 'q'], 'c': ['n', 'k', 'l', 'x', 'a']}
In [17]:
df['points'] = df.apply(lambda row: list(set(d[row['id']]).intersection(row['points'])), axis=1)
df
Out[17]:
id points
0 a [c]
1 b []
2 c [a, x]
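Note that a set intersection does not keep the original order of the points, which is why the last row shows [a, x] rather than [x, a]. If the order matters, something along these lines might work. This is only a sketch: the question does not show how df is built, so the construction below is an assumption, and the name allowed is just an illustrative helper.

```python
import pandas as pd

# Assumed reconstruction of the question's data; the real df may be built differently.
df = pd.DataFrame({
    'id': ['a', 'b', 'c'],
    'points': [['c', 'v', 'b', 'n'], [], ['x', 'a']],
})
d = {'a': ['j', 'c'],
     'b': ['p', 'r', 'q'],
     'c': ['n', 'k', 'l', 'x', 'a']}

# Convert each dict value to a set once, so membership tests are cheap.
allowed = {key: set(values) for key, values in d.items()}

# Keep only the points listed under the row's id, in their original order.
# An id that is missing from the dict is treated as having no allowed points.
df['points'] = [
    [p for p in points if p in allowed.get(row_id, set())]
    for row_id, points in zip(df['id'], df['points'])
]
print(df)
```

Building the sets once up front and iterating with zip also avoids calling a Python lambda per row through apply, which may help if the dataframe is large.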
As to why you get an error: you're calling a .str method on a Series whose elements are lists, not strings, and you're passing a list (point) as the pattern, whereas str.contains expects a string pattern.
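To confirm what the column actually holds, and to test membership without going through .str, a small check like the following could be used. It is a sketch on assumed sample data matching the question; element_types and has_c are illustrative names only.

```python
import pandas as pd

# Assumed sample data matching the question's 'points' column.
df = pd.DataFrame({
    'id': ['a', 'b', 'c'],
    'points': [['c', 'v', 'b', 'n'], [], ['x', 'a']],
})

# Each element is a Python list, not a string.
element_types = set(df['points'].map(type))
print(element_types)

# Membership in a list is tested per row with plain Python,
# rather than with a string method.
has_c = df['points'].map(lambda points: 'c' in points)
print(has_c.tolist())
```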
Upvotes: 1