Reputation: 93
I have two DataFrames, and one of them contains lists.
in [00]: table01
out[00]:
a b
0 1 2
1 2 3
in [01]: table02
out[01]:
a b
0 [2] [3]
1 [1,2] [1,2]
Now I want to compare the two tables. If an element of table01 is contained in the list at the same position in table02, the result should be True, otherwise False. So the table I want is:
a b
0 False False
1 True False
I tried table01 in table02 but got an error message: 'DataFrame' objects are mutable, thus they cannot be hashed.
Please share the correct solution of this problem with me. Thanks a lot!
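For anyone reproducing this, here is a minimal sketch of the two tables, assuming the column names a and b shown above and plain Python lists as the cell values in table02. It also shows where the error comes from: `x in frame` looks x up among the frame's column labels, which requires hashing x, and a DataFrame is not hashable. The `raised` flag is only there for illustration.

```python
import pandas as pd

# Rebuild the two tables from the question (assumed column names: a, b).
table01 = pd.DataFrame({"a": [1, 2], "b": [2, 3]})
table02 = pd.DataFrame({"a": [[2], [1, 2]], "b": [[3], [1, 2]]})

# `table01 in table02` is a lookup among table02's column labels,
# so pandas tries to hash table01 and fails.
try:
    table01 in table02
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```

So the `in` operator never compares cell by cell here; the membership test has to be applied to each pair of cells explicitly.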
Upvotes: 0
Views: 2295
Reputation: 323226
Try this
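# Reshape df1 to long format: one row per (index, column) cell.
df = pd.melt(df1.reset_index(), 'index')
# Add the matching list from df2 for each cell.
df['v2'] = pd.melt(df2.reset_index(), 'index').value
# Membership test, row by row.
df['BOOL'] = df.apply(lambda x: x.value in x.v2, axis=1)
# Reshape back to the original layout (keyword arguments are required in recent pandas).
df.pivot(index='index', columns='variable', values='BOOL')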
Out[491]:
variable a b
index
0 False False
1 True False
Finally:
df1.apply(lambda x: [x.loc[y] in df2.loc[y, x.name] for y in x.index])
Out[668]:
a b
0 False False
1 True False
Upvotes: 1
Reputation: 402253
Using sets and df.applymap:
df3 = df1.applymap(lambda x: {x})
df4 = df2.applymap(set)
df3 & df4
a b
0 {} {}
1 {2} {}
(df3 & df4).astype(bool)
a b
0 False False
1 True False
user3847943's solution is a good alternative, but could be improved using a set membership test.
import numpy as np

def find_in_array(a, b):
    return a in b

# Convert each list to a set (this modifies df2 in place).
for c in df2.columns:
    df2[c] = df2[c].map(set)

vfunc = np.vectorize(find_in_array)
df = pd.DataFrame(vfunc(df1, df2), index=df1.index, columns=df1.columns)
df
a b
0 False False
1 True False
Upvotes: 4
Reputation: 31
You can do this with numpy.vectorize. Sample code below:
import numpy as np
import pandas as pd
t1 = pd.DataFrame([[1, 2],[2,3]])
t2 = pd.DataFrame([[[2],[3]],[[1,2],[1,2]]])
def find_in_array(a, b):
    return a in b
vfunc = np.vectorize(find_in_array)
print(vfunc(t1, t2))
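The NumPy documentation notes that vectorize is provided mainly for convenience and is essentially a for loop, so a plain-Python version should behave the same way. Below is a minimal sketch of that idea which also keeps the index and column labels. The helper name `contains_cellwise` is hypothetical, and the sketch assumes both frames have the same shape, the same column names, and rows in the same order.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2], "b": [2, 3]})
df2 = pd.DataFrame({"a": [[2], [1, 2]], "b": [[3], [1, 2]]})

def contains_cellwise(values, containers):
    """Hypothetical helper: test each cell of `values` for membership
    in the cell at the same position in `containers`."""
    data = {
        col: [v in c for v, c in zip(values[col], containers[col])]
        for col in values.columns
    }
    # Reuse the original index so the result lines up with `values`.
    return pd.DataFrame(data, index=values.index)

result = contains_cellwise(df1, df2)
print(result)
```

Pairing the cells with zip matches them by position rather than by index label, which is why the rows need to be in the same order in both frames.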
Upvotes: 1