J. Doe

Reputation: 483

Return True only if every value in a list is absent from a dataframe column

input data a:

obj  number
1    111
2    222
3    333
4    555

input data b:

obj  number
1    111
2    222
3    333
4    444

input data c:

obj  number
1    777
2    222
3    333
4    888

expected output data:

false
true
false

tried:

~set([111,444]).issubset(set(df_tmp['wahlnummer']))
not set([111,444]).issubset(set(df_tmp['wahlnummer']))
([111,444] not in df_tmp['wahlnummer'])

actual output a:

-2
-1
-1

actual output b:

false
true
true

actual output c:

unhashable type: 'list'

However, most of these attempts detect the case where one of the two values is missing from the dataframe column, but not the case where both are missing. There should be some way to use an or operator here.

Only return true if none of the values is in any row of the dataframe

If I use 111 or 433 not in df, every dataframe is reported as not containing the values, regardless of whether it contains both, one, or none of them.
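A possible explanation for the three results above, as a minimal sketch rather than a definitive diagnosis. It assumes a Series named s that stands in for the column. The bitwise ~ applied to a Python bool produces the integers -2 and -1 instead of a negated boolean, the in operator on a Series tests index labels rather than values (which would also be consistent with the unhashable-list error, since a list cannot be used as a label), and 111 or 433 not in s is grouped as 111 or (433 not in s).

```python
import pandas as pd

# Hypothetical stand-in for the dataframe column from "input data a".
s = pd.Series([111, 222, 333, 555], name="number")

# `~` is integer bitwise NOT, not logical negation.
subset = {111, 444}.issubset(set(s))   # False, because 444 is missing
as_int = ~int(subset)                  # -1, the same value `~False` yields
negated = not subset                   # True, the intended logical negation

# `in` on a Series tests the index labels (0..3 here), not the values.
in_index = 111 in s                    # False
in_values = 111 in s.values            # True

# Parsed as `111 or (433 not in s)`; 111 is truthy, so the expression
# short-circuits to 111 whatever the data contains.
short_circuit = 111 or 433 not in s

print(as_int, negated, in_index, in_values, short_circuit)
```

So not is the right negation for a plain bool, and membership has to be tested against the values (or a set built from them) rather than against the Series itself.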

Edit 2: MCVE:

import pandas as pd

df_a = pd.DataFrame({'number': [111, 222, 333, 555]})
df_b = pd.DataFrame({'number': [111, 222, 333, 444]})
df_c = pd.DataFrame({'number': [777, 222, 333, 888]})
print(df_a)
print(df_b)
print(df_c)


print(not(set([111,444]).issubset(set(df_a['number']))))
print(not(set([111,444]).issubset(set(df_b['number']))))
print(not(set([111,444]).issubset(set(df_c['number']))))

result of this:

True
False
True

Upvotes: 1

Views: 1576

Answers (3)

jezrael

Reputation: 863166

Use set.isdisjoint:

Return True if the set has no elements in common with other. Sets are disjoint if and only if their intersection is the empty set.

print(set([111, 444]).isdisjoint(set(df_a['number'])))
False
print(set([111, 444]).isdisjoint(set(df_b['number'])))
False
print(set([111, 444]).isdisjoint(set(df_c['number'])))
True
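The same check can also be written with pandas alone. This is a minimal sketch, assuming the three dataframes from the question's MCVE; none_present is a hypothetical helper name, not part of pandas. Series.isin marks each row whose value appears in the list, so the list is entirely absent exactly when no row is marked.

```python
import pandas as pd

# Dataframes copied from the question's MCVE.
df_a = pd.DataFrame({"number": [111, 222, 333, 555]})
df_b = pd.DataFrame({"number": [111, 222, 333, 444]})
df_c = pd.DataFrame({"number": [777, 222, 333, 888]})

values = [111, 444]


def none_present(df, column, values):
    """Return True only if no element of `values` occurs in df[column]."""
    # isin() gives one boolean per row; any() is True if at least one row matched.
    return not df[column].isin(values).any()


results = [none_present(df, "number", values) for df in (df_a, df_b, df_c)]
print(results)  # [False, False, True]
```

This gives the same three answers as isdisjoint, without building a set from the whole column.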

Upvotes: 0

Joe

Reputation: 889

Since you are comparing the three dataframes row by row, you can add the columns in question to one of them and do the comparison there, creating a new column for the result using np.where().

>>> import numpy as np
>>> import pandas as pd
>>> df1 = pd.DataFrame({'obj':[1,2,3,4], 'number':[111,222,333,555]})
>>> df2 = pd.DataFrame({'obj':[1,2,3,4], 'number':[111,222,333,444]})
>>> df3 = pd.DataFrame({'obj':[1,2,3,4], 'number':[777,222,333,888]})
>>> df1
   obj  number
0    1     111
1    2     222
2    3     333
3    4     555
>>> df2
   obj  number
0    1     111
1    2     222
2    3     333
3    4     444
>>> df3
   obj  number
0    1     777
1    2     222
2    3     333
3    4     888

Creating the columns:

>>> df1['num from df2'] = df2['number']
>>> df1['num from df3'] = df3['number']
>>> df1
   obj  number  num from df2  num from df3
0    1     111           111           777
1    2     222           222           222
2    3     333           333           333
3    4     555           444           888

Now do the comparison using np.where(). I believe you need every condition to hold for the result to be True, so we'll use &:

>>> df1['Condition Result'] = np.where((df1['number'] == df1['num from df2']) & (df1['number'] == df1['num from df3']), True, False)
>>> df1
   obj  number  num from df2  num from df3  Condition Result
0    1     111           111           777             False
1    2     222           222           222              True
2    3     333           333           333              True
3    4     555           444           888             False

Let me know if this helps :)).

Upvotes: 1

Andrew Lavers

Reputation: 4378

from io import StringIO

import pandas as pd

df = pd.read_fwf(StringIO("""obj  number
1    433
2    342
3    111
4    345"""))

values1 = [111, 433]
values2 = [111, 433, 222]

# True only if every value occurs somewhere in the column
print(all(any(df['number'] == v) for v in values1))
print(all(any(df['number'] == v) for v in values2))

Output:

True
False
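Note that the check above asks whether all of the values are present. The question distinguishes three cases (all, some, none), so a sketch that reports which case applies may be useful. presence is a hypothetical helper name, and the third call uses made-up values (222 and 999) purely for illustration.

```python
import pandas as pd


def presence(column, values):
    """Classify how many of `values` occur in `column`: 'all', 'some' or 'none'."""
    wanted = set(values)
    # tolist() converts the column to plain Python ints before intersecting.
    found = wanted & set(column.tolist())
    if not found:
        return "none"
    return "all" if found == wanted else "some"


# Same data as the read_fwf example above.
df = pd.DataFrame({"number": [433, 342, 111, 345]})

print(presence(df["number"], [111, 433]))       # all
print(presence(df["number"], [111, 433, 222]))  # some
print(presence(df["number"], [222, 999]))       # none
```

The "none" case is the one the question asks for; it is equivalent to an empty intersection between the value list and the column.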

Upvotes: 0
