Reputation: 2219
Is there a more sophisticated way to check if a dataframe df contains 2 columns named Column 1 and Column 2:
# list(...) is needed on Python 3, where map() returns a lazy iterator
if numpy.all(list(map(lambda c: c in df.columns, ['Column 1', 'Column 2']))):
    do_something()
Upvotes: 12
Views: 7118
Reputation: 431
To check that every item of a list exists among a dataframe's columns, while still using isin, you can do the following:
col_list = ['A', 'B']
pd.Index(col_list).isin(df.columns).all()
As explained in the accepted answer, .all() checks whether all items in col_list are present in the columns, while .any() tests for the presence of any of them.
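A minimal sketch of the two checks side by side, assuming a small made-up frame (the data and the names mask, all_present and any_present are only for illustration):

```python
import pandas as pd

# Hypothetical frame and column list, for illustration only.
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
col_list = ['A', 'Z']

# One boolean per entry of col_list: is that name among df's columns?
mask = pd.Index(col_list).isin(df.columns)

all_present = bool(mask.all())  # every name in col_list is a column of df
any_present = bool(mask.any())  # at least one name in col_list is a column of df

print(mask.tolist(), all_present, any_present)
```

Here 'A' is a column and 'Z' is not, so the .all() check fails while the .any() check passes.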
Upvotes: 0
Reputation: 201
I know it's an old post...
From this answer:
if set(['Column 1', 'Column 2']).issubset(df.columns):
    do_something()
or, a little more elegantly:
if {'Column 1', 'Column 2'}.issubset(df.columns):
    do_something()
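If you also want to know which columns are missing, the same set idea extends to a set difference. This is only a sketch: missing_columns is a hypothetical helper, not a pandas function, and the frame is made up.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the names in `required` that are not columns of `df`, sorted."""
    return sorted(set(required) - set(df.columns))

# Hypothetical frame that lacks 'Column 2'.
df = pd.DataFrame({'Column 1': [1], 'Column 3': [2]})

missing = missing_columns(df, ['Column 1', 'Column 2'])
has_all = not missing  # True only when nothing is missing

print(missing, has_all)
```

An empty result means the issubset check above would have returned True.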
Upvotes: 20
Reputation: 8966
The one issue with the given answer (and maybe it works for the OP) is that it tests to see if all of the dataframe's columns are in a given list - but not that all of the given list's items are in the dataframe columns.
My solution was:
test = all([ i in df.columns for i in ['A', 'B'] ])
where test is a simple True or False.
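To make the difference in direction concrete, here is a small sketch with a made-up frame that has one more column than the list (the names wanted, frame_in_list and list_in_frame are illustrative):

```python
import pandas as pd

# Hypothetical frame with an extra column 'C' that is not in the list.
df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
wanted = ['A', 'B']

# Are all of df's columns in `wanted`? No, because 'C' is not.
frame_in_list = bool(df.columns.isin(wanted).all())

# Are all names in `wanted` among df's columns? Yes.
list_in_frame = all(name in df.columns for name in wanted)

print(frame_in_list, list_in_frame)
```

The two checks only agree when the list and the frame's columns contain exactly the same names.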
Upvotes: 4
Reputation: 862641
You can use Index.isin:
import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})
print (df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3
If you need to check that at least one of the names is a column, use any:
cols = ['A', 'B']
print (df.columns.isin(cols).any())
True
cols = ['W', 'B']
print (df.columns.isin(cols).any())
True
cols = ['W', 'Z']
print (df.columns.isin(cols).any())
False
If you need to check that all of the dataframe's columns are in the list, use all:
cols = ['A', 'B', 'C','D','E','F']
print (df.columns.isin(cols).all())
True
cols = ['W', 'Z']
print (df.columns.isin(cols).all())
False
Upvotes: 17