Hendrik Wiese

Reputation: 2219

Check for existence of multiple columns

Is there a more sophisticated way to check whether a dataframe df contains two columns named Column 1 and Column 2 than the following?

if numpy.all(list(map(lambda c: c in df.columns, ['Column 1', 'Column 2']))):
    do_something()

Upvotes: 12

Views: 7118

Answers (4)

abdelgha4

Reputation: 431

To check whether a list of items exists among a dataframe's columns while still using isin, you can do the following:

col_list = ['A', 'B']
pd.Index(col_list).isin(df.columns).all()

As explained in the accepted answer, .all() checks whether all items in col_list are present in the columns, while .any() tests whether any of them is present.
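A minimal sketch of the difference between the two, assuming a small made-up dataframe (the column names and the intermediate variable names here are illustrative and not from the answer):

```python
import pandas as pd

# Hypothetical example frame; the column names are illustrative only.
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

col_list = ['A', 'B']
# Element-wise membership of each requested name in df.columns.
mask = pd.Index(col_list).isin(df.columns)
all_present = bool(mask.all())  # every requested column exists
any_present = bool(mask.any())  # at least one requested column exists

# 'Z' is not a column of df, so only the .any() check succeeds here.
partial = pd.Index(['A', 'Z']).isin(df.columns)
partial_all = bool(partial.all())
partial_any = bool(partial.any())
```

Wrapping the results in bool() is optional; it just turns the NumPy boolean into a plain Python one.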

Upvotes: 0

Perico

Reputation: 201

I know it's an old post...

From this answer:

if set(['Column 1', 'Column 2']).issubset(df.columns):
    do_something()

or, a little more elegantly:

if {'Column 1', 'Column 2'}.issubset(df.columns):
    do_something()
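If it also helps to know which columns are missing, the same set idea can be extended with a set difference. This is only a sketch: missing_columns is a hypothetical helper name, and the example frame is made up.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the required column names that df lacks, sorted."""
    return sorted(set(required) - set(df.columns))

# Hypothetical frame that has 'Column 1' but not 'Column 2'.
df = pd.DataFrame({'Column 1': [1], 'Column 3': [2]})

missing = missing_columns(df, ['Column 1', 'Column 2'])
if missing:
    print(f"Missing columns: {missing}")
```

An empty list means every required column is present, which is the same condition that issubset tests.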

Upvotes: 20

elPastor

Reputation: 8966

The one issue with the given answer (and maybe it works for the OP) is that it tests to see if all of the dataframe's columns are in a given list - but not that all of the given list's items are in the dataframe columns.

My solution was:

test = all(i in df.columns for i in ['A', 'B'])

where test is simply True or False.
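To make the direction of the check concrete, here is a small sketch with a made-up dataframe that has one more column than the list being tested (the variable names are illustrative):

```python
import pandas as pd

# Hypothetical frame with an extra column 'C' that is not in the list.
df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
wanted = ['A', 'B']

# Are all of df's columns in `wanted`? No, because 'C' is not.
frame_in_list = bool(df.columns.isin(wanted).all())

# Are all of `wanted` among df's columns? Yes.
list_in_frame = all(c in df.columns for c in wanted)
```

The two checks only agree when the list and the dataframe's columns contain exactly the same names.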

Upvotes: 4

jezrael

Reputation: 862641

You can use Index.isin:

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3

If you need to check that at least one value is present, use any:

cols = ['A', 'B']
print (df.columns.isin(cols).any())
True

cols = ['W', 'B']
print (df.columns.isin(cols).any())
True

cols = ['W', 'Z']
print (df.columns.isin(cols).any())
False

If you need to check that all of the dataframe's columns are in the list, use all:

cols = ['A', 'B', 'C','D','E','F']
print (df.columns.isin(cols).all())
True

cols = ['W', 'Z']
print (df.columns.isin(cols).all())
False

Upvotes: 17
