vinay karagod

Reputation: 254

How to check with multiple substrings to get the column names in python?

I have a pandas DataFrame, and I am selecting column names based on the condition below:

spike_cols = [col for col in nodes.columns if 'Num' in col]
print(spike_cols)

Now I want to check for multiple substrings and pull every column whose name contains any one of them:

spike_cols = [col for col in nodes.columns if ('Num'|'Lice') in col]
print(spike_cols)

But I am getting the error below:

TypeError: unsupported operand type(s) for |: 'str' and 'str'

Upvotes: 0

Views: 1142

Answers (3)

rahlf23

Reputation: 9019

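You can use str.contains on the column index (Index.str.contains, the counterpart of Series.str.contains), which accepts a regular expression: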

df[df.columns[df.columns.str.contains(r'Num|Lice')]]

If all you want is the column names themselves:

df.columns[df.columns.str.contains(r'Num|Lice')].tolist()
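
One caveat: str.contains treats its pattern as a regular expression by default, so substrings containing characters such as ( or $ would need escaping. A minimal sketch of that, assuming a made-up example frame and a hypothetical list named substrings (neither comes from the question):

```python
import re
import pandas as pd

# Hypothetical example frame; the column names are made up for illustration.
df = pd.DataFrame({'HelloNum': [1, 2], 'World': [3, 4], 'Cost($)': [5, 6]})

# Hypothetical list of literal substrings to look for.
substrings = ['Num', '($)']

# Escape each substring so characters like '(' and '$' match literally,
# then join with '|' to build an "any of these" pattern.
pattern = '|'.join(re.escape(s) for s in substrings)

matching_cols = df.columns[df.columns.str.contains(pattern)].tolist()
print(matching_cols)
```

For plain alphanumeric substrings like 'Num' and 'Lice' the escaping changes nothing, so the shorter form above works as is.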

Upvotes: 2

Erfan

Reputation: 42916

You can use DataFrame.filter for this, together with its regex argument:

import pandas as pd

# Create example dataframe
df = pd.DataFrame({'HelloNum': [1, 2],
                   'World': [3, 4],
                   'This': [5, 6],
                   'ExampleLice': [7, 8]})

print(df)

   HelloNum  World  This  ExampleLice
0         1      3     5            7
1         2      4     6            8

Apply DataFrame.filter to keep only the columns whose names match:

print(df.filter(regex='Num|Lice'))
   HelloNum  ExampleLice
0         1            7
1         2            8

Get the column names as a list:

df.filter(regex='Num|Lice').columns.tolist()

['HelloNum', 'ExampleLice']

Upvotes: 2

Mahmoud Elshahat

Reputation: 1959

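The | operator is not defined between two strings, which is why the error appears. Repeat the in test for each substring and combine the tests with or: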

spike_cols = [col for col in nodes.columns if ('Num' in col or 'Lice' in col)]
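
If the number of substrings grows, the chained or can be written with any() instead. A minimal sketch, where columns is a plain list standing in for nodes.columns and substrings is a hypothetical name (both are assumptions for illustration):

```python
# Plain list standing in for nodes.columns (an assumption for illustration).
columns = ['HelloNum', 'World', 'This', 'ExampleLice']

# Hypothetical list of substrings; a column is kept if it contains any of them.
substrings = ['Num', 'Lice']

spike_cols = [col for col in columns if any(s in col for s in substrings)]
print(spike_cols)  # ['HelloNum', 'ExampleLice']
```

With a real DataFrame, iterating over nodes.columns in place of columns should behave the same way, since each column label is tested with a plain in check.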

Upvotes: 1
