ParalysisByAnalysis

Reputation: 733

Conditionally change column dtype based on column subtext

I have the following data:

df
A_Key  B_ID  C_Key  D_NA
123    22    343.0  23
121    23    45.4   52

Calling df.dtypes returns the following:

df.dtypes

A_Key  int64
B_ID   int64
C_Key  float64
D_NA   int64

How can I conditionally change the dtype of every column whose name contains "Key" or "ID" to object? I have over a hundred columns in my actual dataframe, so I am hoping for a lookup-based method rather than listing each column.

My current method uses the following code, but it isn't Pythonic and requires hardcoding each column name:

for col in ['A_Key',
            'B_ID',
            'C_Key']:
    df[col] = df[col].astype('object')

My df.dtypes output should then show as follows:

df.dtypes

A_Key  object
B_ID   object
C_Key  object
D_NA   int64

Thank you ahead of time for your help.

Upvotes: 2

Views: 336

Answers (3)

cs95

Reputation: 402814

Use case-insensitive regex matching with str.contains:

m = df.columns.str.contains('(?i)key|id')
df.iloc[:, m] = df.iloc[:, m].astype(object)

df.dtypes

A_Key    object
B_ID     object
C_Key    object
D_NA      int64
dtype: object
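
As a hedged variant of the mask approach above: newer pandas releases may try to write values in place when assigning through .iloc, so the sketch below avoids indexer assignment and passes a column-to-dtype mapping to DataFrame.astype instead. The sample frame is rebuilt from the question's data, and the name `mask` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A_Key": [123, 121],
    "B_ID": [22, 23],
    "C_Key": [343.0, 45.4],
    "D_NA": [23, 52],
})

# Boolean mask over the column labels, matched case-insensitively
mask = df.columns.str.contains("key|id", case=False)

# astype accepts a {column: dtype} mapping and returns a new DataFrame
df = df.astype({col: object for col in df.columns[mask]})

print(df.dtypes)
```

Note that a case-insensitive substring match on "id" would also pick up a column named, say, "Width", so a stricter pattern may be safer on a wide dataframe.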

Upvotes: 1

Vaishali

Reputation: 38415

Select the matching column names with str.contains, then cast only those columns:

cols = df.columns[df.columns.str.contains('Key|ID')]
df[cols] = df[cols].astype('O')

print(df.dtypes)

A_Key    object
B_ID     object
C_Key    object
D_NA      int64
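
A possible alternative for picking the columns is DataFrame.filter with its regex parameter. The sketch below assumes, going beyond what the question states, that "Key" and "ID" always appear as an underscore-separated suffix, so the pattern is anchored to the end of the name; drop the anchor if that assumption does not hold for your data.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A_Key": [123, 121],
    "B_ID": [22, 23],
    "C_Key": [343.0, 45.4],
    "D_NA": [23, 52],
})

# filter(regex=...) keeps the columns whose label matches via re.search;
# the "_...$" anchoring is an assumption about the naming scheme
cols = df.filter(regex=r"_(Key|ID)$").columns

# Assigning by column label replaces the columns, so the new dtype sticks
df[cols] = df[cols].astype("O")

print(df.dtypes)
```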

Upvotes: 2

Ankur Ankan

Reputation: 3066

This should work:

for col in df.columns:
    # The substring test is case-sensitive, so match 'Key' as it appears in the column names
    if 'Key' in col or 'ID' in col:
        df[col] = df[col].astype('object')
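
If the list of substrings is likely to grow, the same loop can be wrapped in a small helper. This is only a sketch: the function name `cast_matching_to_object` and its `substrings` parameter are hypothetical, and it assumes all column labels are strings.

```python
import pandas as pd


def cast_matching_to_object(frame, substrings=("Key", "ID")):
    """Return a copy of frame with columns whose name contains any substring cast to object."""
    # Case-sensitive substring test, as in the loop above
    matched = [col for col in frame.columns
               if any(s in col for s in substrings)]
    # astype with a mapping returns a new DataFrame and leaves frame untouched
    return frame.astype({col: object for col in matched})


# Sample data copied from the question
df = pd.DataFrame({
    "A_Key": [123, 121],
    "B_ID": [22, 23],
    "C_Key": [343.0, 45.4],
    "D_NA": [23, 52],
})

result = cast_matching_to_object(df)
print(result.dtypes)
```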

Upvotes: 2
