sos.cott

Reputation: 437

How do I change column names if the column name contains a certain substring?

I would like to drop the first three characters of my column name if the column name contains '_'.

My current column names look like this:

US_aaa   NL_bbb   CN_ccc
  abc     def      ghi
  123     345      456

I would like my data to look like:

aaa    bbb    ccc
abc    def    ghi
123    345    456

My current code looks like this:

for col in category.columns():
    if "_" in col:
        category[col]=category[col][3:]

Not sure what I'm doing wrong.

Upvotes: 0

Views: 1729

Answers (2)

Space Impact

Reputation: 13255

The line after your if statement is not correct: it slices the column's rows rather than its name. Also, df.columns is an attribute, not a method, so it takes no (). The code below gives the desired column names:

df
    US_aaa  NL_bbb  CN_ccc
0   abc     def     ghi
1   123     345     456

df.columns = [col[3:] if '_' in col else col for col in df.columns]
df.columns
Index(['aaa', 'bbb', 'ccc'], dtype='object')

df
    aaa     bbb     ccc
0   abc     def     ghi
1   123     345     456
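
The same conditional rename can also be written with DataFrame.rename, which returns a new frame instead of overwriting df.columns. This is only a sketch: the helper name strip_prefix is made up for illustration, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample frame rebuilt from the question's data
df = pd.DataFrame([['abc', 'def', 'ghi'], [123, 345, 456]],
                  columns=['US_aaa', 'NL_bbb', 'CN_ccc'])

def strip_prefix(col):
    # Hypothetical helper: drop the first three characters
    # only when the name contains an underscore
    return col[3:] if '_' in col else col

# rename applies the function to every column label and returns a copy
renamed = df.rename(columns=strip_prefix)
print(list(renamed.columns))
```

Since rename does not modify df, assign the result back (df = df.rename(...)) if you want to keep the new names.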

Upvotes: 3

Trenton McKinney

Reputation: 62453

This solution doesn't care how many characters come before the '_'; it keeps whatever follows the last '_'.

import pandas as pd

df = pd.DataFrame([['abc', 'def', 'ghi'], [123, 456, 789]], columns=['US_aaa', 'NL_bbb', 'CN_ccc'])

    US_aaa  NL_bbb  CN_ccc
0   abc     def     ghi
1   123     456     789

df.columns = [x.split('_')[-1] for x in df.columns]

    aaa     bbb     ccc
0   abc     def     ghi
1   123     456     789

If a column name has no '_' (USaaa below):

    USaaa   NL_bbb  CN_ccc
0   abc     def     ghi
1   123     456     789

you get:

    USaaa   bbb     ccc
0   abc     def     ghi
1   123     456     789
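
One thing to watch for: split('_')[-1] keeps only the text after the last '_'. If a name could contain more than one underscore, split('_', 1)[-1] keeps everything after the first one instead. A small sketch, with column names invented for illustration (they are not from the question):

```python
import pandas as pd

# Made-up column names: one with two underscores, one with none
df = pd.DataFrame([['abc', 'def']], columns=['US_aaa_x', 'NLbbb'])

# Keep only the piece after the last underscore
last_part = [c.split('_')[-1] for c in df.columns]

# Keep everything after the first underscore (maxsplit=1)
after_first = [c.split('_', 1)[-1] for c in df.columns]

print(last_part)
print(after_first)
```

In both variants a name without '_' comes back unchanged, because split returns a one-element list.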

Upvotes: 2
