fred.schwartz
fred.schwartz

Reputation: 2155

change column name pandas

Is it possible to change the name of a column in a pandas data frame if it starts with a certain word.

e.e If column starts with DEP then change the full name to KEEP.

[col for col in df if col.startswith('DEP') then KEEP].

Upvotes: 6

Views: 3166

Answers (4)

BENY
BENY

Reputation: 323226

I agree with jpp , do not using the duplicated column names

df.columns=df.columns.str.replace('DEP','KEEP')
df
Out[866]: 
  KEEP1  B  C KEEP2
0     a  4  7     a
1     b  5  8     a
2     c  4  9     a
3     d  5  4     b
4     e  5  2     b
5     f  4  3     b

Upvotes: 3

jezrael
jezrael

Reputation: 862481

Is it possible, but not recommended, because get duplicated columns names:

df = pd.DataFrame({
        'DEP1':list('abcdef'),
         'B':[4,5,4,5,5,4],
         'C':[7,8,9,4,2,3],
         'DEP2':list('aaabbb')
})

print (df)
  DEP1  B  C DEP2
0    a  4  7    a
1    b  5  8    a
2    c  4  9    a
3    d  5  4    b
4    e  5  2    b
5    f  4  3    b

df.columns = ['KEEP' if col.startswith('DEP') else col for col in df]
print (df)
  KEEP  B  C KEEP
0    a  4  7    a
1    b  5  8    a
2    c  4  9    a
3    d  5  4    b
4    e  5  2    b
5    f  4  3    b

So then if select column KEEP it return all duplicated columns in DataFrame:

print (df['KEEP'])
  KEEP KEEP
0    a    a
1    b    a
2    c    a
3    d    b
4    e    b
5    f    b

So if want filter all columns starting with DEP use filter with regex ^ for starting string:

df1 = df.filter(regex='^DEP')
#alternative solution
#df1 = df.loc[:, df.columns.str.startswith('DEP')]
print (df1)
  DEP1 DEP2
0    a    a
1    b    a
2    c    a
3    d    b
4    e    b
5    f    b

Upvotes: 8

jpp
jpp

Reputation: 164623

This isn't what you've requested, but I strongly recommend you keep characters after DEP to differentiate your column labels. The reason is much of Pandas functionality becomes unusable and you lose information by using duplicate labels.

One way is to use a dictionary mapping:

df = pd.DataFrame({'DEP123': [1, 2, 3], 'Test': [4, 5, 6],
                   'DEP983': [7, 8, 9], 'Test2': [10, 11, 12]})

mapper = {col: f'KEEP{col[3:]}' for col in df.filter(regex='^DEP')}
df = df.rename(columns=mapper)

print(df)

   KEEP123  KEEP983  Test  Test2
0        1        7     4     10
1        2        8     5     11
2        3        9     6     12

Upvotes: 3

piRSquared
piRSquared

Reputation: 294218

You can pass a callable to rename

df.rename(columns=lambda x: 'KEEP' if x.startswith('DEP') else x)

You can either reassign this to the name df or use the inplace=True argument.

Upvotes: 5

Related Questions