Reputation: 2155
Is it possible to change the name of a column in a pandas data frame if it starts with a certain word.
e.e If column starts with DEP then change the full name to KEEP.
[col for col in df if col.startswith('DEP') then KEEP].
Upvotes: 6
Views: 3166
Reputation: 323226
I agree with jpp , do not using the duplicated column names
df.columns=df.columns.str.replace('DEP','KEEP')
df
Out[866]:
KEEP1 B C KEEP2
0 a 4 7 a
1 b 5 8 a
2 c 4 9 a
3 d 5 4 b
4 e 5 2 b
5 f 4 3 b
Upvotes: 3
Reputation: 862481
Is it possible, but not recommended, because get duplicated columns names:
df = pd.DataFrame({
'DEP1':list('abcdef'),
'B':[4,5,4,5,5,4],
'C':[7,8,9,4,2,3],
'DEP2':list('aaabbb')
})
print (df)
DEP1 B C DEP2
0 a 4 7 a
1 b 5 8 a
2 c 4 9 a
3 d 5 4 b
4 e 5 2 b
5 f 4 3 b
df.columns = ['KEEP' if col.startswith('DEP') else col for col in df]
print (df)
KEEP B C KEEP
0 a 4 7 a
1 b 5 8 a
2 c 4 9 a
3 d 5 4 b
4 e 5 2 b
5 f 4 3 b
So then if select column KEEP
it return all duplicated columns in DataFrame:
print (df['KEEP'])
KEEP KEEP
0 a a
1 b a
2 c a
3 d b
4 e b
5 f b
So if want filter all columns starting with DEP
use filter
with regex ^
for starting string:
df1 = df.filter(regex='^DEP')
#alternative solution
#df1 = df.loc[:, df.columns.str.startswith('DEP')]
print (df1)
DEP1 DEP2
0 a a
1 b a
2 c a
3 d b
4 e b
5 f b
Upvotes: 8
Reputation: 164623
This isn't what you've requested, but I strongly recommend you keep characters after DEP
to differentiate your column labels. The reason is much of Pandas functionality becomes unusable and you lose information by using duplicate labels.
One way is to use a dictionary mapping:
df = pd.DataFrame({'DEP123': [1, 2, 3], 'Test': [4, 5, 6],
'DEP983': [7, 8, 9], 'Test2': [10, 11, 12]})
mapper = {col: f'KEEP{col[3:]}' for col in df.filter(regex='^DEP')}
df = df.rename(columns=mapper)
print(df)
KEEP123 KEEP983 Test Test2
0 1 7 4 10
1 2 8 5 11
2 3 9 6 12
Upvotes: 3
Reputation: 294218
You can pass a callable to rename
df.rename(columns=lambda x: 'KEEP' if x.startswith('DEP') else x)
You can either reassign this to the name df
or use the inplace=True
argument.
Upvotes: 5