Reputation: 295
I have a DataFrame like this:
col1 col2 col3 col4 col5 col6 col7 col8
0 5345 rrf rrf rrf rrf rrf rrf
1 2527 erfr erfr erfr erfr erfr erfr
2 2727 f f f f f f
I would like to rename all columns except col1 and col2.
So I tried a loop:
print(df.columns)
for col in df.columns:
    if col != 'col1' and col != 'col2':
        col.rename = str(col) + '_x'
But this is not very efficient, and it doesn't work!
Upvotes: 13
Views: 27773
Reputation: 4610
As EdChum suggested, I used str.contains and ~ to filter out the columns:
cols = df.columns[~df.columns.str.contains('col1|col2')]
Then I used the pandas rename function:
df.rename(columns={col: col + '_x' for col in df.columns if col in cols}, inplace=True)
P.S. df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True) did not work in my case.
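One thing to watch with the pattern above: str.contains matches substrings, so 'col1|col2' also matches names such as col10. A minimal sketch of the difference, assuming hypothetical column names (col10 and col12 are not in the question) and an anchored pattern as one possible fix:

```python
import pandas as pd

# Hypothetical column names: 'col10' and 'col12' contain 'col1' as a substring.
columns = pd.Index(['col1', 'col2', 'col3', 'col10', 'col12'])

# Unanchored pattern: excludes every name that merely contains 'col1' or 'col2'.
loose = columns[~columns.str.contains('col1|col2')]

# Anchored pattern: excludes only names that are exactly 'col1' or 'col2'.
# The group is non-capturing to avoid the pandas warning about match groups.
strict = columns[~columns.str.contains(r'^(?:col1|col2)$')]

loose_names = list(loose)
strict_names = list(strict)
print(loose_names)   # ['col3']
print(strict_names)  # ['col3', 'col10', 'col12']
```

With the eight columns from the question both patterns select the same names, so this only matters once the frame has more columns.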
Upvotes: 1
Reputation: 7923
You can use the DataFrame.rename() method:
new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
df.rename(columns = dict(new_names), inplace=True)
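rename also accepts a function instead of a dict, which avoids building the mapping by hand. A small sketch under a few assumptions: the four-column frame and the `keep` set are illustrative only, and the call returns a new frame rather than using inplace=True:

```python
import pandas as pd

df = pd.DataFrame(
    [[0, 5345, 'rrf', 'rrf'], [1, 2527, 'erfr', 'erfr'], [2, 2727, 'f', 'f']],
    columns=['col1', 'col2', 'col3', 'col4'],
)

keep = {'col1', 'col2'}  # illustrative: names to leave untouched

# rename() calls the function once per column label and returns a new frame.
renamed = df.rename(columns=lambda name: name if name in keep else name + '_x')

print(list(renamed.columns))  # ['col1', 'col2', 'col3_x', 'col4_x']
print(list(df.columns))       # original frame is unchanged
```

Selecting by name rather than by position (iloc[:, 2:]) means the result does not depend on where col1 and col2 sit in the frame.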
Upvotes: 23
Reputation: 863741
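Simplest solution if col1 and col2 are the first and second column names (Index.union sorts the result, so this also relies on the new names already being in sorted order):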
df.columns = df.columns[:2].union(df.columns[2:] + '_x')
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
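To see why the sorting matters, here is a minimal sketch comparing Index.union with Index.append. The column names are hypothetical (col10 is not in the question) and were chosen so that sorted order differs from positional order:

```python
import pandas as pd

# Hypothetical names where sorted order differs from positional order.
columns = pd.Index(['col1', 'col2', 'col3', 'col10'])

head = columns[:2]
tail = columns[2:] + '_x'

via_union = list(head.union(tail))    # union sorts the combined labels
via_append = list(head.append(tail))  # append keeps positional order

print(via_union)   # ['col1', 'col10_x', 'col2', 'col3_x']
print(via_append)  # ['col1', 'col2', 'col3_x', 'col10_x']
```

Assigning the union result back to df.columns in this case would attach the wrong labels to the data, so append looks like the safer choice when names may not sort in positional order.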
Another solution with isin or a list comprehension:
cols = df.columns[~df.columns.isin(['col1','col2'])]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
cols = [col for col in df.columns if col not in ['col1', 'col2']]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']
df.rename(columns = {col: col + '_x' for col in cols}, inplace=True)
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
The fastest is list comprehension:
df.columns = [col+'_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
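If this is needed in more than one place, the comprehension can be wrapped in a small helper. This is only a sketch: the name `add_suffix_except` and its parameters are made up for illustration, and it returns a copy instead of modifying the frame in place.

```python
import pandas as pd


def add_suffix_except(frame, keep, suffix='_x'):
    """Return a copy of frame with suffix appended to every column not in keep."""
    keep = set(keep)
    out = frame.copy()
    out.columns = [c if c in keep else f'{c}{suffix}' for c in out.columns]
    return out


df = pd.DataFrame(
    [[0, 5345, 'rrf', 'rrf'], [1, 2527, 'erfr', 'erfr'], [2, 2727, 'f', 'f']],
    columns=['col1', 'col2', 'col3', 'col4'],
)

result = add_suffix_except(df, keep=['col1', 'col2'])
print(list(result.columns))  # ['col1', 'col2', 'col3_x', 'col4_x']
```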
Timings:
In [350]: %timeit (akot(df))
1000 loops, best of 3: 387 µs per loop
In [351]: %timeit (jez(df1))
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 207 µs per loop
In [363]: %timeit (jez3(df2))
The slowest run took 6.41 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 75.7 µs per loop
df1 = df.copy()
df2 = df.copy()

def jez(df):
    df.columns = df.columns[:2].union(df.columns[2:] + '_x')
    return df

def akot(df):
    new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
    df.rename(columns = dict(new_names), inplace=True)
    return df

def jez3(df):
    df.columns = [col + '_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
    return df

print (akot(df))
print (jez(df1))
print (jez3(df2))
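Since these functions modify the frame they receive, repeated timing runs keep appending the suffix to the same columns. A rough sketch of timing each variant on a fresh copy with the standard timeit module follows; the function names, the four-column frame and the `number=200` repeat count are all assumptions for illustration, and the cost of the copy is included in every measurement.

```python
import timeit

import pandas as pd

df = pd.DataFrame(
    [[0, 5345, 'rrf', 'rrf'], [1, 2527, 'erfr', 'erfr'], [2, 2727, 'f', 'f']],
    columns=['col1', 'col2', 'col3', 'col4'],
)


def with_rename(frame):
    # Build an old-name -> new-name mapping for everything after the first two columns.
    cols = frame.columns[2:]
    return frame.rename(columns=dict(zip(cols, cols + '_x')))


def with_comprehension(frame):
    # Assign the full list of new labels directly.
    frame.columns = [c if c in ('col1', 'col2') else c + '_x' for c in frame.columns]
    return frame


def seconds_per_call(func, frame, number=200):
    # frame.copy() runs inside the timed call so every run starts from the original names.
    total = timeit.timeit(lambda: func(frame.copy()), number=number)
    return total / number


timings = {f.__name__: seconds_per_call(f, df) for f in (with_rename, with_comprehension)}
for name, secs in timings.items():
    print(f'{name}: {secs * 1e6:.1f} us per call')
```

Absolute numbers will differ by machine and pandas version, so only the relative order is worth reading from it.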
Upvotes: 11
Reputation: 394459
You can use str.contains with a regex pattern to filter the columns of interest, then use zip to construct a dict and pass it as the argument to rename:
In [94]:
cols = df.columns[~df.columns.str.contains('col1|col2')]
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
df
Out[94]:
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
Here, str.contains filters the columns by name and returns those that don't match, so the column order is irrelevant.
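A short sketch of what "order is irrelevant" means in practice, assuming a hypothetical frame in which col1 and col2 are not the first two columns. It uses isin for an exact-name match instead of str.contains, and compares the result with a purely positional rename:

```python
import pandas as pd

# Hypothetical frame in which col1 and col2 are not the first two columns.
df = pd.DataFrame(
    [['rrf', 0, 'rrf', 5345]],
    columns=['col3', 'col1', 'col4', 'col2'],
)

# Name-based selection: pick every column that is not col1 or col2.
cols = df.columns[~df.columns.isin(['col1', 'col2'])]
by_name = df.rename(columns=dict(zip(cols, cols + '_x')))

# Position-based selection: assumes the columns to keep come first.
by_position = df.copy()
by_position.columns = list(df.columns[:2]) + list(df.columns[2:] + '_x')

print(list(by_name.columns))      # ['col3_x', 'col1', 'col4_x', 'col2']
print(list(by_position.columns))  # ['col3', 'col1', 'col4_x', 'col2_x']
```

Only the name-based version leaves col1 and col2 untouched here.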
Upvotes: 5