Reputation: 1607
I have a CSV file in which I want to rename some of the columns to the same name. My initial code looks like this:
df = pd.read_csv('New.csv')
I extracted selected columns from the dataframe with this code:
df.columns[1::3]
This slicing gets every third column. Now I want to rename each of those columns to the same name, but trying to rename them like this gives me an error:
df.columns[1::3]= ['SomeName']
raise TypeError("Index does not support mutable operations")
Is there any way to rename multiple columns to the same name in pandas?
Any suggestions other than doing this manually?
Upvotes: 4
Views: 11443
Reputation: 164823
The data underlying pandas index objects are numpy arrays. You can take advantage of this fact to slice and assign using numpy conventions.
Data from @jezrael. Extracting the values explicitly is necessary because of this known issue.
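import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4, 5, 4, 5, 5, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')})

# Extract the column labels as a numpy array, assign to the slice, then set them back.
arr = df.columns.values
arr[1::3] = range(2)
df.columns = arr
print(df)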
   A  0  C  D  1  F
0  a  4  7  1  5  a
1  b  5  8  3  3  a
2  c  4  9  5  6  a
3  d  5  4  7  9  b
4  e  5  2  1  2  b
5  f  4  3  0  4  b
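A variation on the same idea, as a minimal sketch: it works on an explicit copy of the labels (via `Index.to_numpy` with `copy=True`), so the existing Index is never modified in place, and it assigns the single repeated name the question asks for. The name `'SomeName'` is taken from the question; the variable names `cols` and `new_labels` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4, 5, 4, 5, 5, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')})

# Copy the labels into an object-dtype numpy array; the Index itself is untouched.
cols = df.columns.to_numpy(dtype=object, copy=True)

# Assigning a scalar to a slice writes it into every selected position.
cols[1::3] = 'SomeName'

df.columns = cols
new_labels = list(df.columns)
print(new_labels)
```

Note that this leaves the frame with two columns sharing one label, which makes label-based selection ambiguous.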
Upvotes: 1
Reputation: 11
df.columns is of type pandas.indexes.base.Index, which is immutable; that is why you are getting the TypeError. If you convert it to a list, you can update (rename) the labels using the slice and then set df.columns to the updated list.
This works for me:
lst = list(df.columns)
lst[1::3] = ['someName']*len(lst[1::3])
df.columns = lst
Or, to have unique new column names (as @jezrael pointed out, using the same name might not be recommended):
lst = list(df.columns)
lst[1::3] = ['someName{}'.format(x) for x in range(len(lst[1::3]))]
df.columns = lst
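The two variants above could be wrapped in one small function. This is only a sketch: `rename_every_nth` and its parameters are hypothetical names, not part of pandas, and the one-row frame is made up for the demonstration.

```python
import pandas as pd

def rename_every_nth(df, start, step, name, unique=False):
    """Hypothetical helper: relabel the columns at positions start, start+step, ..."""
    labels = list(df.columns)
    count = len(labels[start::step])
    if unique:
        # someName0, someName1, ... so every label stays distinct
        new = ['{}{}'.format(name, i) for i in range(count)]
    else:
        # the same label repeated once per selected column
        new = [name] * count
    labels[start::step] = new
    out = df.copy()          # leave the caller's frame unchanged
    out.columns = labels
    return out

df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=list('ABCDEF'))
same = rename_every_nth(df, 1, 3, 'someName')
uniq = rename_every_nth(df, 1, 3, 'someName', unique=True)
print(list(same.columns))
print(list(uniq.columns))
```

Returning a copy instead of assigning to `df.columns` directly is a design choice made for this sketch; assigning in place, as in the snippets above, works just as well.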
Upvotes: 1
Reputation: 863611
I think the best approach here is to use rename with unique new column names, like:
df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4, 5, 4, 5, 5, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')})
print (df)
   A  B  C  D  E  F
0  a  4  7  1  5  a
1  b  5  8  3  3  a
2  c  4  9  5  6  a
3  d  5  4  7  9  b
4  e  5  2  1  2  b
5  f  4  3  0  4  b
d = dict(zip(df.columns[1::3], range(len(df.columns[1::3]))))
print (d)
{'B': 0, 'E': 1}
df = df.rename(columns=d)
print (df)
   A  0  C  D  1  F
0  a  4  7  1  5  a
1  b  5  8  3  3  a
2  c  4  9  5  6  a
3  d  5  4  7  9  b
4  e  5  2  1  2  b
5  f  4  3  0  4  b
Or:
d = dict(zip(df.columns[1::3],
             ['name{}'.format(x) for x in range(len(df.columns[1::3]))]))
print (d)
{'B': 'name0', 'E': 'name1'}
df = df.rename(columns=d)
print (df)
   A  name0  C  D  name1  F
0  a      4  7  1      5  a
1  b      5  8  3      3  a
2  c      4  9  5      6  a
3  d      5  4  7      9  b
4  e      5  2  1      2  b
5  f      4  3  0      4  b
A solution that is not recommended is to rename the columns to the same name:
d = dict.fromkeys(df.columns[1::3], 'Name')
print (d)
{'B': 'Name', 'E': 'Name'}
df = df.rename(columns=d)
print (df)
   A  Name  C  D  Name  F
0  a     4  7  1     5  a
1  b     5  8  3     3  a
2  c     4  9  5     6  a
3  d     5  4  7     9  b
4  e     5  2  1     2  b
5  f     4  3  0     4  b
because if you want to select the column Name, it returns all columns with that name as a DataFrame:
print (df['Name'])
   Name  Name
0     4     5
1     5     3
2     4     6
3     5     9
4     5     2
5     4     4
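If duplicate labels are used anyway, a single column can still be reached by position, and the repeated labels can be detected. The following is a minimal sketch on a made-up two-row frame; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'Name', 'Name'])

# Label-based selection returns every column called 'Name', as a DataFrame.
by_label = df['Name']

# Position-based selection is unambiguous and returns a single Series.
second_col = df.iloc[:, 1]

# Index.duplicated() flags repeated labels; the first occurrence is not flagged.
dup_mask = df.columns.duplicated()

print(type(by_label).__name__, type(second_col).__name__, dup_mask.tolist())
```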
Upvotes: 3