Reputation: 61
I'm looking for a way to duplicate every column in a dataframe, with each duplicate named after its original column plus a '_2' suffix.
Example:
import pandas as pd
d = {'col1': [1, 2], 'col2': [3, 4]}
start_df = pd.DataFrame(data=d)
d2 = {'col1': [1, 2], 'col1_2': [1, 2], 'col2': [3, 4], 'col2_2': [3, 4]}
end_df = pd.DataFrame(data=d2)
Thanks.
Upvotes: 0
Views: 2693
Reputation: 31
Adding to Akmal Soliev's answer: if you want each duplicated column directly after its original column, adjust his code as follows:
import pandas as pd
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
# df.columns is read once when the loop starts, so inserting inside the loop is safe
for col in df.columns:
    # look up the column's current position and place the copy right after it
    df.insert(df.columns.get_loc(col) + 1, col + '_2', df[col])
df
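One way to confirm the column order and values is to compare the result with the end_df from the question. This is only a sketch: it assumes the copies should hold the original values, and it uses pandas.testing.assert_frame_equal, which raises an AssertionError if the two frames differ.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Target layout taken from the question
end_df = pd.DataFrame({'col1': [1, 2], 'col1_2': [1, 2],
                       'col2': [3, 4], 'col2_2': [3, 4]})

for col in df.columns:
    # insert each copy immediately after its original column
    df.insert(df.columns.get_loc(col) + 1, col + '_2', df[col])

# Raises AssertionError if labels, order, dtypes, or values differ
assert_frame_equal(df, end_df)
```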
Upvotes: 1
Reputation: 742
Use the .insert() function:
import pandas as pd
d = {'col1': [1, 2], 'col2': [3, 4]}
start_df = pd.DataFrame(data=d)
for i, col in enumerate(start_df.columns):
    start_df.insert(i+1, col+'_2', start_df[col])
start_df
output:
Out[1]:
   col1  col1_2  col2_2  col2
0     1       1       3     3
1     2       2       4     4
Upvotes: 1
Reputation: 262519
NB. this answer demonstrates a generalization of the process
Without any loop to build the dataframe, you can simply use the repeat method of the columns index.
Then you can set the column names programmatically with a list comprehension.
For 2 repeats:
end_df = start_df[start_df.columns.repeat(2)]
end_df.columns = [f'{a}{b}' for a in start_df for b in ('', '_2')]
output:
   col1  col1_2  col2  col2_2
0     1       1     3       3
1     2       2     4       4
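To make the two steps easier to follow, here is a small sketch that keeps the intermediate values around. The names repeated, labels and new_names are only illustrative and are not part of the answer above.

```python
import pandas as pd

start_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Index.repeat repeats each label consecutively, keeping the original order
repeated = start_df.columns.repeat(2)
labels = list(repeated)  # ['col1', 'col1', 'col2', 'col2']

# Selecting with duplicate labels returns one column per requested label
end_df = start_df[repeated]

# Iterating over a DataFrame yields its column names
new_names = [f'{a}{b}' for a in start_df for b in ('', '_2')]
end_df.columns = new_names
```

Selecting first and renaming afterwards means the data is never touched column by column; only the labels are rewritten.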
Generalization:
n = 5
end_df = start_df[start_df.columns.repeat(n)]
end_df.columns = [f'{a}{b}' for a in start_df
                  for b in ['']+[f'_{x+1}' for x in range(1,n)]]
Example n=5:
   col1  col1_2  col1_3  col1_4  col1_5  col2  col2_2  col2_3  col2_4  col2_5
0     1       1       1       1       1     3       3       3       3       3
1     2       2       2       2       2     4       4       4       4       4
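If this is needed in more than one place, the generalization could be wrapped in a small function. The helper name duplicate_columns and its n parameter are hypothetical, chosen here for illustration. The sketch assumes the suffixed names do not clash with existing columns.

```python
import pandas as pd

def duplicate_columns(df, n=2):
    """Return a new frame with every column repeated n times (hypothetical helper)."""
    # '' for the original, then '_2' ... '_n' for the copies
    suffixes = [''] + [f'_{k}' for k in range(2, n + 1)]
    # .copy() so that renaming does not affect the input frame
    out = df[df.columns.repeat(n)].copy()
    out.columns = [f'{col}{suffix}' for col in df.columns for suffix in suffixes]
    return out

start_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
result = duplicate_columns(start_df, n=3)
```

Because the function returns a new frame, start_df keeps its original two columns.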
Upvotes: 1
Reputation: 1059
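Try this (note that the copies are appended after all of the original columns):
import pandas as pd
d = {'col1': [1, 2], 'col2': [3, 4]}
start_df = pd.DataFrame(data=d)
# start_df.columns is read once, so only the original columns are looped over
for column in start_df.columns:
    start_df[column + '_2'] = start_df[column]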
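With plain assignment the copies end up at the right-hand side of the frame. If the interleaved order from the question is wanted, one option is to reselect the columns afterwards. This is a sketch, and the names original_cols and interleaved are made up for illustration.

```python
import pandas as pd

start_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
original_cols = list(start_df.columns)  # snapshot taken before adding copies

for column in original_cols:
    start_df[column + '_2'] = start_df[column]

# Build the target order: each original followed directly by its copy
interleaved = [name for col in original_cols for name in (col, col + '_2')]
end_df = start_df[interleaved]
```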
Upvotes: 1