Reputation: 1247

Pandas: Adding suffix to column names programmatically while dropping out common columns after concat

I have 3 dataframes where several columns are same. I want to make some of them unique. And remaining of them should be dropped.

df1.columns = ['payeeId', 'name','amount','Category','Rule_No','Short Rule Name','Incident','Flag_Type']
df2.columns = ['payeeId', 'name','amount','Category','Rule_No','Short Rule Name','Incident','Flag_Type']
df3.columns = ['payeeId', 'name','amount','Category','Rule_No','Short Rule Name','Incident','Flag_Type']
df4.columns = ['payeeId', 'name','amount','Category','Rule_No','Short Rule Name','Incident','Flag_Type']

Now I want Short Rule Name, Incident, Rule_No and Flag_Type to be renamed with a suffix, say, df1, df2 and df3. Remaining columns can be dropped from combined df.

If I do the following:

dfs = [df1,df2,df3]
df_f = pd.concat(dfs,axis=1)
df_f = df_f.loc[:,~df_f.columns.duplicated()]

This will remove all duplicated columns and will be left with one set of columns (for df1).

So one way to prevent the unwanted deletion of column could be renaming the selected columns as indicated above.

How can I achieve the same. I want programmatically adding suffix to those selected columns.

Upvotes: 0

Answers (2)

jezrael

Reputation: 863301

You can rename columns from list in list comprehension with enumerate for counter:

cols = ["Short Rule Name", "Incident", "Rule_No", "Flag_Type"]
dfs = [df1, df2, df3]
dfs = [df.rename(columns=dict(zip(cols, [f'{x}_{i}' for x in cols]))) 
       for i, df in enumerate(dfs, 1)]

df_f = pd.concat(dfs,axis=1)
df_f = df_f.loc[:,~df_f.columns.duplicated()]

Sample for first part:

cols = ["Short Rule Name", "Incident", "Rule_No", "Flag_Type"]

df1 = pd.DataFrame(columns=cols)
df2 = pd.DataFrame(columns=cols + ['a', 'b'])
df3 = pd.DataFrame(columns=cols + ['a', 'c'])
df4 = pd.DataFrame(columns=cols + ['b', 'c'])
dfs = [df1, df2, df3, df4]

dfs = [df.rename(columns=dict(zip(cols, [f'{x}_{i}' for x in cols]))) 
       for i, df in enumerate(dfs, 1)]

print (dfs)
[Empty DataFrame
Columns: [Short Rule Name_1, Incident_1, Rule_No_1, Flag_Type_1]
Index: [], Empty DataFrame
Columns: [Short Rule Name_2, Incident_2, Rule_No_2, Flag_Type_2, a, b]
Index: [], Empty DataFrame
Columns: [Short Rule Name_3, Incident_3, Rule_No_3, Flag_Type_3, a, c]
Index: [], Empty DataFrame
Columns: [Short Rule Name_4, Incident_4, Rule_No_4, Flag_Type_4, b, c]
Index: []]

Upvotes: 1

GordyNumberOne

Reputation: 11

You can add a new column with the content of the old one and a new label like:

df1["Short Rule Name df1"] = df1["Short Rule Name"]

Then drop the old column using drop()

And to do it programmatically, you can pop it in a for loop

dfs = [df1, df2, df3]
i=1
for df in dfs:
    df["Short Rule Name df" + str(i)] = df["Short Rule Name"]
    df.drop("Short Rule Name", axis=1, inplace=True)
    i+=1

Upvotes: 0

Pandas: Adding suffix to column names programmatically while dropping out common columns after concat

Answers (2)

Related Questions