Berny
Berny

Reputation: 133

rename column based on conditional

I'm trying to rename column if column name contains a string, it creates a list of all column names and if "Date" is in one of the names then it ranames it. Everything works but the renaming part, no error shows, but then I print column names and the original name is still showing:

for f in files_xlsx:
wb = load_workbook(input_path + "\\" + f, read_only=True)
if 'New Leads' in wb.sheetnames:
    df = pd.read_excel(input_path + "\\" + f, sheet_name="New Leads")
    dtcol = [col for col in df.columns if "Date" in col]
    dtcol2 = str(dtcol)
    df.rename(columns={dtcol2: "Activity Date"}, inplace=True)
    cols = df.columns
    if "SOURCE" in cols:
        if df.SOURCE.map(lambda x: len(x)).max() == 0:
            df['File'] = f
            df_xlsx = pd.concat([df, df_xlsx], ignore_index=True)
            df_xlsx = df_xlsx[["Email","SOURCE","File"]]
        else:
            df_ns = df_ns.append([f], ignore_index=True)
    else:
        df_ns = df_ns.append([f], ignore_index=True)
else:
    df_ns = df_ns.append([f], ignore_index=True)

Upvotes: 1

Views: 1397

Answers (2)

jaydrill
jaydrill

Reputation: 54

Replace your line of code:

dtcol2 = str(dtcol)

with:

dtcol2 = dtcol[0]

This should work. Currently, dtcol is a list with one element: ['Date']
Or your can simplify your operation by:

df.columns = [re.sub(r'^Date$', 'Activity Date', col) for col in df.columns]

to replace what you have:

dtcol = [col for col in df.columns if "Date" in col]
dtcol2 = str(dtcol)
df.rename(columns={dtcol2: "Activity Date"}, inplace=True)

Note:

  • you need to "import re"
  • assume you are certain about the the string you want to search "Date", that is why I did r'^Date$' for search pattern

Upvotes: 0

Celius Stingher
Celius Stingher

Reputation: 18367

The problem is here:

dtcol = [col for col in df.columns if "Date" in col]
dtcol2 = str(dtcol)
df.rename(columns={dtcol2: "Activity Date"}, inplace=True)
cols = df.columns

dtcol is a list, even if it's a single element, when you use str(dtcol) you are creating a string for example '["Date 1"]' which doesn't exist in the columns of df. The function rename() on the other hand, does not generate en error when no value is found, instead it simply doesn't do anything and continues the script, that's why you are never seeing any changes. You would need to iterate over dtcol:

dtcol = [col for col in df.columns if "Date" in col]
for ex in dtcol:
   df.rename(columns={ex: "Activity Date"}, inplace=True)
cols = df.columns

Upvotes: 1

Related Questions