Reputation: 2616

Replace existing column name while adding new columns with empty string to pandas dataframe

Say I have a dataframe like below:

df = pd.DataFrame({0:['Hello World!']}) # here df could have more than one column of data as shown below
df = pd.DataFrame({0:['Hello World!'], 1:['Hello Mars!']}) # or df could have more than one row of data as shown below
df = pd.DataFrame({0:['Hello World!', 'Hello Mars!']})

and I also have a list of column names like below:

new_col_names = ['a','b','c','d'] # here, len(new_col_names) might vary like below
new_col_names = ['a','b','c','d','e'] # but we can always be sure that the len(new_col_names) >= len(df.columns)

Given that, how could I replace the column names in df so that the result looks like the following:

df = pd.DataFrame({0:['Hello World!']})
new_col_names = ['a','b','c','d']
# result would be like this
a               b               c               d
Hello World!    (empty string)  (empty string)  (empty string)


df = pd.DataFrame({0:['Hello World!'], 1:['Hello Mars!']}) 
new_col_names = ['a','b','c','d']
# result would be like this
a               b               c               d
Hello World!    Hello Mars!     (empty string)  (empty string)


df = pd.DataFrame({0:['Hello World!', 'Hello Mars!']})
new_col_names = ['a','b','c','d','e']
a               b               c               d               e
Hello World!    (empty string)  (empty string)  (empty string)  (empty string)
Hello Mars!     (empty string)  (empty string)  (empty string)  (empty string)

From reading around StackOverflow answers such as this, I have a vague idea that it could be something like below:

df[new_col_names] = '' # but this returns KeyError
# or this
df.columns=new_col_names # but this returns ValueError: Length mismatch (of course)

If someone could show me a way to overwrite the existing dataframe column names and, at the same time, add new columns filled with empty strings, I'd greatly appreciate the help.

Upvotes: 1

Answers (3)

Trenton McKinney

Reputation: 62393

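Here is a function that renames the existing columns and adds any remaining names as new columns. Note that the new columns are filled with NaN rather than empty strings.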

I couldn't find a 1-liner, but jezrael did: his answer

import pandas as pd

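# function
def rename_add_col(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    df = df.copy()  # avoid renaming the caller's dataframe in place
    c_len = len(df.columns)
    if c_len == len(cols):
        # same number of names: a plain rename is enough
        df.columns = cols
    else:
        # rename the existing columns positionally ...
        df.columns = cols[:c_len]
        # ... then add the remaining names as new columns (filled with NaN)
        df = pd.concat([df, pd.DataFrame(columns=cols[c_len:])])
    return df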

# create dataframe
t1 = pd.DataFrame({'a': ['1', '2', '3'], 'b': ['4', '5', '6'], 'c': ['7', '8', '9']})

    a   b   c
0   1   4   7
1   2   5   8
2   3   6   9

# call function
cols = ['d', 'e', 'f']
t1 = rename_add_col(t1, cols)

    d   e   f
0   1   4   7
1   2   5   8
2   3   6   9

# call function
cols = ['g', 'h', 'i', 'new1', 'new2']
t1 = rename_add_col(t1, cols)


    g   h   i   new1    new2
0   1   4   7    NaN     NaN
1   2   5   8    NaN     NaN
2   3   6   9    NaN     NaN
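Since the question asks for empty strings rather than NaN, here is a minimal sketch of a variant that assigns '' to each new column directly. The name rename_add_empty is only illustrative and not part of the original answer, and it assumes len(cols) >= len(df.columns), as stated in the question.

```python
import pandas as pd

def rename_add_empty(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    # Hypothetical helper: work on a copy so the caller's dataframe is untouched.
    out = df.copy()
    n_existing = len(out.columns)
    # Rename the existing columns positionally.
    out.columns = cols[:n_existing]
    # Add each remaining name as a column of empty strings.
    for name in cols[n_existing:]:
        out[name] = ''
    return out

t1 = pd.DataFrame({'a': ['1', '2', '3'], 'b': ['4', '5', '6'], 'c': ['7', '8', '9']})
t2 = rename_add_empty(t1, ['g', 'h', 'i', 'new1', 'new2'])
print(t2)
```

Assigning a scalar to a new column broadcasts it to every row, so no separate fillna step is needed.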

Upvotes: 2

jezrael

Reputation: 862511

The idea is to build a dictionary that maps the existing column names to the new ones with zip, rename only the existing columns, and then add all the new ones with DataFrame.reindex:

df = pd.DataFrame({0:['Hello World!', 'Hello Mars!']})
new_col_names = ['a','b','c','d','e']

df1 = (df.rename(columns=dict(zip(df.columns, new_col_names)))
        .reindex(new_col_names, axis=1, fill_value=''))
print (df1)
              a b c d e
0  Hello World!        
1   Hello Mars!      


df1 = (df.rename(columns=dict(zip(df.columns, new_col_names)))
         .reindex(new_col_names, axis=1))
print (df1)
              a   b   c   d   e
0  Hello World! NaN NaN NaN NaN
1   Hello Mars! NaN NaN NaN NaN
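To check the rename-then-reindex approach against all three cases from the question, it can be wrapped in a small function. This is only a sketch: the name rename_and_pad and the fill_value parameter default are illustrative assumptions, and it relies on the existing column labels being unique.

```python
import pandas as pd

def rename_and_pad(df, new_col_names, fill_value=''):
    # Hypothetical wrapper around the rename + reindex idea above.
    # zip stops at the shorter input, so only existing columns are mapped.
    mapping = dict(zip(df.columns, new_col_names))
    return (df.rename(columns=mapping)
              .reindex(columns=new_col_names, fill_value=fill_value))

# The three input shapes described in the question.
cases = [
    (pd.DataFrame({0: ['Hello World!']}), ['a', 'b', 'c', 'd']),
    (pd.DataFrame({0: ['Hello World!'], 1: ['Hello Mars!']}), ['a', 'b', 'c', 'd']),
    (pd.DataFrame({0: ['Hello World!', 'Hello Mars!']}), ['a', 'b', 'c', 'd', 'e']),
]

results = [rename_and_pad(df, names) for df, names in cases]
for res in results:
    print(res, end='\n\n')
```

Neither rename nor reindex modifies the input here, so the original dataframes keep their old column labels.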

Upvotes: 3

PRANJAL BIYANI

Reputation: 61

This might help you do it all at once.

Use your old DataFrame to create another DataFrame with the pd.DataFrame() constructor, and add the new columns through the columns parameter by list concatenation.

Note: this adds the new columns with the same index length but filled with NaN values; a workaround is to call df.fillna('') on the result.

pd.DataFrame(df.to_dict(), columns=list(df.columns) + ['b', 'c'])
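The one-liner above keeps the old column labels, so for the case in the question a rename step is still needed first. A rough sketch of the full sequence, assuming the example data from the question (the variable names renamed and result are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({0: ['Hello World!'], 1: ['Hello Mars!']})
new_col_names = ['a', 'b', 'c', 'd']

# Rename the existing columns positionally on a copy.
renamed = df.copy()
renamed.columns = new_col_names[:len(df.columns)]

# Rebuild the frame with the full column list; names missing from the
# dict become NaN columns, which fillna('') turns into empty strings.
result = pd.DataFrame(renamed.to_dict(), columns=new_col_names).fillna('')
print(result)
```

Be aware that fillna('') would also replace any NaN values already present in the original data, which may or may not be wanted.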

Hope this helps! Cheers!

Upvotes: 1
