Martyn Bell
Martyn Bell

Reputation: 97

add a column to pandas dataframe based on row name, if row doesnt exist add it in

I have the following data frames with df1 been my master

df1 = pd.DataFrame([['a', '10'], ['b', '20']],
               columns=['letter', 'Jan'])
df2 = pd.DataFrame([['a', '1'], ['b', '2'], ['c', '2']],
               columns=['letter', 'feb'])
df3 = pd.DataFrame([['a', '10'], ['b', '20'], ['c', '20'], ['d', '20']],
               columns=['letter', 'march'])

how can I add df2 then df3 to df1 so it looks like the following.

 Letter jan feb march
0   a     10  1   10
1   b     20  2   20
2   c         2   20
3   d             20



res = pd.concat([df1, df2,df3])

I tired this and the letter keeps getting added in which is not what i want

  letter  Jan  feb march
0      a   10  NaN   NaN
1      b   20  NaN   NaN
0      a  NaN    1   NaN
1      b  NaN    2   NaN
2      c  NaN    2   NaN
0      a  NaN  NaN    10
1      b  NaN  NaN    20
2      c  NaN  NaN    20
3      d  NaN  NaN    20
 

Any idea how i can solve this problem ? Thanks

Upvotes: 1

Views: 512

Answers (2)

Shubham Sharma
Shubham Sharma

Reputation: 71689

Use DataFrame.merge to outer merge the dataframes df1, df2 and df3 on letter then use DataFrame.fillna to fill the NaN values:

df = (
    df1.merge(df2, on='letter', how='outer')
    .merge(df3, on='letter', how='outer').fillna('')
)

OR, If having large number of dataframes which requires merging we can use more concise solution using reduce + pd.merge and pass the list of dataframes which require merging to the reduce function:

from functools import reduce

df = reduce(lambda x, y:
            pd.merge(x, y, on='letter', how='outer'), [df1, df2, df3]).fillna('')

Result:

# print(df)

  letter Jan feb march
0      a  10   1    10
1      b  20   2    20
2      c       2    20
3      d            20

Upvotes: 1

Umer Rana
Umer Rana

Reputation: 178

Here's the solution of your problem.

df_outer = pd.merge(df1, df2, on='letter', how='outer')
df_outer = pd.merge(df_outer, df3, on='letter', how='outer')
df_outer = df_outer.fillna('')
df_outer

Upvotes: 2

Related Questions