Alok
Alok

Reputation: 87

How to concat two dataframes in python

I have two data frames, i want to join them so that i could check the quantity of the that week in every year in a single in a single data frame.

df1=  City  Week        qty       Year
      hyd   35          10        2015
      hyd   36          15        2015
      hyd   37          11        2015
      hyd   42          10        2015
      hyd   23          10        2016
      hyd   32          15        2016
      hyd   37          11        2017
      hyd   42          10        2017
      pune  35          10        2015
      pune  36          15        2015
      pune  37          11        2015
      pune  42          10        2015
      pune  23          10        2016
      pune  32          15        2016
      pune  37          11        2017
      pune  42          10        2017

 df2= city  Week         qty       Year
      hyd   23          10        2015
      hyd   32          15        2015
      hyd   35          12        2016
      hyd   36          15        2016
      hyd   37          11        2016
      hyd   42          10        2016
      hyd   43          12        2016
      hyd   44          18        2016
      hyd   35          11        2017
      hyd   36          15        2017
      hyd   37          11        2017
      hyd   42          10        2017
      hyd   51          14        2017
      hyd   52          17        2017
      pune  35          12        2016
      pune  36          15        2016
      pune  37          11        2016
      pune  42          10        2016
      pune  43          12        2016
      pune  44          18        2016
      pune  35          11        2017
      pune  36          15        2017
      pune  37          11        2017
      pune  42          10        2017
      pune  51          14        2017
      pune  52          17        2017

I want to join two data frames as shown in the result, i want to append the quantity of the that week in every year for each city in a single data frame.

    city   Week  qty   Year   y2016_wk  qty    y2017_wk  qty y2015_week qty
     hyd    35   10    2015    2016_35    12     2017_35   11   nan      nan
     hyd    36   15    2015    2016_36    15     2017_36   15   nan      nan
     hyd    37   11    2015    2016_37    11     2017_37   11   nan      nan
     hyd    42   10    2015    2016_42    10     2017_42   10   nan      nan
     hyd    23   10    2016    nan        nan    2017_23   x    2015_23   10
     hyd    32   15    2016    nan        nan    2017_32   y    2015_32   15
     hyd    37   11    2017    2016_37    11       nan     nan 2015_37   x
     hyd    42   10    2017    2016_42    10       nan     nan 2015_42   y
     pune   35   10    2015    2016_35    12     2017_35     11  nan     nan 
     pune   36   15    2015    2016_36    15     2017_36     15  nan     nan
     pune   37   11    2015    2016_37    11     2017_37     11  nan     nan
     pune   42   10    2015    2016_42    10     2017_42     10  nan     nan

Upvotes: 1

Views: 70

Answers (2)

PyPingu
PyPingu

Reputation: 1747

Personally I don't think your example output is that readable, so unless you need that format for a specific reason I might consider using a pivot table. I also think the code required is cleaner.

import pandas as pd 

df3 = pd.concat([df1, df2], ignore_index=True)
df4 = df3.pivot(index='Week', columns='Year', values='qty')

print(df4)

Year  2015  2016  2017
Week                  
35    10.0  12.0  11.0
36    15.0  15.0  15.0
37    11.0  11.0  11.0
42    10.0  10.0  10.0
43     NaN  12.0   NaN
44     NaN  18.0   NaN
51     NaN   NaN  14.0
52     NaN   NaN  17.0

Upvotes: 1

jpp
jpp

Reputation: 164823

You can break down your task into a few steps:

  1. Combine your dataframes df1 and df2.
  2. Create a list of dataframes from your combined dataframe, splitting by year.
  3. At the same time, rename columns to reflect year, set index to Week.
  4. Finally, concatenate along axis=1 and reset_index.

Here is an example:

df = pd.concat([df1, df2], ignore_index=True)

dfs = [df[df['Year'] == y].rename(columns=lambda x: x+'_'+str(y) if x != 'Week' else x)\
                          .set_index('Week') for y in df['Year'].unique()]

res = pd.concat(dfs, axis=1).reset_index()

Result:

print(res)

   Week  qty_2015  Year_2015  qty_2016  Year_2016  qty_2017  Year_2017
0    35      10.0     2015.0      12.0     2016.0      11.0     2017.0
1    36      15.0     2015.0      15.0     2016.0      15.0     2017.0
2    37      11.0     2015.0      11.0     2016.0      11.0     2017.0
3    42      10.0     2015.0      10.0     2016.0      10.0     2017.0
4    43       NaN        NaN      12.0     2016.0       NaN        NaN
5    44       NaN        NaN      18.0     2016.0       NaN        NaN
6    51       NaN        NaN       NaN        NaN      14.0     2017.0
7    52       NaN        NaN       NaN        NaN      17.0     2017.0

Upvotes: 1

Related Questions