Reputation: 87
I have two data frames, i want to join them so that i could check the quantity of the that week in every year in a single in a single data frame.
df1= City Week qty Year
hyd 35 10 2015
hyd 36 15 2015
hyd 37 11 2015
hyd 42 10 2015
hyd 23 10 2016
hyd 32 15 2016
hyd 37 11 2017
hyd 42 10 2017
pune 35 10 2015
pune 36 15 2015
pune 37 11 2015
pune 42 10 2015
pune 23 10 2016
pune 32 15 2016
pune 37 11 2017
pune 42 10 2017
df2= city Week qty Year
hyd 23 10 2015
hyd 32 15 2015
hyd 35 12 2016
hyd 36 15 2016
hyd 37 11 2016
hyd 42 10 2016
hyd 43 12 2016
hyd 44 18 2016
hyd 35 11 2017
hyd 36 15 2017
hyd 37 11 2017
hyd 42 10 2017
hyd 51 14 2017
hyd 52 17 2017
pune 35 12 2016
pune 36 15 2016
pune 37 11 2016
pune 42 10 2016
pune 43 12 2016
pune 44 18 2016
pune 35 11 2017
pune 36 15 2017
pune 37 11 2017
pune 42 10 2017
pune 51 14 2017
pune 52 17 2017
I want to join two data frames as shown in the result, i want to append the quantity of the that week in every year for each city in a single data frame.
city Week qty Year y2016_wk qty y2017_wk qty y2015_week qty
hyd 35 10 2015 2016_35 12 2017_35 11 nan nan
hyd 36 15 2015 2016_36 15 2017_36 15 nan nan
hyd 37 11 2015 2016_37 11 2017_37 11 nan nan
hyd 42 10 2015 2016_42 10 2017_42 10 nan nan
hyd 23 10 2016 nan nan 2017_23 x 2015_23 10
hyd 32 15 2016 nan nan 2017_32 y 2015_32 15
hyd 37 11 2017 2016_37 11 nan nan 2015_37 x
hyd 42 10 2017 2016_42 10 nan nan 2015_42 y
pune 35 10 2015 2016_35 12 2017_35 11 nan nan
pune 36 15 2015 2016_36 15 2017_36 15 nan nan
pune 37 11 2015 2016_37 11 2017_37 11 nan nan
pune 42 10 2015 2016_42 10 2017_42 10 nan nan
Upvotes: 1
Views: 70
Reputation: 1747
Personally I don't think your example output is that readable, so unless you need that format for a specific reason I might consider using a pivot table. I also think the code required is cleaner.
import pandas as pd
df3 = pd.concat([df1, df2], ignore_index=True)
df4 = df3.pivot(index='Week', columns='Year', values='qty')
print(df4)
Year 2015 2016 2017
Week
35 10.0 12.0 11.0
36 15.0 15.0 15.0
37 11.0 11.0 11.0
42 10.0 10.0 10.0
43 NaN 12.0 NaN
44 NaN 18.0 NaN
51 NaN NaN 14.0
52 NaN NaN 17.0
Upvotes: 1
Reputation: 164823
You can break down your task into a few steps:
df1
and df2
.Week
.axis=1
and reset_index
.Here is an example:
df = pd.concat([df1, df2], ignore_index=True)
dfs = [df[df['Year'] == y].rename(columns=lambda x: x+'_'+str(y) if x != 'Week' else x)\
.set_index('Week') for y in df['Year'].unique()]
res = pd.concat(dfs, axis=1).reset_index()
Result:
print(res)
Week qty_2015 Year_2015 qty_2016 Year_2016 qty_2017 Year_2017
0 35 10.0 2015.0 12.0 2016.0 11.0 2017.0
1 36 15.0 2015.0 15.0 2016.0 15.0 2017.0
2 37 11.0 2015.0 11.0 2016.0 11.0 2017.0
3 42 10.0 2015.0 10.0 2016.0 10.0 2017.0
4 43 NaN NaN 12.0 2016.0 NaN NaN
5 44 NaN NaN 18.0 2016.0 NaN NaN
6 51 NaN NaN NaN NaN 14.0 2017.0
7 52 NaN NaN NaN NaN 17.0 2017.0
Upvotes: 1