justanewb
justanewb

Reputation: 133

Merging and stacking dataframes together

Say I have one dataframe that looks like this

d1 = {'Location+Type':['Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500'],
'Year':['2009','2009','2009'],
'state':['Alabama','Alabama','Alabama'],
'Census_tract':[3,3,3],
'County_name':['Jefferson County','Jefferson County','Jefferson County'],
'A':[1,2,3],
'B':[4,5,6],
'C':[7,8,9],
}
df1 = pd.DataFrame(data=d1)

and I have another dataframe that looks like this

d2 = {'Location+Type':['Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500'],
'Year':['2010','2010','2010'],
'state':['Alabama','Alabama','Alabama'],
'Census_tract':[3,3,3],
'County_name':['Jefferson County','Jefferson County','Jefferson County'],
'A':[11,22,33],
'B':[44,55,66],
'C':[77,88,99],
'D': [10,20,30]
}
df2 = pd.DataFrame(data=d2)

I want to merge these dataframes together such that I would have this dataframe

d3 = {'Location+Type':['Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500', 'Census Tract 3, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000300', 'Census Tract 4, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000400','Census Tract 5, Jefferson County, Alabama: Summary level: 140, state:01> county:073> tract:000500'],
'Year':['2009','2009','2009','2010','2010','2010'],
'state':['Alabama','Alabama','Alabama','Alabama','Alabama','Alabama'],
'Census_tract':[3,3,3,3,3,3],
'County_name':['Jefferson County','Jefferson County','Jefferson County', 'Jefferson County','Jefferson County','Jefferson County']
'A':[1,2,3,11,22,33],
'B':[4,5,6,44,55,66],
'C':[7,8,9,77,88,99],
'D':[None,None,None,10,20,30]
}

df_final = pd.DataFrame(data=d3)

I know there are different join operations but this one is sort of a join and stack. Thus I am not sure how to achieve the latter result. Any suggestions are greatly appreciated.

Upvotes: 1

Views: 55

Answers (1)

Andrej Kesely
Andrej Kesely

Reputation: 195418

You can try pd.concat:

df = pd.concat([df1, df2]).drop(columns=["County_name"]).reset_index(drop=True)
print(df)

Prints:

                                       Location+Type  Year    state  Census_tract   A   B   C     D
0  Census Tract 3, Jefferson County, Alabama: Sum...  2009  Alabama             3   1   4   7   NaN
1  Census Tract 4, Jefferson County, Alabama: Sum...  2009  Alabama             3   2   5   8   NaN
2  Census Tract 5, Jefferson County, Alabama: Sum...  2009  Alabama             3   3   6   9   NaN
3  Census Tract 3, Jefferson County, Alabama: Sum...  2009  Alabama             3  11  44  77  10.0
4  Census Tract 4, Jefferson County, Alabama: Sum...  2009  Alabama             3  22  55  88  20.0
5  Census Tract 5, Jefferson County, Alabama: Sum...  2009  Alabama             3  33  66  99  30.0

Upvotes: 1

Related Questions