Reputation: 809
I have the following code below:
universitytowns = pd.merge(houses,unitowns,how='inner',on=['State','RegionName'])
However, my output is:
Empty DataFrame
Columns: [State, RegionName, 2000q1, 2000q2, 2000q3, 2000q4, 2001q1, 2001q2, 2001q3, 2001q4, 2002q1, 2002q2, 2002q3, 2002q4, 2003q1, 2003q2, 2003q3, 2003q4, 2004q1, 2004q2, 2004q3, 2004q4, 2005q1, 2005q2, 2005q3, 2005q4, 2006q1, 2006q2, 2006q3, 2006q4, 2007q1, 2007q2, 2007q3, 2007q4, 2008q1, 2008q2, 2008q3, 2008q4, 2009q1, 2009q2, 2009q3, 2009q4, 2010q1, 2010q2, 2010q3, 2010q4, 2011q1, 2011q2, 2011q3, 2011q4, 2012q1, 2012q2, 2012q3, 2012q4, 2013q1, 2013q2, 2013q3, 2013q4, 2014q1, 2014q2, 2014q3, 2014q4, 2015q1, 2015q2, 2015q3, 2015q4, 2016q1, 2016q2, 2016q3, difference]
Index: []
[0 rows x 70 columns]
I am confused as to why this is...
The dataframe houses looks like this:
State RegionName 2000q1 2000q2 2000q3 \
0 New York New York NaN NaN NaN
1 California Los Angeles 207066.666667 214466.666667 220966.666667
2 Illinois Chicago 138400.000000 143633.333333 147866.666667
3 Pennsylvania Philadelphia 53000.000000 53633.333333 54133.333333
4 Arizona Phoenix 111833.333333 114366.666667 116000.000000
2000q4 2001q1 2001q2 2001q3 2001q4 \
0 NaN NaN NaN NaN NaN
1 226166.666667 233000.000000 239100.000000 245066.666667 253033.333333
2 152133.333333 156933.333333 161800.000000 166400.000000 170433.333333
3 54700.000000 55333.333333 55533.333333 56266.666667 57533.333333
4 117400.000000 119600.000000 121566.666667 122700.000000 124300.000000
... 2014q3 2014q4 2015q1 2015q2 \
0 ... 522800.000000 528066.666667 532266.666667 540800.000000
1 ... 509066.666667 518866.666667 528800.000000 538166.666667
2 ... 195766.666667 201266.666667 201066.666667 206033.333333
3 ... 115300.000000 115666.666667 116200.000000 117966.666667
4 ... 165366.666667 168500.000000 171533.333333 174166.666667
2015q3 2015q4 2016q1 2016q2 2016q3 \
0 557200.000000 572833.333333 582866.666667 591633.333333 587200.0
1 547266.666667 557733.333333 566033.333333 577466.666667 584050.0
2 208300.000000 207900.000000 206066.666667 208200.000000 212000.0
3 121233.333333 122200.000000 123433.333333 126933.333333 128700.0
4 179066.666667 183833.333333 187900.000000 191433.333333 195200.0
difference
0 33933.333333
1 55600.000000
2 12300.000000
3 766.666667
4 25533.333333
The dataframe unitowns looks like this:
[5 rows x 70 columns]
State RegionName
0 Alabama Auburn
1 Alabama Florence
2 Alabama Jacksonville
3 Alabama Livingston
4 Alabama Montevallo
Would anybody be able to give me a helping hand?
Upvotes: 0
Views: 42
Reputation: 7614
It's working for me. Are you sure the 2 dataframes have a common state
and RegionName
. I just modified the unitowns
dataframe to include New York and New York and it worked, or there might be some extra characters, space, etc.
universitytowns = pd.merge(houses,unitowns,how='inner',on=['State','RegionName'])
print(universitytowns)
State RegionName 2000q1 2000q2 2000q3
0 New York New York NaN NaN NaN
Upvotes: 1