Reputation:
How would I paste these two tables together using join/merge/concat/append/add in such a way that the population age 0-14, and 15-64 columns are side-by-side?
I don't need the cartesian product of these two DataFrames.
I've tried:
population ages = t3.merge(t4, on='Country Name', how='inner')
t3
Country Name Year Population Age 0 - 14
0 Aruba 1960 43.847771
1 Andorra 1960 NaN
2 Afghanistan 1960 43.712284
3 Angola 1960 43.759289
4 Albania 1960 41.757282
t4
Country Name Population Age 15 - 64
0 Aruba 53.667355
1 Andorra NaN
2 Afghanistan 53.834637
3 Angola 53.587101
4 Albania 52.941044
Ideally
Country Name Population Age 15 - 64 Population Ages 0 - 14
0 Aruba 53.667355 43.847771
1 Andorra NaN NaN
2 Afghanistan 53.834637 43.712284
3 Angola 53.587101 43.759289
4 Albania 52.941044 41.757282
Test results:
population_ages = t3.merge(t4, on='Country Name', how='inner')
I received a DataFrame that's a cartesian product of t3, t4 with a shape of (734832, 4) instead of (13608, 4)
Country Name Year Population Age 0 - 14 Population Age 15 - 64
0 Aruba 1960 43.847771 53.667355
1 Aruba 1960 43.847771 53.890141
2 Aruba 1960 43.847771 54.216911
3 Aruba 1960 43.847771 54.637810
4 Aruba 1960 43.847771 55.119324
5 Aruba 1960 43.847771 55.631104
6 Aruba 1960 43.847771 56.168560
7 Aruba 1960 43.847771 56.736549
8 Aruba 1960 43.847771 57.341782
9 Aruba 1960 43.847771 57.983109
10 Aruba 1960 43.847771 58.674343
11 Aruba 1960 43.847771 59.404758
12 Aruba 1960 43.847771 60.164749
Upvotes: 2
Views: 1523
Reputation: 142641
How about
t4['Population Age 0 - 14'] = t3['Population Age 0 - 14']
or
pd.concat( t4, t3['Population Age 0 - 14'], axis=1)
full working example:
import pandas as pd
from StringIO import StringIO
d1 = '''Country Name Year Population Age 0 - 14
Aruba 1960 43.847771
Andorra 1960 NaN
Afghanistan 1960 43.712284
Angola 1960 43.759289
Albania 1960 41.757282'''
d2 = '''Country Name Population Age 15 - 64
Aruba 53.667355
Andorra NaN
Afghanistan 53.834637
Angola 53.587101
Albania 52.941044'''
t3 = pd.DataFrame.from_csv( StringIO(d1), sep='\s{2,}', index_col=None )
print '\nt3:\n',t3
t4 = pd.DataFrame.from_csv( StringIO(d2), sep='\s{2,}', index_col=None )
print '\nt4:\n',t3
print '\n--- merge ---\n'
print pd.merge( t4, t3, on='Country Name')
print pd.merge( t4, t3[ ['Country Name', 'Population Age 0 - 14'] ], on='Country Name')
print '\n--- concat ---\n'
print pd.concat( (t4, t3['Population Age 0 - 14']), axis=1)
print '\n--- [xxx] = [xxx] ---\n'
t4['Population Age 0 - 14'] = t3['Population Age 0 - 14']
print t4
result:
Country Name Population Age 15 - 64 Population Age 0 - 14
0 Aruba 53.667355 43.847771
1 Andorra NaN NaN
2 Afghanistan 53.834637 43.712284
3 Angola 53.587101 43.759289
4 Albania 52.941044 41.757282
Upvotes: 1