Reputation: 81
I am basically trying to do the following:
import pandas as pd
initial_df = {'county': ['REAGAN', 'UPTON', 'HARDEMAN', 'UPTON'], 'values': [508,
364, 26, 870]}
intermediate_df = {'county': ['REAGAN', 'HARDEMAN', 'UPTON'], 'fips': [48383, 47069,
48461]}
final_df = {'county': ['REAGAN', 'UPTON', 'HARDEMAN', 'UPTON'], 'fips': [48383,
48461, 47069, 48461], 'values': [508, 364, 26, 870]}
df1=pd.DataFrame(initial_df)
df2=pd.DataFrame(intermediate_df)
df3=df1.merge(df2)
However when i try to apply the same concept to my actual data, my final dataframe (df3) has no rows. im trying to assign a fips to each of my county names.
df1 = pd.read_csv('https://raw.githubusercontent.com/chessybo/Oil-Spill-map/master/Crude%20Oil%2C%20Gas%20Well%20Liquids%20or%20Associated%20Products%20(H-8)%20Loss%20Reports/all-geocodes-v2016.csv', encoding='latin-1')
df2 = pd.read_csv('https://raw.githubusercontent.com/chessybo/Oil-Spill-map/master/Crude%20Oil%2C%20Gas%20Well%20Liquids%20or%20Associated%20Products%20(H-8)%20Loss%20Reports/h8s-2018.csv')
df2['county_name'] = map(str.lower, df2['county_name'])
df1['county_name'] = map(str.lower, df1['county_name'])
df1['fips_county'] = df1['fips_county'].apply(lambda x: str(int(x)).zfill(3))
df1['fips'] = df1.apply(lambda x:'%s%s' % (x['fips_state'],x['fips_county']),axis=1)
df3=df2.merge(df1)
Upvotes: 1
Views: 2167
Reputation: 38415
The problem is with the code you are using to convert county_name to lower. Map returns an iterator and you need to store it in a data type. Plus when you are using pandas, you can simply use the pandas str methods.
df2['county_name'] = df2['county_name'].str.lower()
df1['county_name'] = df1['county_name'].str.lower()
df1['fips_county'] = df1['fips_county'].astype(str).str.zfill(3)
df1['fips'] = df1[['fips_state','fips_county']].astype(str).apply(lambda x: ''.join(x), axis=1)
df1.merge(df1)
You get
fips_state fips_county county_name fips
0 48 000 texas 48000
1 48 001 anderson 48001
2 48 003 andrews 48003
3 48 005 angelina 48005
Upvotes: 1
Reputation: 4607
use list in converting the map function values to list, if you want to merge those by 'county_name'. we cannot use the class values in merging the dataframes.
df1 = pd.read_csv('https://raw.githubusercontent.com/chessybo/Oil-Spill-map/master/Crude%20Oil%2C%20Gas%20Well%20Liquids%20or%20Associated%20Products%20(H-8)%20Loss%20Reports/all-geocodes-v2016.csv', encoding='latin-1')
df2 = pd.read_csv('https://raw.githubusercontent.com/chessybo/Oil-Spill-map/master/Crude%20Oil%2C%20Gas%20Well%20Liquids%20or%20Associated%20Products%20(H-8)%20Loss%20Reports/h8s-2018.csv')
df2['county_name'] = list(map(str.lower, df2['county_name']))
df1['county_name'] = list(map(str.lower, df1['county_name']))
df1['fips_county'] = df1['fips_county'].apply(lambda x: str(int(x)).zfill(3))
df1['fips'] = df1.apply(lambda x:'%s%s' % (x['fips_state'],x['fips_county']),axis=1)
df2.merge(df1,on=['county_name'],how='outer')
Upvotes: 0