Reputation: 521
I have 2 dataframes:
df.head()
Out[2]:
Unnamed: 0 Symbol Date Close
0 4061 A 2016-01-13 36.515889
1 4062 A 2016-01-14 36.351784
2 4063 A 2016-01-15 36.351784
3 4064 A 2016-01-19 36.590483
4 4065 A 2016-01-20 35.934062
and
dfw.head()
Out[3]:
Symbol Weight
0 A (0.000002)
1 AA 0.000112
2 AAC (0.000004)
3 AAL 0.000006
4 AAMC 0.000002
ISSUE: Not every symbol if df will have a weight in dfw. If it does not I want to drop it from my new dataframe (all dates of it). If the symbol is in dfw I want to merge the weight in with df so that each row has symbol, date, close and weight. I have tried the following but get NaN values. I also am not sure how to remove all symbols with no weights even if I was successful.
dfall = df.merge(dfw, on='Symbol', how='left')
dfall.head()
Out[14]:
Unnamed: 0 Symbol Date Close Weight
0 4061 A 2016-01-13 36.515889 NaN
1 4062 A 2016-01-14 36.351784 NaN
2 4063 A 2016-01-15 36.351784 NaN
3 4064 A 2016-01-19 36.590483 NaN
4 4065 A 2016-01-20 35.934062 NaN
Upvotes: 1
Views: 100
Reputation: 109726
df_all = df[df.Symbol.isin(dfw.Symbol.unique())].merge(dfw, how='left', on='Symbol')
I am not sure why you are getting NaN values. Perhaps you have spaces in you your symbols? You can clean them via: dfw['Symbol'] = dfw.Symbol.str.strip()
You would need to do the same for df
.
>>> df_all
Unnamed: 0 Symbol Date Close Weight
0 4061 A 2016-01-13 36.515889 (0.000002)
1 4062 A 2016-01-14 36.351784 (0.000002)
2 4063 A 2016-01-15 36.351784 (0.000002)
3 4064 A 2016-01-19 36.590483 (0.000002)
4 4065 A 2016-01-20 35.934062 (0.000002)
Upvotes: 3