Reputation: 323
I have historical ticker data in one dataframe, and historical dividend data in another dataframe. They look like this:
Date High Low Open Close Volume Adj Close
0 1993-01-29 43.968750 43.750000 43.968750 43.937500 1003200.0 26.453930
1 1993-02-01 44.250000 43.968750 43.968750 44.250000 480500.0 26.642057
2 1993-02-02 44.375000 44.125000 44.218750 44.343750 201300.0 26.698507
3 1993-02-03 44.843750 44.375000 44.406250 44.812500 529400.0 26.980742
4 1993-02-04 45.093750 44.468750 44.968750 45.000000 531500.0 27.093624
... ... ... ... ... ... ... ...
6809 2020-02-13 338.119995 335.559998 335.859985 337.059998 54501900.0 337.059998
6810 2020-02-14 337.730011 336.200012 337.510010 337.600006 64582200.0 337.600006
6811 2020-02-18 337.670013 335.209991 336.510010 336.730011 57226200.0 336.730011
6812 2020-02-19 339.079987 337.480011 337.790009 338.339996 48639100.0 338.339996
6813 2020-02-20 338.640015 333.681702 337.742004 336.894989 63193703.0 336.894989
action value
2019-12-20 DIVIDEND 1.570
2019-09-20 DIVIDEND 1.384
2019-06-21 DIVIDEND 1.432
2019-03-15 DIVIDEND 1.233
2018-12-21 DIVIDEND 1.435
... ... ...
1994-03-18 DIVIDEND 0.271
1993-12-17 DIVIDEND 0.317
1993-09-17 DIVIDEND 0.286
1993-06-18 DIVIDEND 0.318
1993-03-19 DIVIDEND 0.213
Date High Low Open Close Volume Adj Close Dividend
0 1993-01-29 43.968750 43.750000 43.968750 43.937500 1003200.0 26.453930
1 1993-02-01 44.250000 43.968750 43.968750 44.250000 480500.0 26.642057
2 1993-02-02 44.375000 44.125000 44.218750 44.343750 201300.0 26.698507
3 1993-02-03 44.843750 44.375000 44.406250 44.812500 529400.0 26.980742
4 1993-02-04 45.093750 44.468750 44.968750 45.000000 531500.0 27.093624
... ... ... ... ... ... ... ...
6809 2020-02-13 338.119995 335.559998 335.859985 337.059998 54501900.0 337.059998 1.570
6810 2020-02-14 337.730011 336.200012 337.510010 337.600006 64582200.0 337.600006 1.570
6811 2020-02-18 337.670013 335.209991 336.510010 336.730011 57226200.0 336.730011 1.570
6812 2020-02-19 339.079987 337.480011 337.790009 338.339996 48639100.0 338.339996 1.570
6813 2020-02-20 338.640015 333.681702 337.742004 336.894989 63193703.0 336.894989 1.570
(I have no dividend data before 1993-03-19, but I do onward.)
Basically, I want to merge them so that the dividend column is continuous with the dates in the historical dataframe.
Upvotes: 0
Views: 744
Reputation: 1
price_table = price_table.merge(dividend_table, left_on = 'Date', right_on ='Date',how = 'outer')
price.value = price.value.fillna('')
If the dates are not in the same format convert using:
price_table['Date'] = pd.to_datetime(price_table['Date'])
Upvotes: 0
Reputation: 13
You can join, then fill missing values and finally rename columns.
Supose your dataframes are named market_prices
and dividend_data
.
You can do this:
import pandas as pd
# Make sure the keys share type
market_prices['Date'] = pd.to_datetime(market_prices['Date'])
dividend_data.index = pd.to_datetime(dividend_data.index)
# Join the data
prices_with_dividends = market_prices.join(
dividend_data['value'], how='left', on='Date'
)
# Fill missing values with previous data
prices_with_dividends['value'].fillna(method='ffill', inplace=True)
# Rename column
prices_with_dividends.rename({'value': 'Dividend'}, axis=1, inplace=True)
That should do.
Upvotes: 1
Reputation: 323
Interpolated from @fjsh's answer.
# Join the data
prices_with_dividends = features.set_index('Date').join(dividend_data.set_index('Date'))
# Fill missing values with previous data
prices_with_dividends['Value'].fillna(method='ffill', inplace=True)
# Rename column
prices_with_dividends.rename({'Value': 'Dividend'}, axis=1, inplace=True)
Upvotes: 0
Reputation: 13377
Assuming df
is the data frame with historical ticker data, and df2
is the one with dividend information.
#prerequisite:
df["Date"]=pd.to_datetime(df["Date"])
df=df.set_index("Date")
df["Dividend"]=df2["value"]
df=df.reset_index()
Upvotes: 1