Reputation: 1667
I have a dataframe that holds quarterly earnings dates for a number stocks, looking like this:
df = pd.DataFrame( {
'ticker': ['AAPL', 'AAPL', 'AAPL','AAPL', 'MSFT', 'MSFT','MSFT', 'MSFT'],
'datetime': ['2015-01-01', '2015-04-01', '2015-07-01', '2015-12-01', '2015-01-01', '2015-04-01', '2015-07-01', '2015-12-01'],
})
df['datetime'] = pd.to_datetime(df['datetime'])
I would like to create an additional n rows before and after each date, to make it look like this (for n=1):
df = pd.DataFrame( {
'ticker': ['AAPL', 'AAPL', 'AAPL','AAPL','AAPL', 'AAPL', 'AAPL','AAPL','AAPL', 'AAPL', 'AAPL','AAPL', 'MSFT', 'MSFT','MSFT', 'MSFT','MSFT', 'MSFT','MSFT', 'MSFT','MSFT', 'MSFT','MSFT', 'MSFT'],
'datetime': ['2014-12-31','2015-01-01','2015-01-02','2015-03-30','2015-04-01', '2015-04-02', '2015-06-30','2015-07-01','2015-07-02', '2015-11-30','2015-12-01', '2015-12-02',
'2014-12-31','2015-01-01','2015-01-02','2015-03-30','2015-04-01', '2015-04-02', '2015-06-30','2015-07-01','2015-07-02', '2015-11-30','2015-12-01', '2015-12-02'],
})
df['datetime'] = pd.to_datetime(df['datetime'])
Upvotes: 1
Views: 49
Reputation: 195653
Substract and add to each date one day and then .explode()
:
t = pd.Timedelta(days=1)
df["datetime"] = df["datetime"].apply(lambda x: [x - t, x, x + t])
df = df.explode("datetime").reset_index(drop=True)
print(df)
Prints:
ticker datetime
0 AAPL 2014-12-31
1 AAPL 2015-01-01
2 AAPL 2015-01-02
3 AAPL 2015-03-31
4 AAPL 2015-04-01
5 AAPL 2015-04-02
6 AAPL 2015-06-30
7 AAPL 2015-07-01
8 AAPL 2015-07-02
9 AAPL 2015-11-30
10 AAPL 2015-12-01
11 AAPL 2015-12-02
12 MSFT 2014-12-31
13 MSFT 2015-01-01
14 MSFT 2015-01-02
15 MSFT 2015-03-31
16 MSFT 2015-04-01
17 MSFT 2015-04-02
18 MSFT 2015-06-30
19 MSFT 2015-07-01
20 MSFT 2015-07-02
21 MSFT 2015-11-30
22 MSFT 2015-12-01
23 MSFT 2015-12-02
EDIT: For n > 1:
n = 2
df["datetime"] = df["datetime"].apply(
lambda x: [x + pd.Timedelta(days=d) for d in range(-n, n)]
)
df = df.explode("datetime").reset_index(drop=True)
print(df)
Prints:
ticker datetime
0 AAPL 2014-12-30
1 AAPL 2014-12-31
2 AAPL 2015-01-01
3 AAPL 2015-01-02
4 AAPL 2015-03-30
5 AAPL 2015-03-31
...
Upvotes: 2
Reputation: 29635
you can use DateOffset
and concat
like:
n = 1 # nb of rows to add
res = (pd.concat([df.assign(datetime = df['datetime'] + pd.DateOffset(days=i))
for i in range(-n, n+1)])
.sort_values(['ticker','datetime'])
.reset_index(drop=True)
)
print(res.head(10))
ticker datetime
0 AAPL 2014-12-31
1 AAPL 2015-01-01
2 AAPL 2015-01-02
3 AAPL 2015-03-31
4 AAPL 2015-04-01
5 AAPL 2015-04-02
6 AAPL 2015-06-30
7 AAPL 2015-07-01
8 AAPL 2015-07-02
9 AAPL 2015-11-30
Upvotes: 2