Reputation: 479
I have a pandas dataframe that looks like this:
import pandas as pd
d = {'date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08', '2021-01-09', '2021-01-10']}
df = pd.DataFrame(data=d)
df
date
0 2021-01-01
1 2021-01-02
2 2021-01-03
3 2021-01-04
4 2021-01-05
5 2021-01-06
6 2021-01-07
7 2021-01-08
8 2021-01-09
9 2021-01-10
I would to add a new column out
to this df to indicate when a wedding started with the value 0. The rows after the start date should increment while the rows before the start date should decrement. For example, if the wedding started on '2021-01-05', my desired output is this:
date out
0 2021-01-01 -5
1 2021-01-02 -4
2 2021-01-03 -3
3 2021-01-04 -2
4 2021-01-05 -1
5 2021-01-06 0
6 2021-01-07 1
7 2021-01-08 2
8 2021-01-09 3
9 2021-01-10 4
Upvotes: 0
Views: 74
Reputation: 30050
If your date
column doesn't duplicate, you can try
df['time'] = range(len(df))
df['time'] = df['time'] - df.set_index('date').loc['2021-01-06', 'time']
print(df)
date time
0 2021-01-01 -5
1 2021-01-02 -4
2 2021-01-03 -3
3 2021-01-04 -2
4 2021-01-05 -1
5 2021-01-06 0
6 2021-01-07 1
7 2021-01-08 2
8 2021-01-09 3
9 2021-01-10 4
Or
df['time'] = df.index.values - df['date'].tolist().index('2021-01-06')
Upvotes: 1
Reputation: 323356
Doing
df['out'] = (pd.to_datetime(df.date) - pd.to_datetime('2021-01-06')).dt.days
Out[20]:
0 -5
1 -4
2 -3
3 -2
4 -1
5 0
6 1
7 2
8 3
9 4
Name: date, dtype: int64
Upvotes: 3
Reputation: 3046
You can use cumcount() to get the expected output posted
import pandas as pd
d = {'date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08', '2021-01-09', '2021-01-10']}
df = pd.DataFrame(data=d)
df['Control'] = 1
date_lookback_location = df.loc[df['date'] == '2021-01-06'].index.tolist()[0]
df['time'] = df.sort_values(['date'], ascending=True).groupby(['Control']).cumcount() -date_lookback_location
df[['date', 'time']]
Upvotes: 0