Bing

Reputation: 9

How do you convert start and end date records into timestamps?

For example (input pandas dataframe):

start_date   end_date  value
0 2018-05-17 2018-05-20      4
1 2018-05-22 2018-05-27     12
2 2018-05-14 2018-05-21      8

I want to divide each value by the number of days in its interval (e.g. 2018-05-22 to 2018-05-27 has 6 days, so 12 / 6 = 2), add up the per-day shares where intervals overlap, and then create time series data like the following:

date  value
0  2018-05-14      1
1  2018-05-15      1
2  2018-05-16      1
3  2018-05-17      2
4  2018-05-18      2
5  2018-05-19      2
6  2018-05-20      2
7  2018-05-21      1
8  2018-05-22      2
9  2018-05-23      2
10 2018-05-24      2
11 2018-05-25      2
12 2018-05-26      2
13 2018-05-27      2

Is it possible to do this in pandas without an inefficient loop through every row? Is there also a name for this method?

Upvotes: 1

Answers (1)

jezrael

Reputation: 862521

You can use:

import pandas as pd

# convert to datetimes if necessary
df['start_date'] = pd.to_datetime(df['start_date'])
df['end_date'] = pd.to_datetime(df['end_date'])

For each row, build a Series that repeats the value over a date_range from start_date to end_date, divide each Series by its length, then concatenate them and aggregate with groupby and sum:

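# one Series per row: the row's value repeated for every day in its interval
dfs = [pd.Series(r.value, pd.date_range(r.start_date, r.end_date)) for r in df.itertuples()]
# split each value evenly across its days, then sum the shares per date
df = (pd.concat([x / len(x) for x in dfs])
        .groupby(level=0)
        .sum()
        .rename_axis('date')
        .reset_index(name='val'))
print(df)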
         date  val
0  2018-05-14  1.0
1  2018-05-15  1.0
2  2018-05-16  1.0
3  2018-05-17  2.0
4  2018-05-18  2.0
5  2018-05-19  2.0
6  2018-05-20  2.0
7  2018-05-21  1.0
8  2018-05-22  2.0
9  2018-05-23  2.0
10 2018-05-24  2.0
11 2018-05-25  2.0
12 2018-05-26  2.0
13 2018-05-27  2.0
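
The list comprehension above still visits every row in Python. If that matters for a large frame, the same result can probably be reached by repeating each row once per day and using a running counter as the day offset. The following is only a sketch under a few assumptions: intervals are daily and include both endpoints, and the names n_days, share, rep, offset and out are made up for illustration rather than taken from the answer.

```python
import pandas as pd

# sample data from the question
df = pd.DataFrame({
    "start_date": pd.to_datetime(["2018-05-17", "2018-05-22", "2018-05-14"]),
    "end_date": pd.to_datetime(["2018-05-20", "2018-05-27", "2018-05-14"]) ,
    "value": [4, 12, 8],
})
df.loc[2, "end_date"] = pd.Timestamp("2018-05-21")

# inclusive number of days in each interval, and the per-day share of the value
df["n_days"] = (df["end_date"] - df["start_date"]).dt.days + 1
df["share"] = df["value"] / df["n_days"]

# repeat each row once per day it covers
rep = df.loc[df.index.repeat(df["n_days"])]

# 0, 1, 2, ... within each original row, used as a day offset from start_date
offset = rep.groupby(level=0).cumcount()
dates = rep["start_date"] + pd.to_timedelta(offset, unit="D")

# sum the shares of all intervals that cover the same date
out = (pd.DataFrame({"date": dates.to_numpy(), "val": rep["share"].to_numpy()})
         .groupby("date", as_index=False)["val"]
         .sum())
print(out)
```

This trades the per-row Series construction for one larger intermediate frame (one row per covered day), so memory grows with the total number of days across all intervals.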

Upvotes: 1
