Reputation: 129
I have a spreadsheet with the following content:
start_date | end_date | rating
-----------|----------|--------
01/01/2010 |01/01/2012| 2
01/12/2012 |01/01/2014| 4
-----------|----------|--------
I would like to split each period into years. For the example above, I would like the following result:
year | rating
-----|--------
2010 | 2
2011 | 2
2012 | 2
2012 | 4
2013 | 4
2014 | 4
-----|--------
Is it possible to do this with pandas?
Upvotes: 1
Views: 50
Reputation: 294288
Assuming the dates are strings in the same format, each ending in a four-digit year:
pd.DataFrame(
    [(y, r) for s, e, r in zip(*map(df.get, df))
     for y in range(int(s[-4:]), int(e[-4:]) + 1)],
    columns=['year', 'rating']
)
   year  rating
0  2010       2
1  2011       2
2  2012       2
3  2012       4
4  2013       4
5  2014       4
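The `zip(*map(df.get, df))` expression can look opaque. Iterating a DataFrame yields its column names, `df.get` fetches each column, and `zip(*...)` transposes the columns into row tuples. A minimal sketch, assuming the question's data is rebuilt by hand as `df` with string dates (this construction is an assumption, not part of the original post):

```python
import pandas as pd

# Hypothetical reconstruction of the question's spreadsheet.
df = pd.DataFrame({
    "start_date": ["01/01/2010", "01/12/2012"],
    "end_date": ["01/01/2012", "01/01/2014"],
    "rating": [2, 4],
})

# Iterating df yields column names; df.get returns each column as a Series.
columns = list(map(df.get, df))

# zip(*columns) transposes the three columns into (start, end, rating) rows.
rows = list(zip(*columns))

# itertuples(index=False, name=None) produces the same plain tuples.
same_rows = list(df.itertuples(index=False, name=None))
```

Either form works in the list comprehension; `itertuples` may be easier to read if the column order is not obvious.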
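A similar approach using an f-string and pd.date_range:
pd.DataFrame(
    [(y, r) for s, e, r in zip(*map(df.get, df))
     # extend the end to Dec 31 so the final year-end falls inside the range;
     # newer pandas versions spell this year-end alias 'YE'
     for y in pd.date_range(s, f"12/31/{e[-4:]}", freq='Y').year],
    columns=['year', 'rating']
)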
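If the dates are parsed into real datetimes first, the same expansion can be written without string slicing. The following is a sketch of a swapped-in technique, `DataFrame.explode`, not the method used above. It assumes a pandas version that provides `explode`, a hand-built `df`, and day-first dates (`%d/%m/%Y`); the date format is a guess, though only the year matters here.

```python
import pandas as pd

# Hypothetical reconstruction of the question's spreadsheet.
df = pd.DataFrame({
    "start_date": ["01/01/2010", "01/12/2012"],
    "end_date": ["01/01/2012", "01/01/2014"],
    "rating": [2, 4],
})

# Assumption: dates are day/month/year.
start_year = pd.to_datetime(df["start_date"], format="%d/%m/%Y").dt.year
end_year = pd.to_datetime(df["end_date"], format="%d/%m/%Y").dt.year

# One list of years per row, inclusive of both the start and end year.
years = pd.Series(
    [list(range(s, e + 1)) for s, e in zip(start_year, end_year)],
    index=df.index,
)

# explode turns each list element into its own row, repeating the rating.
out = (
    df.assign(year=years)
      .explode("year")
      .loc[:, ["year", "rating"]]
      .reset_index(drop=True)
)
out["year"] = out["year"].astype(int)  # explode leaves an object column
```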
Upvotes: 2
Reputation: 323276
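You can do this with date_range: create the years within each range, then rebuild the dataframe with repeat. Note that freq='YS' only yields years whose January 1 falls inside the range, so a period that starts partway through a year (the second row here) does not produce its starting year.
yearrange = [pd.date_range(x, y, freq='YS').year.values.tolist()
             for x, y in zip(df['start_date'], df['end_date'])]
pd.DataFrame({'year': sum(yearrange, []),
              'rating': df.rating.repeat(list(map(len, yearrange)))})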
Out[357]:
   rating  year
0       2  2010
0       2  2011
0       2  2012
1       4  2013
1       4  2014
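The output above has no 2012 row for rating 4, while the question's expected result does. One possible adjustment, sketched here with `pd.period_range` instead of `date_range`, counts every calendar year a range touches. The hand-built `df` and the day-first date format (`%d/%m/%Y`) are assumptions.

```python
import pandas as pd

# Hypothetical reconstruction of the question's spreadsheet.
df = pd.DataFrame({
    "start_date": ["01/01/2010", "01/12/2012"],
    "end_date": ["01/01/2012", "01/01/2014"],
    "rating": [2, 4],
})

# Assumption: dates are day/month/year.
starts = pd.to_datetime(df["start_date"], format="%d/%m/%Y")
ends = pd.to_datetime(df["end_date"], format="%d/%m/%Y")

# Annual periods include a year even when the range only partly covers it.
yearrange = [
    pd.period_range(s, e, freq="Y").year.tolist()
    for s, e in zip(starts, ends)
]

# Flatten the years and repeat each rating once per year in its range.
result = pd.DataFrame({
    "year": sum(yearrange, []),
    "rating": df["rating"].repeat([len(y) for y in yearrange]).to_numpy(),
})
```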
Upvotes: 2