Tess

Reputation: 129

Break periods into years

I have a spreadsheet with the following content:

start_date | end_date | rating
-----------|----------|--------
01/01/2010 |01/01/2012| 2
01/12/2012 |01/01/2014| 4
-----------|----------|--------

I would like to split each period into years. For the example above, I would like the following result:

year | rating 
-----|--------
2010 | 2
2011 | 2
2012 | 2
2012 | 4
2013 | 4
2014 | 4
-----|--------

Is it possible to do this with pandas?

Upvotes: 1

Views: 50

Answers (2)

piRSquared

Reputation: 294288

Comprehension

Assuming the dates are strings that all share the same format, with the year in the last four characters:

pd.DataFrame(
    [(y, r) for s, e, r in zip(*map(df.get, df))
     for y in range(int(s[-4:]), int(e[-4:]) + 1)],
    columns=['year', 'rating']
)

   year  rating
0  2010       2
1  2011       2
2  2012       2
3  2012       4
4  2013       4
5  2014       4

Similar, but using an f-string and pd.date_range (recent pandas releases spell the year-end alias 'YE' rather than 'Y'):

pd.DataFrame(
    [(y, r) for s, e, r in zip(*map(df.get, df))
     for y in pd.date_range(s, f"12/31/{e[-4:]}", freq='Y').year],
    columns=['year', 'rating']
)
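
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's sample, which is an assumption about how the spreadsheet was loaded, and it swaps the nested comprehension for DataFrame.explode (available from pandas 0.25), so it is a variant of the idea rather than the exact code above.

```python
import pandas as pd

# Sample data copied from the question (assumed to be loaded as strings).
df = pd.DataFrame({
    "start_date": ["01/01/2010", "01/12/2012"],
    "end_date": ["01/01/2012", "01/01/2014"],
    "rating": [2, 4],
})

# The year is taken from the last four characters of each date string.
start_year = df["start_date"].str[-4:].astype(int)
end_year = df["end_date"].str[-4:].astype(int)

result = (
    # One list of years per row, end year included.
    df.assign(year=[list(range(s, e + 1)) for s, e in zip(start_year, end_year)])
      # One output row per year in each list.
      .explode("year")
      # explode leaves an object column, so cast back to int.
      .astype({"year": int})
      .loc[:, ["year", "rating"]]
      .reset_index(drop=True)
)

print(result)
```

Slicing the year out of the string sidesteps the question of whether 01/12/2012 means 1 December or 12 January, since only the year matters here.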

Upvotes: 2

BENY

Reputation: 323276

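You can do this with date_range: create the years within each range, then rebuild the dataframe with repeat. Note that freq='YS' only yields years whose January 1 falls inside the period, so a period starting on 01/12/2012 begins at 2013.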

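# Years whose January 1 falls inside each period
yearrange = [pd.date_range(x, y, freq='YS').year.values.tolist()
             for x, y in zip(df['start_date'], df['end_date'])]
# Repeat each rating once per generated year
pd.DataFrame({'year': sum(yearrange, []),
              'rating': df.rating.repeat(list(map(len, yearrange)))})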
Out[357]: 
   rating  year
0       2  2010
0       2  2011
0       2  2012
1       4  2013
1       4  2014
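
If the start year should be kept even when a period begins after January 1, one option is to snap each start date back to January 1 of its year before calling date_range. The sketch below is only an illustration under two assumptions that are not in the original answer: the sample frame is rebuilt by hand from the question, and the dates are read as month/day/year.

```python
import pandas as pd

# Sample data copied from the question (assumed to be loaded as strings).
df = pd.DataFrame({
    "start_date": ["01/01/2010", "01/12/2012"],
    "end_date": ["01/01/2012", "01/01/2014"],
    "rating": [2, 4],
})

# Assumption: month/day/year. Swap the format string if the sheet is day/month/year.
start = pd.to_datetime(df["start_date"], format="%m/%d/%Y")
end = pd.to_datetime(df["end_date"], format="%m/%d/%Y")

# Move each start to January 1 of its year so that 'YS' includes the start year.
year_ranges = [
    pd.date_range(pd.Timestamp(year=s.year, month=1, day=1), e, freq="YS").year.tolist()
    for s, e in zip(start, end)
]

result = pd.DataFrame({
    # Flatten the per-row year lists into one column.
    "year": [y for years in year_ranges for y in years],
    # Repeat each rating once per generated year; to_numpy() drops the repeated index.
    "rating": df["rating"].repeat([len(years) for years in year_ranges]).to_numpy(),
})

print(result)
```

With the sample data this gives six rows, including 2012 with rating 4, which matches the result requested in the question.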

Upvotes: 2
