Reputation: 673
I have a dataset like the below:
Category Date 1 Date 2
a 2017-01-01 2017-08-01
What I want to be able to do is to transpose this data so I have a record for each month between the two dates e.g.
Category Date
a 2017-01-01
a 2017-02-01
a 2017-03-01
.....
a 2017-08-01
I need to do this in python and my data is already in a pandas dataframe. Dates are in datestamps format YYYY-MM-DD
Upvotes: 4
Views: 1135
Reputation: 294348
Use a comprehension with pd.date_range
pd.DataFrame(
[[c, d] for c, d1, d2 in df.itertuples(index=False)
for d in pd.date_range(d1, d2, freq='MS')],
columns=['Category', 'Date']
)
Category Date
0 a 2017-01-01
1 a 2017-02-01
2 a 2017-03-01
3 a 2017-04-01
4 a 2017-05-01
5 a 2017-06-01
6 a 2017-07-01
7 a 2017-08-01
If you have more than 3 columns and you want just Category
and Date
pd.DataFrame(
[[c, d] for c, d1, d2, *_ in df.itertuples(index=False)
for d in pd.date_range(d1, d2, freq='MS')],
columns=['Category', 'Date']
)
The *_
unpacks the rest of the tuple.
Otherwise, we could trap the entire tuple and just grab the bits we need.
pd.DataFrame(
[[t[0], d] for t in df.itertuples(index=False)
for d in pd.date_range(t[1], t[2], freq='MS')],
columns=['Category', 'Date']
)
Upvotes: 6
Reputation: 323306
I Am not sure about the datetime format it is m-d-y or d-m-y ...also you can replace the resample('D')
to resample('MS')
get the month start.
df.melt('Category').set_index('value').resample('D').first().ffill().drop('variable',1)
Out[31]:
Category
value
2017-01-01 a
2017-01-02 a
2017-01-03 a
2017-01-04 a
2017-01-05 a
2017-01-06 a
2017-01-07 a
2017-01-08 a
Change to MS
df.melt('Category').set_index('value').resample('MS').first().ffill().drop('variable',1)
Out[40]:
Category
value
2017-01-01 a
2017-02-01 a
2017-03-01 a
2017-04-01 a
2017-05-01 a
2017-06-01 a
2017-07-01 a
2017-08-01 a
Upvotes: 3