Reputation: 51
I have a dataframe like this:
holiday
YEAR MONTH DAY TIME
2012 10 2 00:00:00 0
06:00:00 0
12:00:00 0
18:00:00 0
2012 10 3 00:00:00 1
06:00:00 0
12:00:00 0
18:00:00 0
2012 10 4 00:00:00 0
06:00:00 0
12:00:00 0
18:00:00 0
Where the 0 mean that the day is not a holiday and 1 that it is a holiday. However, the 1 only comes in the 00:00:00 hour and I want to replace all 0's on that day to 1's.
holiday
YEAR MONTH DAY TIME
2012 10 2 00:00:00 0
06:00:00 0
12:00:00 0
18:00:00 0
2012 10 3 00:00:00 1
06:00:00 1
12:00:00 1
18:00:00 1
2012 10 4 00:00:00 0
06:00:00 0
12:00:00 0
18:00:00 0
Any idea on how this can be done?
Upvotes: 1
Views: 92
Reputation: 294198
reindex
df.max(level=[0, 1, 2]).reindex(df.index)
holiday
YEAR MONTH DAY TIME
2012 10 2 00:00:00 0
06:00:00 0
12:00:00 0
18:00:00 0
3 00:00:00 1
06:00:00 1
12:00:00 1
18:00:00 1
4 00:00:00 0
06:00:00 0
12:00:00 0
18:00:00 0
idx = pd.MultiIndex.from_product([
[2012], [10], [2, 3, 4],
pd.timedelta_range(0, periods=4, freq='6H')
], names=['YEAR', 'MONTH', 'DAY', 'TIME'])
df = pd.DataFrame({'holiday': [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]}, idx)
Upvotes: 1
Reputation: 150725
Groupby is a good idea:
# transform('any') or transform('first') also work
df['holiday'] = df.groupby(['YEAR','MONTH','DAY']).holiday.transform('max')
Upvotes: 2