eduardo_praun
eduardo_praun

Reputation: 51

Change value of a column in a multi index dataframe

I have a dataframe like this:

                        holiday  
YEAR MONTH DAY TIME
2012 10    2   00:00:00       0
               06:00:00       0    
               12:00:00       0
               18:00:00       0
2012 10    3   00:00:00       1
               06:00:00       0
               12:00:00       0
               18:00:00       0
2012 10    4   00:00:00       0
               06:00:00       0
               12:00:00       0
               18:00:00       0

Where the 0 mean that the day is not a holiday and 1 that it is a holiday. However, the 1 only comes in the 00:00:00 hour and I want to replace all 0's on that day to 1's.

                        holiday  
YEAR MONTH DAY TIME
2012 10    2   00:00:00       0
               06:00:00       0    
               12:00:00       0
               18:00:00       0
2012 10    3   00:00:00       1
               06:00:00       1
               12:00:00       1
               18:00:00       1
2012 10    4   00:00:00       0
               06:00:00       0
               12:00:00       0
               18:00:00       0

Any idea on how this can be done?

Upvotes: 1

Views: 92

Answers (3)

BENY
BENY

Reputation: 323226

Let us do

df['holiday']=df.groupby(level=[0,1,2]).cumsum().values

Upvotes: 2

piRSquared
piRSquared

Reputation: 294198

reindex

df.max(level=[0, 1, 2]).reindex(df.index)

                         holiday
YEAR MONTH DAY TIME             
2012 10    2   00:00:00        0
               06:00:00        0
               12:00:00        0
               18:00:00        0
           3   00:00:00        1
               06:00:00        1
               12:00:00        1
               18:00:00        1
           4   00:00:00        0
               06:00:00        0
               12:00:00        0
               18:00:00        0

Setup

idx = pd.MultiIndex.from_product([
    [2012], [10], [2, 3, 4],
    pd.timedelta_range(0, periods=4, freq='6H')
], names=['YEAR', 'MONTH', 'DAY', 'TIME'])

df = pd.DataFrame({'holiday': [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]}, idx)

Upvotes: 1

Quang Hoang
Quang Hoang

Reputation: 150725

Groupby is a good idea:

# transform('any') or transform('first') also work
df['holiday'] = df.groupby(['YEAR','MONTH','DAY']).holiday.transform('max')

Upvotes: 2

Related Questions