user5421875


Error while binning timestamps into custom periods

I tried to bin timestamps into custom periods, and I suspect I am misusing the method proposed in the answer to the question "Bin timestamp into custom periods". Here is my code:

def assigne_time():
    morning_start=datetime.time(7)
    morning_end = datetime.time(12)
    noon_start=datetime.time(12)
    noon_end=datetime.time(14)
    afternoon_start=datetime.time(14)
    afternoon_end=datetime.time(18)
    evening_start = datetime.time(18)
    evening_end = datetime.time(23)
    night_start=datetime.time(23)
    night_end=datetime.time(7)
    periods = {'morning':[morning_start, morning_end], 'noon':[noon_start, noon_end],'afternoon':[afternoon_start, afternoon_end],'evening':[evening_start, evening_end], 'night':[night_start, night_end]}
    for k, v in periods.items():
        df['periods'] = np.where(((v[0].hour <= df.Time.apply(lambda x: x.hour)) & (df.Time.apply(lambda x: x.hour) <= v[1].hour)), k, 'unknown_period')
    return df.to_csv('assigne_time.csv', sep='\t')

It assigns only the 'morning' period and labels every other row 'unknown_period':

05:51:53    2015-05-22  unknown_period
05:52:59    2015-05-22  unknown_period
06:08:24    2015-05-22  unknown_period
06:09:06    2015-05-22  unknown_period
08:25:31    2015-05-22  morning
08:25:35    2015-05-22  morning
08:26:37    2015-05-22  morning
08:27:11    2015-05-22  morning
08:33:17    2015-05-22  morning
08:33:45    2015-05-22  morning

Upvotes: 0

Views: 116

Answers (1)

Romain Endelin

Reputation: 148

That's expected: for the night period you are asking for an hour that is both greater than or equal to 23 and less than or equal to 7, which no hour can satisfy. You need modular arithmetic, with the formula (current - begin) % 24 <= (end - begin) % 24.
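To see why the modulo form handles a period that wraps past midnight, here is a minimal sketch in plain Python. The helper name `in_period` is made up for illustration; it only restates the formula above.

```python
def in_period(hour, begin, end):
    """Return True if `hour` lies between `begin` and `end` (inclusive),
    measuring distances forward around a 24-hour clock."""
    return (hour - begin) % 24 <= (end - begin) % 24

# Night runs from 23:00 to 07:00, so it wraps around midnight.
print(in_period(2, 23, 7))    # True:  (2 - 23) % 24 = 3,  (7 - 23) % 24 = 8
print(in_period(12, 23, 7))   # False: (12 - 23) % 24 = 13, which is > 8

# The direct comparison can never be true for a wrapping period.
print(23 <= 2 <= 7)           # False
```

Both sides are distances measured forward from `begin`, so the comparison works whether or not the period crosses midnight.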

Another problem you will face is that you reassign df['periods'] on every iteration over periods, so each iteration overwrites the changes made by the previous one. It is much more straightforward to create the column once and then assign periods with the loc method.

df['periods'] = 'unknown_period'
for k, v in periods.items():
    begin = v[0].hour
    end = v[1].hour
    df.loc[(df.Time.apply(lambda x: x.hour) - begin) % 24 <= (end - begin) % 24, 'periods'] = k
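For reference, here is a self-contained sketch of the same loop that can be run as-is. The sample times are made up, and the `Time` column is assumed to hold `datetime.time` objects as in the question. It also departs from the snippet above in one respect: it uses a strict `<`, so each period is half-open and a boundary hour such as 12 matches only one period. That is an assumption about the intended boundaries, not something the question specifies.

```python
import datetime
import pandas as pd

# Hypothetical sample data; the column name `Time` follows the question.
df = pd.DataFrame({"Time": [datetime.time(5, 51, 53),
                            datetime.time(8, 25, 31),
                            datetime.time(12, 30, 0),
                            datetime.time(15, 0, 0),
                            datetime.time(19, 45, 0),
                            datetime.time(23, 10, 0)]})

# (begin_hour, end_hour) for each period; night wraps around midnight.
periods = {"morning": (7, 12), "noon": (12, 14), "afternoon": (14, 18),
           "evening": (18, 23), "night": (23, 7)}

hours = df["Time"].apply(lambda t: t.hour)  # extract the hour once
df["periods"] = "unknown_period"            # default label
for name, (begin, end) in periods.items():
    # Strict `<` makes each period half-open: begin <= hour < end (mod 24).
    mask = (hours - begin) % 24 < (end - begin) % 24
    df.loc[mask, "periods"] = name

print(df)
```

With the inclusive `<=`, hour 12 would match both 'morning' and 'noon', and whichever period is processed last would win.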

Upvotes: 1
