philshem

Reputation: 25341

Assign a day-of-week to ordered but incomplete list of hours

I'm using Python 3 to generate a list of hours and frequencies over the course of a week:

t = '''06 0
    07 0
    08 0
    09 0
    10 14
    11 25
    12 37
    13 49
    14 56
    15 57
    16 55
    17 53
    18 53
    19 50
    20 40
    21 26
    22 12
    23 0
    06 0
    07 0
    08 3
    09 6
    10 10
    11 13
    12 15
    13 16
    14 15
    15 14
    16 16
    17 23
    18 35
    19 47
    20 50
    21 41
    22 31
    23 11
    06 0
    07 0
    08 9
    09 16
    10 19
    11 19
    12 17
    13 16
    14 15
    15 15
    16 17
    17 24
    18 33
    19 41
    20 42
    21 36
    22 25
    23 13
    06 0
    07 0
    08 1
    09 3
    10 7
    11 10
    12 13
    13 15
    14 17
    15 19
    16 23
    17 30
    18 41
    19 51
    20 54
    21 47
    22 33
    23 18
    06 0
    07 0
    08 3
    09 6
    10 10
    11 14
    12 17
    13 18
    14 18
    15 18
    16 20
    17 29
    18 45
    19 59
    20 64
    21 55
    22 37
    23 18
    06 0
    07 0
    08 5
    09 9
    10 12
    11 16
    12 19
    13 22
    14 24
    15 26
    16 28
    17 34
    18 46
    19 64
    20 81
    21 86
    22 75
    23 53
    00 29
    01 12
    06 0
    07 0
    08 0
    09 0
    10 10
    11 18
    12 28
    13 38
    14 48
    15 55
    16 60
    17 65
    18 75
    19 89
    20 100
    21 97
    22 78
    23 53
    00 30
    01 15'''

t = [x.split() for x in t.split('\n')]
print(t)

I know the data is stored in sequence by day of the week, starting on Sunday. This means the first item, ['06', '0'], corresponds to zero frequency at 6 am on Sunday. I can infer that the next time 06 appears, it belongs to Monday. I can't be sure that each day starts at a certain hour.

How can I transform this data to include the day of the week, for example:

Sunday 06 0
Sunday 07 0
Sunday 08 0
Sunday 09 0
Sunday 10 14
Sunday 11 25
...

Usually I post some code that I've tried, but this time I'm really stumped. I'll update the question if I make any progress.

I've tried looping over the array and assigning a day-of-week:

dow = 0
hours_list = []
for row in t:
    hour, freq = row
    if hour in hours_list:
        # a repeated hour means a new day has started
        dow += 1
        hours_list = []

    print(dow, hour, freq)
    hours_list.append(hour)

This works, although it seems too brute-force. It also misses that the trailing hours after midnight (00 and 01) should be assigned to the next day.

Upvotes: 0

Views: 62

Answers (3)

philshem

Reputation: 25341

I ended up using code based on this comment, which increments the day whenever the hour decreases:

t = '''06 0
    07 0
    08 0
    09 0
    10 14
    11 25
    12 37
    13 49
    14 56
    15 57
    16 55
    17 53
    18 53
    19 50
    20 40
    21 26
    22 12
    23 0
    06 0
    07 0
    08 3
    09 6
    10 10
    11 13
    12 15
    13 16
    14 15
    15 14
    16 16
    17 23
    18 35
    19 47
    20 50
    21 41
    22 31
    23 11
    06 0
    07 0
    08 9
    09 16
    10 19
    11 19
    12 17
    13 16
    14 15
    15 15
    16 17
    17 24
    18 33
    19 41
    20 42
    21 36
    22 25
    23 13
    06 0
    07 0
    08 1
    09 3
    10 7
    11 10
    12 13
    13 15
    14 17
    15 19
    16 23
    17 30
    18 41
    19 51
    20 54
    21 47
    22 33
    23 18
    06 0
    07 0
    08 3
    09 6
    10 10
    11 14
    12 17
    13 18
    14 18
    15 18
    16 20
    17 29
    18 45
    19 59
    20 64
    21 55
    22 37
    23 18
    06 0
    07 0
    08 5
    09 9
    10 12
    11 16
    12 19
    13 22
    14 24
    15 26
    16 28
    17 34
    18 46
    19 64
    20 81
    21 86
    22 75
    23 53
    00 29
    01 12
    06 0
    07 0
    08 0
    09 0
    10 10
    11 18
    12 28
    13 38
    14 48
    15 55
    16 60
    17 65
    18 75
    19 89
    20 100
    21 97
    22 78
    23 53
    00 30
    01 15'''

tt = [x.split() for x in t.split('\n')]

hour = 0
dow = 0  # 0 = Sunday

for row in tt:
    hour_prev = hour

    hour = int(row[0])
    freq = int(row[1])

    if hour < hour_prev:
        # increment the day if the hour decreases,
        # wrapping back to 0 after the seventh day
        dow = (dow + 1) % 7

    print([dow, hour, freq])
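
To get the day names asked for in the question, the counter can index into a list of names. A minimal sketch, assuming the week starts on Sunday as stated in the question; `label_days`, `DAY_NAMES` and the shortened `sample` list are illustrative names, not part of the code above:

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

def label_days(rows, first_day=0):
    """Yield (day_name, hour, freq), moving to the next day whenever the hour decreases."""
    dow = first_day
    hour_prev = None
    for hour, freq in rows:
        if hour_prev is not None and int(hour) < int(hour_prev):
            dow = (dow + 1) % 7  # wrap back to Sunday after Saturday
        yield DAY_NAMES[dow], hour, freq
        hour_prev = hour

# Shortened, made-up sample in the same [hour, freq] string format as the question
sample = [["22", "12"], ["23", "0"], ["06", "0"], ["07", "0"]]
labeled = list(label_days(sample))
for day, hour, freq in labeled:
    print(day, hour, freq)
```

Because the hours are compared as integers, this does not depend on the zero padding of the input strings.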

Upvotes: 0

Alain T.

Reputation: 42133

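You can use zip to compare each row's hour with the previous row's: a break between days occurs wherever the hour decreases. Accumulating these breaks then gives a day number matching each position (here t is the list of [hour, freq] string pairs from the question):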

from itertools import accumulate
dow    = ["Saturday","Sunday","Monday","Tuesday","Wednesday","Thursday","Friday"]
breaks = ( a[0]>b[0] for a,b in zip((("24",0),*t),t) )
days   = ( dow[d%7] for d in accumulate(breaks) )
result = ( (day,hour,freq) for day,(hour,freq) in zip(days,t) )
for day,hour,freq in result: print(day,hour,freq)

output:

Sunday 06 0
Sunday 07 0
Sunday 08 0
Sunday 09 0
Sunday 10 14
Sunday 11 25
Sunday 12 37
Sunday 13 49
Sunday 14 56
Sunday 15 57
Sunday 16 55
Sunday 17 53
Sunday 18 53
Sunday 19 50
Sunday 20 40
Sunday 21 26
Sunday 22 12
Sunday 23 0
Monday 06 0
Monday 07 0
Monday 08 3
Monday 09 6
Monday 10 10
Monday 11 13
...
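
To see what each step produces, here is the same pipeline on a shortened sample with the intermediate values materialised as lists. The `sample` rows are made up for illustration, and the breaks are converted with `int` only so the lists print cleanly:

```python
from itertools import accumulate

dow = ["Saturday", "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

# Illustrative rows only: two late-Sunday hours, a Monday that runs past midnight, then 06 again
sample = [["22", "12"], ["23", "0"], ["06", "0"], ["07", "0"],
          ["23", "5"], ["00", "3"], ["01", "1"], ["06", "0"]]

# Pair each row with its predecessor; the ("24", 0) sentinel forces a break at position 0
pairs = list(zip((("24", 0), *sample), sample))
breaks = [int(prev[0] > cur[0]) for prev, cur in pairs]  # 1 where the hour decreases
day_numbers = list(accumulate(breaks))                    # running count of breaks so far
day_names = [dow[d % 7] for d in day_numbers]

print(breaks)
print(day_numbers)
print(day_names)
```

The string comparison works only because the hours are zero-padded two-digit strings. Since the sentinel makes the running count start at 1, the list of names is ordered so that index 1 is Sunday.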

Upvotes: 1

Seb

Reputation: 4586

The first thought that came to me was to use np.unwrap. It is designed for angles, so we have to convert the 0..24 range to 0..2pi and back:

>>> import numpy as np
>>> hrs = np.array(t)[:, 0].astype(int)
>>> 24*np.unwrap(2*np.pi*hrs/24)/(2*np.pi)
array([  6.,   7.,   8.,   9.,  10.,  11.,  12.,  13.,  14.,  15.,  16.,
        17.,  18.,  19.,  20.,  21.,  22.,  23.,  30.,  31.,  32.,  33.,
        34.,  35.,  36.,  37.,  38.,  39.,  40.,  41.,  42.,  43.,  44.,
        45.,  46.,  47.,  54.,  55.,  56.,  57.,  58.,  59.,  60.,  61.,
        62.,  63.,  64.,  65.,  66.,  67.,  68.,  69.,  70.,  71.,  78.,
        79.,  80.,  81.,  82.,  83.,  84.,  85.,  86.,  87.,  88.,  89.,
        90.,  91.,  92.,  93.,  94.,  95., 102., 103., 104., 105., 106.,
       107., 108., 109., 110., 111., 112., 113., 114., 115., 116., 117.,
       118., 119., 126., 127., 128., 129., 130., 131., 132., 133., 134.,
       135., 136., 137., 138., 139., 140., 141., 142., 143., 144., 145.,
       150., 151., 152., 153., 154., 155., 156., 157., 158., 159., 160.,
       161., 162., 163., 164., 165., 166., 167., 168., 169.])

Those are the hours counted from the start of the week, without rolling over at midnight. Integer division by 24 then gives the weekday indices (divmod returns them together with the hour of day):

>>> weekdays, h = np.divmod((24*np.unwrap(2*np.pi*hrs/24)/(2*np.pi)).astype(int), 24)
>>> weekdays
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
       4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6,
       6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7])
>>> weekdays % 7 # to make the days roll around every week
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
       4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6,
       6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 0, 0])
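
The same idea can be sketched without NumPy: keep a running offset that grows by 24 each time the hour drops, which gives hours counted from the start of the week, then split with divmod. This is only an illustration of the idea, not how np.unwrap is implemented. np.unwrap corrects any jump larger than half a period, whereas this sketch assumes, as in the question's data, that hours within a day only increase. `unwrap_hours` and `sample_hours` are made-up names:

```python
def unwrap_hours(hours):
    """Return hours counted from the start of the first day, without rolling over at midnight."""
    absolute = []
    offset = 0
    prev = None
    for h in hours:
        if prev is not None and h < prev:
            offset += 24  # the clock wrapped around, so a new calendar day has begun
        absolute.append(h + offset)
        prev = h
    return absolute

# Made-up sample: late evening, past midnight, through the next day, then 06 again
sample_hours = [22, 23, 0, 1, 6, 7, 18, 23, 6]
absolute = unwrap_hours(sample_hours)

# Quotient by 24 is the day index; % 7 makes the days roll around every week
weekdays = [divmod(a, 24)[0] % 7 for a in absolute]
print(absolute)
print(weekdays)
```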

Edit:

A much simpler one, which counts the positions where the hour decreases:

>>> np.cumsum(np.diff(hrs, prepend=0) < 0) % 7
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
       4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6,
       6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 0, 0])

Upvotes: 2
