Reputation: 67
list1 = [['8/16/2016 9:55', 6], ['11/22/2015 13:43', 29], ['5/2/2016 10:14', 1],
['8/2/2016 14:20', 3], ['10/15/2015 16:38', 17], ['9/26/2015 23:23', 1],
['4/22/2016 12:24', 4], ['11/16/2015 9:22', 1], ['2/24/2016 17:57', 1],
['6/4/2016 17:17', 2]]
count_by_hour = {}  # this is created by extracting the hour from index 0 of each entry in list1
hours = []  # hour strings extracted from the timestamps
for each in list1:
    if each[0].split(':')[0][-2] == " ":  # split on ':'; if the second-to-last char is a space, the hour is a single digit
        hours.append(each[0].split(':')[0][-1:])  # hour is < 10: take the last char, which is the hour
    else:
        hours.append(each[0].split(':')[0][-2:])  # otherwise take the last 2 chars
print('Hour extracted:')
print(hours)
Output:
Counts by hour:
{'9': 2, '13': 1, '10': 1, '14': 1, '16': 1, '23': 1, '12': 1, '17': 2}
Now, how do I do the following:
comments_by_hour = {}
Expected Outcome:
{9:7, 13:29, 10:1, 14:3, 16:17, 23:1, 12:4, 17:3} # each value is the total of the second elements in list1 for that hour
As always, any help is appreciated.
Upvotes: 0
Views: 342
Reputation: 945
Notice that we need to accumulate the sum separately for each of many categories (hours). A simple solution in pure Python combines the accumulator pattern with a dictionary that stores the running sum for each hour.
First, let's use time.strptime to extract the hours in a list comprehension.
In [1]: list1 = [['8/16/2016 9:55', 6], ['11/22/2015 13:43', 29], ['5/2/2016 10:14', 1],
   ...:          ['8/2/2016 14:20', 3], ['10/15/2015 16:38', 17], ['9/26/2015 23:23', 1],
   ...:          ['4/22/2016 12:24', 4], ['11/16/2015 9:22', 1], ['2/24/2016 17:57', 1],
   ...:          ['6/4/2016 17:17', 2]]
In [2]: from time import strptime
In [3]: hour_list = [(strptime(time, "%m/%d/%Y %H:%M").tm_hour, val) for time, val in list1]
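As a quick check of what that comprehension produces, here is a minimal sketch that parses just the first three entries of list1 with the same format string. The names sample, stamp, and pairs are only illustrative and are not part of the answer's session.

```python
from time import strptime

# First three entries of list1 from the question.
sample = [['8/16/2016 9:55', 6], ['11/22/2015 13:43', 29], ['5/2/2016 10:14', 1]]

# strptime returns a struct_time; its tm_hour field is the hour as an int (0-23).
pairs = [(strptime(stamp, "%m/%d/%Y %H:%M").tm_hour, val) for stamp, val in sample]

print(pairs)  # [(9, 6), (13, 29), (10, 1)]
```

Because tm_hour is an integer, the hours can be used directly as dictionary keys without the string slicing from the question.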
The solution is to use a dictionary to accumulate statistics for each category. Do this by (a) starting with an empty dictionary and (b) updating the sums for each new value. This can be accomplished as follows.
In [4]: comments_by_hour = {}
In [5]: for hour, val in hour_list:
   ...:     comments_by_hour[hour] = val + comments_by_hour.get(hour, 0)
   ...:
In [6]: comments_by_hour
Out[6]: {9: 7, 13: 29, 10: 1, 14: 3, 16: 17, 23: 1, 12: 4, 17: 3}
Note that comments_by_hour.get(hour, 0) returns the current sum for that hour if the key already exists, and the default value of 0 otherwise.
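If you would rather not call .get() on every iteration, one possible variation (not part of the session above) is collections.defaultdict from the standard library, which supplies the 0 default for you. This is a sketch of the same accumulator idea; the loop variable stamp is just an illustrative name.

```python
from collections import defaultdict
from time import strptime

list1 = [['8/16/2016 9:55', 6], ['11/22/2015 13:43', 29], ['5/2/2016 10:14', 1],
         ['8/2/2016 14:20', 3], ['10/15/2015 16:38', 17], ['9/26/2015 23:23', 1],
         ['4/22/2016 12:24', 4], ['11/16/2015 9:22', 1], ['2/24/2016 17:57', 1],
         ['6/4/2016 17:17', 2]]

# defaultdict(int) creates a missing key with value 0 on first access,
# so += works even the first time an hour is seen.
comments_by_hour = defaultdict(int)
for stamp, val in list1:
    hour = strptime(stamp, "%m/%d/%Y %H:%M").tm_hour
    comments_by_hour[hour] += val

print(dict(comments_by_hour))
# {9: 7, 13: 29, 10: 1, 14: 3, 16: 17, 23: 1, 12: 4, 17: 3}
```

Both versions give the same totals; the plain dictionary with .get() has no extra import, while defaultdict keeps the loop body a little shorter.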
Upvotes: 1