Reputation: 192
Python 3.6.5/3.7.1 on Linux
Struggling to create a dictionary with dictionaries as values.
I want to create a dictionary from a list of date & time data (ultimately to create charts with bokeh).
This must have been asked before, but I can't find a set of search terms that returns a result that clarifies matters for me.
nb I'm essentially a hobby coder, & I don't easily think algorithmically like a real programmer.
The data is in a list (max 3200 items): Each item is a record of the occurrence of an event on a date in a clock period of one hour.
Thus; ['03/01/19 09:00', '03/01/19 09:00', '03/01/19 09:00',]
indicates 3 events between 0900-1000 on 03/01/2019.
Only clock periods with events are recorded, so if no event, no timestamp.
nb date format is dd/mm/yy
Example data:
dtl = [
'06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00', '05/01/19 21:00',
'05/01/19 17:00', '05/01/19 17:00', '05/01/19 14:00', '03/01/19 21:00',
'03/01/19 17:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00',
'03/01/19 12:00', '03/01/19 12:00', '03/01/19 11:00', '03/01/19 10:00',
'03/01/19 10:00', '03/01/19 09:00','03/01/19 09:00','03/01/19 09:00',
]
The desired dictionary would look like this:
dtd = {
'03/01/19': {
'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0,
'06': 0, '07': 0, '08': 0, '09': 3, '10': 2, '11': 1,
'12': 5, '13': 0, '14': 0, '15': 0, '16': 0, '17': 1,
'18': 0, '19': 0, '20': 0, '21': 1, '22': 0, '23': 0,
},
'04/01/19': {
'00': 0, ... '23': 0
},
'05/01/19': {
'00': 0, ...
} ... etc
}
Clearly I can initialise a dictionary with at least the keys:
{i.split()[0]:{} for i in dtl}
But then I can't get my head round what I need to do to update the subdicts with the counts, & so can't see a way to get from the original list to the desired dictionary. I'm going round in circles!
Upvotes: 1
Views: 147
Reputation: 114548
You could combine a Counter with a defaultdict to do this pretty effectively once you have split into a dictionary by date. So first split by date:
from collections import Counter, defaultdict
dtd = defaultdict(list)
for date, time in (item.split() for item in dtl):
    dtd[date].append(time[:2])
Now you can easily count the existing items, and use them to initialize a defaultdict that will return zeros for the missing times:
for key in dtd:
    dtd[key] = defaultdict(int, Counter(dtd[key]))
The result is:
defaultdict(list, {
'03/01/19': defaultdict(int, {
'09': 3,
'10': 2,
'11': 1,
'12': 5,
'17': 1,
'21': 1
}),
'05/01/19': defaultdict(int, {'14': 1, '17': 2, '21': 1}),
'06/01/19': defaultdict(int, {'11': 1, '12': 2})
})
Since the objects here are defaultdicts, you will be able to query dates and times that were not in the original dataset. You can avoid this by converting the result to a regular dict containing only the keys you want after you finish:
hours = ['%02d' % h for h in range(24)]
dtd = {date: {h: d[h] for h in hours} for date, d in dtd.items()}
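Putting the steps together, here is a minimal end-to-end sketch. It uses a shortened sample list rather than the full data, and the names sample and by_date are only illustrative. It also skips the intermediate defaultdict(int), since a Counter already returns 0 for hours it has not seen:

```python
from collections import Counter, defaultdict

# Shortened sample of the question's data (illustrative only)
sample = ['03/01/19 12:00', '03/01/19 12:00', '03/01/19 09:00', '05/01/19 17:00']

# Step 1: group the two-digit hour strings by date
by_date = defaultdict(list)
for date, time in (item.split() for item in sample):
    by_date[date].append(time[:2])

# Step 2: count the hours per date and expand to all 24 hour keys
hours = ['%02d' % h for h in range(24)]
dtd = {}
for date, times in by_date.items():
    counts = Counter(times)  # missing hours read as 0 without being stored
    dtd[date] = {h: counts[h] for h in hours}

print(dtd['03/01/19']['12'])  # 2
print(dtd['03/01/19']['00'])  # 0
```

The result is a plain dict of plain dicts, so looking up a date that never occurred raises a KeyError instead of silently creating an entry.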
Upvotes: 2
Reputation: 483
I'd suggest the use of collections.defaultdict since some of your counts can be 0.
Here's an option:
from collections import defaultdict
dtl = ['06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00',
'05/01/19 21:00', '05/01/19 17:00', '05/01/19 17:00',
'05/01/19 14:00', '03/01/19 21:00', '03/01/19 17:00',
'03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00',
'03/01/19 12:00', '03/01/19 12:00', '03/01/19 11:00',
'03/01/19 10:00', '03/01/19 10:00', '03/01/19 09:00',
'03/01/19 09:00','03/01/19 09:00',]
# Nested defaultdict
result = defaultdict(lambda: defaultdict(int))
for date_time in dtl:
    date, time = date_time.split()
    result[date][time.split(':')[0]] += 1
Output (using pprint):
defaultdict(<function <lambda> at 0x7f20d5c37c80>,
{'03/01/19': defaultdict(<class 'int'>,
{'09': 3,
'10': 2,
'11': 1,
'12': 5,
'17': 1,
'21': 1}),
'05/01/19': defaultdict(<class 'int'>,
{'14': 1,
'17': 2,
'21': 1}),
'06/01/19': defaultdict(<class 'int'>, {'12': 2, '11': 1})})
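One thing worth knowing about the nested defaultdict: the zero hours do not appear in the output, but they can still be read. As a small sketch (the single count set here is just for illustration), reading a missing hour returns 0 and also stores that key, whereas .get() reads without storing anything:

```python
from collections import defaultdict

result = defaultdict(lambda: defaultdict(int))
result['03/01/19']['09'] += 3

# Reading a missing hour returns 0 ...
print(result['03/01/19']['00'])         # 0
# ... but the lookup also stores the key as a side effect
print('00' in result['03/01/19'])       # True

# .get() reads without inserting anything
print(result['03/01/19'].get('01', 0))  # 0
print('01' in result['03/01/19'])       # False
```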
If you really want to show the 0 for printing then I don't really see a way around keeping an array of times as I've done here and initializing your dict that way.
times = ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10',
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21',
'22', '23']
dtl = ['06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00',
'05/01/19 21:00', '05/01/19 17:00', '05/01/19 17:00',
'05/01/19 14:00', '03/01/19 21:00', '03/01/19 17:00',
'03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00',
'03/01/19 12:00', '03/01/19 12:00', '03/01/19 11:00',
'03/01/19 10:00', '03/01/19 10:00', '03/01/19 09:00',
'03/01/19 09:00','03/01/19 09:00']
result = {date_time.split()[0] : {time : 0 for time in times} for date_time in dtl}
for date_time in dtl:
    date, time = date_time.split()
    result[date][time.split(':')[0]] += 1
Output below:
{'06/01/19': {'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0, '06': 0, '07': 0, '08': 0, '09': 0, '10': 0, '11': 1, '12': 2, '13': 0, '14': 0, '15': 0, '16': 0, '17': 0, '18': 0, '19': 0, '20': 0, '21': 0, '22': 0, '23': 0}, '05/01/19': {'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0, '06': 0, '07': 0, '08': 0, '09': 0, '10': 0, '11': 0, '12': 0, '13': 0, '14': 1, '15': 0, '16': 0, '17': 2, '18': 0, '19': 0, '20': 0, '21': 1, '22': 0, '23': 0}, '03/01/19': {'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0, '06': 0, '07': 0, '08': 0, '09': 3, '10': 2, '11': 1, '12': 5, '13': 0, '14': 0, '15': 0, '16': 0, '17': 1, '18': 0, '19': 0, '20': 0, '21': 1, '22': 0, '23': 0}}
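The desired dictionary in the question also lists 04/01/19, a date with no events at all, which the output above does not contain. A possible way to fill such gaps is sketched below. It assumes the dd/mm/yy format stated in the question, a non-empty input, and that every day between the earliest and latest date should appear. The function name fill_missing_dates and the small example dict are made up for illustration:

```python
from datetime import datetime, timedelta

FMT = '%d/%m/%y'  # dd/mm/yy, as stated in the question

def fill_missing_dates(result, hours):
    """Return a new dict covering every day from the earliest to the latest date."""
    days = sorted(datetime.strptime(d, FMT) for d in result)
    filled = {}
    day = days[0]
    while day <= days[-1]:
        key = day.strftime(FMT)
        # keep the existing counts, otherwise use an all-zero day
        filled[key] = result.get(key, {h: 0 for h in hours})
        day += timedelta(days=1)
    return filled

times = ['%02d' % h for h in range(24)]
example = {'03/01/19': {h: 0 for h in times}, '05/01/19': {h: 0 for h in times}}
example['03/01/19']['09'] = 3

filled = fill_missing_dates(example, times)
print(list(filled))  # ['03/01/19', '04/01/19', '05/01/19']
```

Parsing the keys into datetime objects before sorting matters here, because dd/mm/yy strings do not sort chronologically as plain text once the data spans more than one month.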
Upvotes: 2
Reputation: 172
One quick and dirty way is this:
#!/usr/bin/env python3
def convert(dt):
    ret = {}
    for elem in dt:
        d, t = elem.split()
        t = t.split(":")[0]
        # first event seen for this date: start every hour at zero
        if d not in ret:
            ret[d] = {'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0,
                      '06': 0, '07': 0, '08': 0, '09': 0, '10': 0, '11': 0,
                      '12': 0, '13': 0, '14': 0, '15': 0, '16': 0, '17': 0,
                      '18': 0, '19': 0, '20': 0, '21': 0, '22': 0, '23': 0}
        # count the event, including the first one for each date
        # (hours outside 00-23 are ignored)
        if t in ret[d]:
            ret[d][t] += 1
    return ret
dtl = ['06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00', '05/01/19 21:00', '05/01/19 17:00', '05/01/19 17:00', '05/01/19 14:00', '03/01/19 21:00', '03/01/19 17:00','03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 11:00', '03/01/19 10:00', '03/01/19 10:00', '03/01/19 09:00','03/01/19 09:00','03/01/19 09:00']
print(convert(dtl))
Upvotes: 0