Reputation: 17
I created a list like this:
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),...]
I want to convert it to a dictionary like this:
bookDict = { 24: {'Start': '2008-10-30', 'End': '2008-12-20','reason':'sold'},
25: {'Start': '2009-01-01', 'End': '2009-11-14','reason':'returned'},
26: {'Start': '2010-04-03', 'End': '2010-10-11','reason':'sold'},...}
Each key in the dictionary is the code that appears as the first element of the tuples in the Book list. For each key, I want the value to be a dictionary holding the 'Start' date, the 'End' date, and the reason for that specific code.
I have another question as well. Some of the codes have more than one 'End' point with different dates, and I want to keep only the 'End' point with the latest date. Something like this:
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'),
(24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]
For the above example, the dictionary should keep this:
bookDict = {24: {'Start': '2008-10-30', 'End': '2009-11-25', 'reason': 'sold'}}
Can anyone help me please?
Upvotes: 0
Views: 323
Reputation: 22324
Here is a solution that satisfies both criteria.
Every time it encounters a new book id, it creates a dict for it and fills it in as it encounters data in your list.
As for multiple End entries, your date format (YYYY-MM-DD) allows plain string comparison to find the latest date.
books = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),
(26, '2011-10-11', 'End', 'returned')] # The latest 'End' entry should be picked
bookDict = {}
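for info in books:
    id_ = info[0]
    type_ = info[2]
    # Create the per-book dict the first time this id is seen
    book = bookDict.setdefault(id_, {})
    if type_ == 'Start':
        book[type_] = info[1]
    # '' sorts before any date string, so the first 'End' is always stored
    elif type_ == 'End' and info[1] > book.get(type_, ''):
        book[type_] = info[1]
        book['reason'] = info[3]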
Output:
bookDict
# {24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'},
# 25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'},
# 26: {'Start': '2010-04-03', 'End': '2011-10-11', 'reason': 'returned'}}
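The string comparison works because zero-padded YYYY-MM-DD dates sort the same way lexicographically as they do chronologically. A quick sanity check, as a minimal sketch using End dates from the question (the variable names are only illustrative):

```python
from datetime import date

# 'End' dates taken from the question, deliberately out of order
end_dates = ['2008-12-20', '2009-11-25', '2009-02-04']

# Latest date found by comparing the raw strings
latest_as_string = max(end_dates)

# Latest date found by parsing each string into a real date first
latest_as_date = max(end_dates, key=date.fromisoformat)

print(latest_as_string == latest_as_date)
```

This would not hold for formats such as DD/MM/YYYY or dates without zero padding; in that case parsing the dates before comparing is the safer choice.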
Upvotes: 0
Reputation: 164773
This answers the first part of OP's question only, although it can be adapted for the second.
You can use collections.defaultdict for an O(n) solution:
book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]
from collections import defaultdict
d = defaultdict(dict)
for key, date, *data in book:
    d[key][data[0]] = date
    if len(data) == 2:
        d[key]['reason'] = data[1]
Alternatively, you can catch IndexError instead of testing for tuple length:
for key, date, *data in book:
    d[key][data[0]] = date
    try:
        d[key]['reason'] = data[1]
    except IndexError:
        continue
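One possible way to adapt this for the second part of the question is to skip any 'End' entry that is not later than the one already stored. This is only a sketch: the names `marker`, `reason` and `entry` are my own, and the sample data is the question's second list with the 'End' entries reordered to show that input order does not matter.

```python
from collections import defaultdict

# Question's second example, with the 'End' entries shuffled
book = [(24, '2008-10-30', 'Start'), (24, '2009-11-25', 'End', 'sold'),
        (24, '2008-12-20', 'End', 'sold'), (24, '2009-02-04', 'End', 'sold')]

d = defaultdict(dict)
for key, date, marker, *reason in book:
    entry = d[key]
    # Ignore an 'End' that is not later than the one already recorded;
    # '' sorts before any YYYY-MM-DD string, so the first 'End' is kept
    if marker == 'End' and date <= entry.get('End', ''):
        continue
    entry[marker] = date
    if reason:
        entry['reason'] = reason[0]

print(dict(d))
# {24: {'Start': '2008-10-30', 'End': '2009-11-25', 'reason': 'sold'}}
```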
Upvotes: 0
Reputation: 1052
You could do something like this:
d = {}
for t in Book:
    index, date, marker, *rest = t
    entry = d.setdefault(index, {})
    end_date = entry.get("End", "1900-01-01")
    # Always record 'Start'; record 'End' only if it is later than the stored one
    if marker == "Start" or date > end_date:
        entry[marker] = date
        if rest:
            entry["reason"] = rest[0]
Upvotes: 0
Reputation: 71461
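You can use itertools.groupby, min, and max. Note that groupby only groups consecutive items, so this assumes Book is already ordered by code, and it treats the earliest date of each group as 'Start' and the latest as 'End':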
import itertools
def quantity_key(d):
    return list(map(int, d[1].split('-')))
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'), (25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'), (26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}
Output:
{24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'}, 25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'}, 26: {'Start': '2010-04-03', 'End': '2010-10-11', 'reason': 'sold'}}
With more than two values for each key:
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'), (24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}
Output:
{24: {'Start': '2008-10-30', 'End': '2009-11-25', 'reason': 'sold'}}
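If the list is not already ordered by code, groupby will produce several separate groups for the same code, and the dict comprehension will keep only the last one. A minimal sketch of sorting first, assuming a small interleaved sample built from the question's data (the names `ordered` and `grouped` are only illustrative):

```python
import itertools
from operator import itemgetter

# Entries for codes 24 and 25 interleaved, unlike the question's list
Book = [(25, '2009-01-01', 'Start'), (24, '2008-10-30', 'Start'),
        (25, '2009-11-14', 'End', 'returned'), (24, '2008-12-20', 'End', 'sold')]

# Sort by code so that each code forms one consecutive run for groupby
ordered = sorted(Book, key=itemgetter(0))
grouped = {k: list(g) for k, g in itertools.groupby(ordered, key=itemgetter(0))}

print(grouped)
```

After this step, the min/max comprehension above can be applied to `grouped` unchanged.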
Upvotes: 1