MahsaAzizi

Reputation: 17

Python: converting a list of tuples to dictionary with some conditions

I created a list like this:

Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'), 
 (25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
 (26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),...]

I want to convert it to a dictionary like this:

bookDict = { 24: {'Start': '2008-10-30', 'End': '2008-12-20','reason':'sold'},
  25: {'Start': '2009-01-01', 'End': '2009-11-14','reason':'returned'},
  26: {'Start': '2010-04-03', 'End': '2010-10-11','reason':'sold'},...}

Each key in the dictionary is the first value of the tuples in the Book list (it is a code). For each key, I want the value to combine the two tuples for that code: the one for the 'Start' point and the one for the 'End' point.

I have another question as well. For some of the codes, there is more than one 'End' point, each with a different date. I want to keep only the 'End' point with the latest date, something like this:

Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'), 
 (24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]

For the above example, the dictionary should keep this:

bookDict = { 24: {'Start': '2008-10-30', 'End': '2009-11-25','reason':'sold'}}

Can anyone help me please?

Upvotes: 0

Views: 323

Answers (4)

Olivier Melançon

Reputation: 22324

Here is a solution that satisfies both criteria.

Every time it encounters a new book id, it creates a dict for it and fills it in as it encounters data in your list.

As for multiple End entries, your date format (zero-padded YYYY-MM-DD) lets you use plain string comparison to find the latest date.

books = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
 (25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
 (26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),
 (26, '2011-10-11', 'End', 'returned')] # The latest 'End' entry should be picked

bookDict = {}

for info in books:
    id_ = info[0]
    type_ = info[2]

    book = bookDict.setdefault(id_, {})

    if type_ == 'Start':
        book[type_] = info[1]

    elif type_ == 'End' and info[1] > book.get(type_, ''):
        book[type_] = info[1]
        book['reason'] = info[3]

Output:

bookDict
# {24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'},
#  25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'},
#  26: {'Start': '2010-04-03', 'End': '2011-10-11', 'reason': 'returned'}}
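
The string comparison works because zero-padded YYYY-MM-DD dates sort lexicographically in chronological order. Here is a small sketch to check that on the dates from the question, assuming Python 3.7+ for date.fromisoformat (the variable names are only illustrative):

```python
from datetime import date

dates = ['2008-12-20', '2009-02-04', '2009-11-25']

# Latest date by plain string comparison.
latest_by_string = max(dates)

# Latest date by parsing each string into a date object first.
latest_by_date = max(dates, key=date.fromisoformat)

print(latest_by_string == latest_by_date)
```

If the dates were not zero-padded (for example '2009-2-4'), the two results could differ, and parsing would be the safer choice.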

Upvotes: 0

jpp

Reputation: 164773

This answers the first part of OP's question only, although it can be adapted for the second.

You can use collections.defaultdict for an O(n) solution:

book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'), 
        (25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
        (26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]

from collections import defaultdict

d = defaultdict(dict)

for key, date, *data in book:
    d[key][data[0]] = date
    if len(data) == 2:
        d[key]['reason'] = data[1]

Alternatively, you can catch IndexError instead of testing for tuple length:

for key, date, *data in book:
    d[key][data[0]] = date
    try:
        d[key]['reason'] = data[1]
    except IndexError:
        continue
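
One possible way to adapt this for the second part of the question is to skip any 'End' entry that is not later than the one already stored. This is only a sketch: the sample data is made up for illustration (the latest 'End' is deliberately not the last one in the list), and it relies on the dates being zero-padded YYYY-MM-DD strings.

```python
from collections import defaultdict

# Illustrative data: the latest 'End' entry comes before an earlier one.
book = [(24, '2008-10-30', 'Start'),
        (24, '2009-11-25', 'End', 'returned'),
        (24, '2008-12-20', 'End', 'sold')]

d = defaultdict(dict)

for key, date, marker, *reason in book:
    # Skip an 'End' entry that is not later than the one already stored.
    # The empty string sorts before any date, so the first 'End' is always kept.
    if marker == 'End' and date <= d[key].get('End', ''):
        continue
    d[key][marker] = date
    if reason:
        d[key]['reason'] = reason[0]

print(dict(d))
```

Because the reason is only written together with the 'End' date that wins, the two always stay in sync.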

Upvotes: 0

Hatatister

Reputation: 1052

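You could do something like this (Book is the list from the question):

d = {}  # maps each code to its {'Start', 'End', 'reason'} entry

for t in Book:
    index, date, marker, *rest = t
    entry = d.setdefault(index, {})
    # Dates are ISO strings, so "1900-01-01" sorts before any real End date
    end_date = entry.get("End", "1900-01-01")
    if marker == "Start" or date > end_date:
        entry[marker] = date
        if rest:
            entry["reason"] = rest[0]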

Upvotes: 0

Ajax1234

Reputation: 71461

You can use itertools.groupby, min, and max. Note that groupby only groups consecutive items, so Book must already be ordered by code:

import itertools
def quantity_key(d):
  return list(map(int, d[1].split('-')))

Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'), (25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'), (26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}

Output:

{24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'}, 25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'}, 26: {'Start': '2010-04-03', 'End': '2010-10-11', 'reason': 'sold'}}

With more than two values for each key:

Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'), (24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}

Output:

{24: {'Start': '2008-10-30', 'End': '2009-11-25', 'reason': 'sold'}}
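
A variant sketch of the same idea, for input that is not ordered by code: it sorts before grouping, and it picks 'Start' and 'End' by their marker rather than by the earliest and latest date overall. The sample data is made up for illustration, and the code assumes every 'End' tuple carries a reason as its fourth element.

```python
from itertools import groupby
from operator import itemgetter

# Illustrative data, deliberately not ordered by code.
book = [(25, '2009-01-01', 'Start'), (24, '2008-10-30', 'Start'),
        (24, '2009-11-25', 'End', 'sold'), (25, '2009-11-14', 'End', 'returned'),
        (24, '2008-12-20', 'End', 'sold')]

result = {}
# Sort by code first so that groupby sees each code as one consecutive run.
for code, group in groupby(sorted(book, key=itemgetter(0)), key=itemgetter(0)):
    rows = list(group)
    starts = [r for r in rows if r[2] == 'Start']
    ends = [r for r in rows if r[2] == 'End']
    entry = {}
    if starts:
        entry['Start'] = min(starts, key=itemgetter(1))[1]
    if ends:
        # The latest 'End' row supplies both the date and the reason.
        last = max(ends, key=itemgetter(1))
        entry['End'] = last[1]
        entry['reason'] = last[3]
    result[code] = entry

print(result)
```

Sorting makes this O(n log n) instead of O(n), which is the price of using groupby on unordered input.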

Upvotes: 1
