tumultous_rooster

Reputation: 12550

Reformatting a dict where the values have a dict-like relationship

I have a defaultdict that looks like this:

d = { 'ID_001': ['A', 'A_part1', 'A_part2'], 
      'ID_002': ['A', 'A_part3'],
      'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
      'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']
    }

(Before I go any further, I have to say that A_part1 isn't the actual string; the real strings are a bunch of alphanumeric characters. I wrote them this way to show that A_part1 is text associated with A, if you see what I mean.)

Standing back and looking at it, what I really have is a dict where the values have their own key/value relationship, but that relationship exists only in the order in which they appear in the list.

I am attempting to end up with something like this:

['ID_001 A A_part1, A_part2',
 'ID_002 A A_part3',
 'ID_003 B B_part1 B_part2',
 'ID_003 A A_part4',
 'ID_004 C C_part1',
 'ID_004 A A_part5',
 'ID_004 B B_part3']

I have made a variety of attempts; I keep wanting to run through the dict's value, making note of the character in the first position (e.g., the A), and collect values until I find a B or a C, then stop collecting. Then append what I have to a list that I have declared elsewhere. Ad nauseam.

I'm running into all sorts of problems, not the least of which is bloated code. I'm missing the ability to iterate through the value in a clean way. Invariably, I seem to run into index errors.

If anyone has any ideas/philosophy/comments I'd be grateful.

Upvotes: 0

Views: 39

Answers (3)

DSM

Reputation: 353009

Whenever you're trying to do something involving contiguous groups, you should think of itertools.groupby. You weren't very specific about what condition separates the groups, but if we take "the character in the first position" at face value:

from itertools import groupby

new_list = []
for key, sublist in sorted(d.items()):
    for _, group in groupby(sublist, key=lambda x: x[0]):
        new_list.append(' '.join([key] + list(group)))

produces

>>> for elem in new_list:
...     print(elem)
...     
ID_001 A A_part1 A_part2
ID_002 A A_part3
ID_003 B B_part1 B_part2
ID_003 A A_part4
ID_004 C C_part1
ID_004 A A_part5
ID_004 B B_part3
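
If the first character turns out not to be a reliable separator (the question says the real strings are arbitrary alphanumeric text), one alternative is to split whenever an item is a known group header. This is only a sketch: the HEADERS set and the split_on_headers helper are hypothetical names, and it assumes you can list the header strings up front.

```python
d = {'ID_001': ['A', 'A_part1', 'A_part2'],
     'ID_002': ['A', 'A_part3'],
     'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
     'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']}

# Assumption: the strings that start a new group are known in advance.
HEADERS = {'A', 'B', 'C'}


def split_on_headers(items, headers):
    """Yield consecutive chunks of items, starting a new chunk at each header."""
    chunk = []
    for item in items:
        if item in headers and chunk:
            yield chunk
            chunk = []
        chunk.append(item)
    if chunk:
        yield chunk


new_list = [' '.join([key] + chunk)
            for key, sublist in sorted(d.items())
            for chunk in split_on_headers(sublist, HEADERS)]
```

With the sample data this should give the same seven strings as the groupby version, without depending on what the strings start with.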

Upvotes: 0

Ch.Idea

Reputation: 588

The result may not be in the order you want, but this should spare you further headaches.

d = { 'ID_001': ['A', 'A_part1', 'A_part2'], 
      'ID_002': ['A', 'A_part3'],
      'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
      'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']
    }
rst = []
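for o in d:
    # Bucket this ID's items by their first character.
    t_d = {}
    for t_o in d[o]:
        if t_o[0] not in t_d:
            t_d[t_o[0]] = [t_o]
        else:
            t_d[t_o[0]].append(t_o)
    # The first item in each bucket is the header; the rest are comma-joined.
    for t_o in t_d:
        rst.append(' '.join([o, t_d[t_o][0], ', '.join(t_d[t_o][1:])]))
print(rst)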

https://ideone.com/FeBDLA

['ID_004 C C_part1', 'ID_004 A A_part5', 'ID_004 B B_part3', 'ID_003 A A_part4', 'ID_003 B B_part1, B_part2', 'ID_002 A A_part3', 'ID_001 A A_part1, A_part2']

Upvotes: 0

jedwards

Reputation: 30200

What about something like:

d = { 'ID_001': ['A', 'A_part1', 'A_part2'],
      'ID_002': ['A', 'A_part3'],
      'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
      'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']
    }

def is_key(s):
    return s in ['A','B','C']

out = {}
for k, v in d.items():  # use d.iteritems() on Python 2
    key = None
    for e in v:
        if is_key(e):
            key = e  # start collecting under a new key
        else:
            out_key = (k, key)
            out[out_key] = out.get(out_key, []) + [e]

which generates:

{('ID_001', 'A'): ['A_part1', 'A_part2'],
 ('ID_002', 'A'): ['A_part3'],
 ('ID_003', 'A'): ['A_part4'],
 ('ID_003', 'B'): ['B_part1', 'B_part2'],
 ('ID_004', 'A'): ['A_part5'],
 ('ID_004', 'B'): ['B_part3'],
 ('ID_004', 'C'): ['C_part1']}

It's important that you update the is_key function to match your actual input.

Also, the variable names are far from optimal, but I'm not really sure what you're doing -- you should be able to (and should) give them more appropriate names.
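
If you want the list of strings from the question rather than a dict, you could flatten the result afterwards. A rough sketch, assuming Python 3.7+ (where dicts keep insertion order) and the same hard-coded set of keys as above; the names KEYS and lines are just placeholders:

```python
d = {'ID_001': ['A', 'A_part1', 'A_part2'],
     'ID_002': ['A', 'A_part3'],
     'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
     'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']}

# Assumption: same hard-coded keys as is_key; adjust to your real input.
KEYS = {'A', 'B', 'C'}

out = {}
for k, v in d.items():
    key = None
    for e in v:
        if e in KEYS:
            key = e  # subsequent items belong to this key
        else:
            out.setdefault((k, key), []).append(e)

# Flatten each ((id, key), parts) pair into one space-separated string.
lines = [' '.join([id_, key] + parts) for (id_, key), parts in out.items()]
```

Note that a key with no parts after it never makes it into out, and a key that appears twice under the same ID gets merged into one entry, so check whether that matters for your data.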

Upvotes: 1
