MartiniMoe

Reputation: 13

Split list into sublists based on string split

I have a list like this:

a = [['cat1.subcat1.item1', 0], ['cat1.subcat1.item2', 'hello'], ['cat1.subcat2.item1', 1337], ['cat2.item1', 'test']]

So there may be several subcategories with items, separated by dots. But the number of categories and the depth of nesting aren't fixed and can differ between categories.

I want the list to look like this:

a = [['cat1', [
        ['subcat1', [
            ['item1', 0],
            ['item2', 'hello']
        ]],
        ['subcat2', [
            ['item1', 1337]
        ]],
    ]],
    ['cat2', [
        ['item1', 'test']
    ]]
]

I hope this makes sense.

In the end I need a JSON string out of this. If it is somehow easier, the list could also be converted directly to the JSON string.

Any idea how to achieve this? Thanks!

Upvotes: 1

Views: 172

Answers (2)

jpp

Reputation: 164783

You should use a nested dictionary structure. This can be processed efficiently using collections.defaultdict and functools.reduce.

Conversion to a regular dictionary is possible, though usually not necessary.

Solution

from collections import defaultdict
from functools import reduce
from operator import getitem

def getFromDict(dataDict, mapList):
    """Iterate nested dictionary"""
    return reduce(getitem, mapList, dataDict)

tree = lambda: defaultdict(tree)
d = tree()

for i, j in a:
    path = i.split('.')
    getFromDict(d, path[:-1])[path[-1]] = j

Result

def default_to_regular_dict(d):
    """Convert nested defaultdict to regular dict of dicts."""
    if isinstance(d, defaultdict):
        d = {k: default_to_regular_dict(v) for k, v in d.items()}
    return d

res = default_to_regular_dict(d)

{'cat1': {'subcat1': {'item1': 0,
                      'item2': 'hello'},
          'subcat2': {'item1': 1337}},
 'cat2': {'item1': 'test'}}
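
Since the question ultimately asks for a JSON string, here is a minimal sketch of that last step. It assumes the standard-library json module is acceptable and rebuilds a small tree by hand so it runs on its own; the name json_string is just for illustration. Because defaultdict is a dict subclass, json.dumps accepts it directly, which is one reason the conversion to a regular dictionary is usually not needed.

```python
import json
from collections import defaultdict

# Same recursive defaultdict factory as in the solution above.
tree = lambda: defaultdict(tree)
d = tree()

# Small hand-built sample (an assumption, to keep the sketch self-contained).
d['cat1']['subcat1']['item1'] = 0
d['cat2']['item1'] = 'test'

# json.dumps treats the nested defaultdicts like ordinary dicts.
json_string = json.dumps(d)
print(json_string)
# {"cat1": {"subcat1": {"item1": 0}}, "cat2": {"item1": "test"}}
```

Passing indent=2 to json.dumps gives a pretty-printed string if readability matters.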

Explanation

  • getFromDict(d, path[:-1]) takes the list path[:-1] and successively indexes into dictionary d with each list item, so ['cat1', 'subcat1'] yields d['cat1']['subcat1']. I've implemented this bit functionally via functools.reduce and operator.getitem.
  • The result is always a dictionary, since d is a defaultdict that creates a nested defaultdict for every missing key. We then assign value j to the key path[-1], the last element of the list, in that dictionary.
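
The two steps above can be traced for a single entry. This is only an illustrative sketch: the name parent is hypothetical, and the path is taken from the first item of the question's list.

```python
from collections import defaultdict
from functools import reduce
from operator import getitem

tree = lambda: defaultdict(tree)
d = tree()

path = 'cat1.subcat1.item1'.split('.')  # ['cat1', 'subcat1', 'item1']

# reduce(getitem, ['cat1', 'subcat1'], d) evaluates d['cat1']['subcat1'];
# the missing keys are created on the fly by the defaultdict factory.
parent = reduce(getitem, path[:-1], d)

# Assign the value under the last path component.
parent[path[-1]] = 0

print(parent is d['cat1']['subcat1'])  # True
print(d['cat1']['subcat1']['item1'])   # 0
```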

Upvotes: 4

Bram Vanroy

Reputation: 28505

Not as pretty as @jpp's solution, but hey, at least I tried. This uses the merge function for deep-merging dicts, as seen in this answer.

def merge(a, b, path=None):
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a


a = [['cat1.subcat1.item1', 0], ['cat1.subcat1.item2', 'hello'], ['cat1.subcat2.item1', 1337], ['cat2.item1', 'test']]

# convert to dict
b = {x[0]:x[1] for x in a}
res = {}

# iterate over dict
for k, v in list(b.items()):
  s = k.split('.')
  temp = {}
  # iterate over reverse indices,
  # build temp dict from the ground up
  for i in reversed(range(len(s))):
    if i == len(s)-1:
      temp = {s[i]: v}
    else:
      temp = {s[i]: temp}

    # merge temp dict into the main dict res
    if i == 0:
      res  = merge(res, temp)
      temp = {}

print(res)
# {'cat1': {'subcat1': {'item1': 0, 'item2': 'hello'}, 'subcat2': {'item1': 1337}}, 'cat2': {'item1': 'test'}}
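
If the nested-list layout from the question is really needed rather than a dict, the result can be converted afterwards. A rough sketch, assuming every leaf value is a non-dict; dict_to_pairs is a hypothetical helper name, not part of either answer.

```python
def dict_to_pairs(node):
    """Turn a nested dict into nested [key, value] lists; leaves are returned unchanged."""
    if isinstance(node, dict):
        return [[key, dict_to_pairs(value)] for key, value in node.items()]
    return node

# The nested dict printed above, repeated so the sketch runs on its own.
res = {'cat1': {'subcat1': {'item1': 0, 'item2': 'hello'},
                'subcat2': {'item1': 1337}},
       'cat2': {'item1': 'test'}}

pairs = dict_to_pairs(res)
print(pairs)
# [['cat1', [['subcat1', [['item1', 0], ['item2', 'hello']]], ['subcat2', [['item1', 1337]]]]], ['cat2', [['item1', 'test']]]]
```

Note that a leaf value that is itself a dict would also be expanded, so this only suits data shaped like the example.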

Upvotes: 1
