Salvador Dali
Salvador Dali

Reputation: 222751

Create a hierarchy from a dictionary of lists

I have a dictionary of lists:

a = {
        'a': [1, 2, 3],
        'b': [1, 2, 4],
        'c': [1, 2],
        'd': [1, 2, 3, 4, 5],
        'e': [3],
        'f': [3, 7],
        'g': [3, 3],
        'h': [3, 3, 3, 3, 3],
        'i': [3, 3, 3, 3, 4],
    }

And I would like to create hierarchical structure from this dictionary which will group items in the similar manner (exact structure does not matter, as well as the relation between elements is preserved):

              /  \
             /    \
            e      c
           /\      /\
          f  g    a  b
             /\   |
            h  i  d

The hierarchy goes as follows: array g is a prefix of array h and i and therefore it is their ancestor. But e is a prefix of g, so it e is an ancestor of g.

Here is my idea how to achieve this result.

.

('e', [3])
('c', [1, 2])
('g', [3, 3])
('f', [3, 7])
('a', [1, 2, 3])
('b', [1, 2, 4])
('d', [1, 2, 3, 4, 5])
('h', [3, 3, 3, 3, 3])

So does anyone knows how to approach this problem. Or am I over-complicating things and there is an easier way to achieve the solution.

P.S. this is not a homework assignment or interview question (also it might be). This is just my abstraction from a problem I am trying to solve.

Upvotes: 5

Views: 3735

Answers (4)

Hynek -Pichi- Vychodil
Hynek -Pichi- Vychodil

Reputation: 26121

Just sort arrays in lexicographical order:

(c,[1,2]),
(a,[1,2,3]),
(d,[1,2,3,4,5]),
(b,[1,2,4]),
(e,[3]),
(g,[3,3]),
(h,[3,3,3,3,3]),
(i,[3,3,3,3,4]),
(f,[3,7])

Then solution is pretty obvious.

root
Lc
|La
||Ld
|Lb
Le
 Lg
 |Lh
 |Li
 Lf

You need only track path form parent by prefix. From previous line. You will form somethink like stack. root has empty set so push it on stack. c has (empty) prefix as root so root is parent of c. Push c on stack. a has prefix which is c on top of stack so c is parent of a. push a on stack. d has prefix same as a on top of stack so a is parent of d and push on stack. b doesn't have prefix d on top of stack so pop. Same for a then pop. Now there is c which is prefix so b has parent c. Push b on stack. And continue in same way.

In Erlang simply:

-module(tree_from_prefix).

-export([tree/1]).

is_prefix(_, []) -> true;
is_prefix([H|A], [H|B]) -> is_prefix(A, B);
is_prefix(_, _) -> false.

tree(L) ->
  tree(lists:keysort(2, L), [{root, []}]).

tree([], _) -> [];
tree([{X, L} = Record|T] = List, [{Parent, Prefix}|R] = Stack) ->
  case is_prefix(L, Prefix) of
    true -> [{Parent, X}|tree(T, [Record|Stack])];
    false -> tree(List, R)
  end.

And result

1> tree_from_prefix:tree([{e,[3]},{c,[1, 2]},{g,[3, 3]},{f,[3, 7]},{a,[1, 2, 3]},{b, [1, 2, 4]},{d,[1, 2, 3, 4, 5]},{h,[3, 3, 3, 3, 3]},{i,[3, 3, 3, 3, 4]}]).
[{root,c},
 {c,a},
 {a,d},
 {c,b},
 {root,e},
 {e,g},
 {g,h},
 {g,i},
 {e,f}]

In python it will not be so elegant but same algorithm will work too.

Upvotes: 0

PasteBT
PasteBT

Reputation: 2198

Other people already give the methord, I just write some code here:

First sort:

t = sorted(a.items(), key=lambda x: x[1])

The build the structure

ret = {}

def build(ret, upv):
    if not t:
        return (None, None)
    k, v = t.pop(0)
    while k and v:
        if upv and v[:len(upv)] != upv:
            return (k, v)
        r = {}
        ret[k] = r
        k, v = build(r, v)
    return None, None

build(ret, None)
print ret

Upvotes: 1

Blckknght
Blckknght

Reputation: 104762

How about building the tree with a set of nested dictionaries, so that you'd access the e node by tree[3] and the h node by tree[3][3][3][3][3]:

from collections import nested

def nested():
    return defaultdict(nested)

def build_tree(data):
    tree = nested()
    for name, path in data.items():
        d = tree
        for p in path:
            d = d[p]
        d["value"] = name
    return tree

Example output:

>>> a = {
    'a': [1, 2, 3],
    'b': [1, 2, 4],
    'c': [1, 2],
    'd': [1, 2, 3, 4, 5],
    'e': [3],
    'f': [3, 7],
    'g': [3, 3],
    'h': [3, 3, 3, 3, 3],
    'i': [3, 3, 3, 3, 4],
}

>>> import json # for pretty printing, note that in python the keys are ints, not str
>>> print(json.dumps(build_tree(a), indent=4))
{
    "1": {
        "2": {
            "3": {
                "4": {
                    "5": {
                        "value": "d"
                    }
                }, 
                "value": "a"
            }, 
            "4": {
                "value": "b"
            }, 
            "value": "c"
        }
    }, 
    "3": {
        "7": {
            "value": "f"
        }, 
        "3": {
            "3": {
                "3": {
                    "3": {
                        "value": "h"
                    }, 
                    "4": {
                        "value": "i"
                    }
                }
            }, 
            "value": "g"
        }, 
        "value": "e"
    }
}

Upvotes: 0

Hammer
Hammer

Reputation: 10329

given an object that has a list of children, and an is_prefix function, and your sorted list of objects, I don't see why this wouldn't work

for indx, potential_prefix in enumerate(your_list):
    for potential_child in your_list[indx:]:
        if is_prefix(potential_prefix, potential_child):
            potential_prefix.add_child(potential_child)
            # and optionally
            potential_child.add_parent(potential_prefix)

Upvotes: 0

Related Questions