Reputation: 222751
I have a dictionary of lists:
a = {
'a': [1, 2, 3],
'b': [1, 2, 4],
'c': [1, 2],
'd': [1, 2, 3, 4, 5],
'e': [3],
'f': [3, 7],
'g': [3, 3],
'h': [3, 3, 3, 3, 3],
'i': [3, 3, 3, 3, 4],
}
And I would like to create hierarchical structure from this dictionary which will group items in the similar manner (exact structure does not matter, as well as the relation between elements is preserved):
/ \
/ \
e c
/\ /\
f g a b
/\ |
h i d
The hierarchy goes as follows: array g
is a prefix of array h
and i
and therefore it is their ancestor. But e
is a prefix of g
, so it e
is an ancestor of g
.
Here is my idea how to achieve this result.
s = sorted(a.items(), key=lambda e: len(e[1]))
. This will give me the following structure: .
('e', [3])
('c', [1, 2])
('g', [3, 3])
('f', [3, 7])
('a', [1, 2, 3])
('b', [1, 2, 4])
('d', [1, 2, 3, 4, 5])
('h', [3, 3, 3, 3, 3])
Right now I can find first parents by iterating through elements and checking if an element is a prefix of other elements. Starting with the first one. e
is a prefix of g
, f
, and h
. And c
is a prefix of a
, b
, d
. So these two elements are the parents.
right now I understand that I have to use recursion to enter inside of each parent and to perform the same operation, but I was not able to come up with a right solution.
So does anyone knows how to approach this problem. Or am I over-complicating things and there is an easier way to achieve the solution.
P.S. this is not a homework assignment or interview question (also it might be). This is just my abstraction from a problem I am trying to solve.
Upvotes: 5
Views: 3735
Reputation: 26121
Just sort arrays in lexicographical order:
(c,[1,2]),
(a,[1,2,3]),
(d,[1,2,3,4,5]),
(b,[1,2,4]),
(e,[3]),
(g,[3,3]),
(h,[3,3,3,3,3]),
(i,[3,3,3,3,4]),
(f,[3,7])
Then solution is pretty obvious.
root
Lc
|La
||Ld
|Lb
Le
Lg
|Lh
|Li
Lf
You need only track path form parent by prefix. From previous line. You will form somethink like stack. root
has empty set so push it on stack. c
has (empty) prefix as root so root
is parent of c
. Push c
on stack. a
has prefix which is c
on top of stack so c
is parent of a
. push a
on stack. d
has prefix same as a
on top of stack so a
is parent of d
and push on stack. b
doesn't have prefix d
on top of stack so pop. Same for a
then pop. Now there is c
which is prefix so b
has parent c
. Push b
on stack. And continue in same way.
In Erlang simply:
-module(tree_from_prefix).
-export([tree/1]).
is_prefix(_, []) -> true;
is_prefix([H|A], [H|B]) -> is_prefix(A, B);
is_prefix(_, _) -> false.
tree(L) ->
tree(lists:keysort(2, L), [{root, []}]).
tree([], _) -> [];
tree([{X, L} = Record|T] = List, [{Parent, Prefix}|R] = Stack) ->
case is_prefix(L, Prefix) of
true -> [{Parent, X}|tree(T, [Record|Stack])];
false -> tree(List, R)
end.
And result
1> tree_from_prefix:tree([{e,[3]},{c,[1, 2]},{g,[3, 3]},{f,[3, 7]},{a,[1, 2, 3]},{b, [1, 2, 4]},{d,[1, 2, 3, 4, 5]},{h,[3, 3, 3, 3, 3]},{i,[3, 3, 3, 3, 4]}]).
[{root,c},
{c,a},
{a,d},
{c,b},
{root,e},
{e,g},
{g,h},
{g,i},
{e,f}]
In python it will not be so elegant but same algorithm will work too.
Upvotes: 0
Reputation: 2198
Other people already give the methord, I just write some code here:
First sort:
t = sorted(a.items(), key=lambda x: x[1])
The build the structure
ret = {}
def build(ret, upv):
if not t:
return (None, None)
k, v = t.pop(0)
while k and v:
if upv and v[:len(upv)] != upv:
return (k, v)
r = {}
ret[k] = r
k, v = build(r, v)
return None, None
build(ret, None)
print ret
Upvotes: 1
Reputation: 104762
How about building the tree with a set of nested dictionaries, so that you'd access the e
node by tree[3]
and the h
node by tree[3][3][3][3][3]
:
from collections import nested
def nested():
return defaultdict(nested)
def build_tree(data):
tree = nested()
for name, path in data.items():
d = tree
for p in path:
d = d[p]
d["value"] = name
return tree
Example output:
>>> a = {
'a': [1, 2, 3],
'b': [1, 2, 4],
'c': [1, 2],
'd': [1, 2, 3, 4, 5],
'e': [3],
'f': [3, 7],
'g': [3, 3],
'h': [3, 3, 3, 3, 3],
'i': [3, 3, 3, 3, 4],
}
>>> import json # for pretty printing, note that in python the keys are ints, not str
>>> print(json.dumps(build_tree(a), indent=4))
{
"1": {
"2": {
"3": {
"4": {
"5": {
"value": "d"
}
},
"value": "a"
},
"4": {
"value": "b"
},
"value": "c"
}
},
"3": {
"7": {
"value": "f"
},
"3": {
"3": {
"3": {
"3": {
"value": "h"
},
"4": {
"value": "i"
}
}
},
"value": "g"
},
"value": "e"
}
}
Upvotes: 0
Reputation: 10329
given an object that has a list of children, and an is_prefix function, and your sorted list of objects, I don't see why this wouldn't work
for indx, potential_prefix in enumerate(your_list):
for potential_child in your_list[indx:]:
if is_prefix(potential_prefix, potential_child):
potential_prefix.add_child(potential_child)
# and optionally
potential_child.add_parent(potential_prefix)
Upvotes: 0