Reputation: 191
I have a list of directories extracted from os.walk. I removed the files because I don't need them.
.
|____A
| |____G
| |____H
| | |____K
| | |____L
|____B
| |____I
| |____J
|____C
|____D
|____E
|____F
| |____M
So it looks like this:
['.', ['A', 'B', 'C', 'D', 'E', 'F']], ['A', ['G', 'H']], ['A\\G', []], ['A\\H', ['K', 'L']], ['A\\G\\K', []], ['A\\G\\L', []], ['B', ['I', 'J']], ['B\\I', []], ['B\\J', []], ['C', []], ['D', []], ['E', []], ['F', ['M']], ['F\\M', []]
What I actually need is a real representation of the tree structure in a list like so:
['.', ['A' ['G', 'H' ['K', 'L']], ['B' ['I', 'J']], 'C', 'D', 'E', 'F' ['M']]
ty ;)
Upvotes: 2
Views: 721
Reputation: 46899
This does not return the data type you are looking for, but a nested dictionary, which feels more natural to me as a tree structure:
from collections import defaultdict

lst = (['.', ['A', 'B', 'C', 'D', 'E', 'F']],
       ['A', ['G', 'H']], ['A\\G', []],
       ['A\\H', ['K', 'L']], ['A\\G\\K', []], ['A\\G\\L', []],
       ['B', ['I', 'J']], ['B\\I', []], ['B\\J', []], ['C', []],
       ['D', []], ['E', []], ['F', ['M']], ['F\\M', []])


def rec_dd():
    """Recursive default dict."""
    return defaultdict(rec_dd)


tree = rec_dd()

for here, dirs in lst:
    # Every path except '.' itself hangs below the root node.
    if not here.startswith('.'):
        cur_tree = tree['.']
    else:
        cur_tree = tree
    # Walk down to the node for this path, creating levels as needed.
    for key in here.split('\\'):
        cur_tree = cur_tree[key]
    # Register each subdirectory as an (initially empty) child node.
    for d in dirs:
        cur_tree[d] = rec_dd()
You can pretty-print it this way:
import json
print(json.dumps(tree, sort_keys=True, indent=4))
And the result is:
{
    ".": {
        "A": {
            "G": {
                "K": {},
                "L": {}
            },
            "H": {
                "K": {},
                "L": {}
            }
        },
        "B": {
            "I": {},
            "J": {}
        },
        "C": {},
        "D": {},
        "E": {},
        "F": {
            "M": {}
        }
    }
}
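If the nested-list form from the question is still needed, the dictionary can be converted afterwards. The following is only a sketch: `to_nested_list` is a hypothetical helper, not part of the code above, and it assumes one possible reading of the requested format, in which a directory without subdirectories becomes a plain string and any other directory becomes `[name, [children]]`.

```python
def to_nested_list(name, subtree):
    """Convert one node of a nested dict into a nested list.

    Assumed format: a leaf (empty dict) is returned as its name,
    any other node as [name, [converted children]].
    """
    if not subtree:
        return name
    # Sort the keys so the result does not depend on insertion order.
    children = [to_nested_list(child, subtree[child])
                for child in sorted(subtree)]
    return [name, children]


# A small hand-written tree with the same shape as the nested dict above.
example = {'.': {'A': {'G': {}, 'H': {'K': {}, 'L': {}}}, 'B': {}}}
result = to_nested_list('.', example['.'])
print(result)
# ['.', [['A', ['G', ['H', ['K', 'L']]]], 'B']]
```

The same call should also work on the `defaultdict` tree, for example `to_nested_list('.', tree['.'])`, since it only reads keys that already exist.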
Upvotes: 1
Reputation: 71461
You can construct a dictionary from the flattened values, and then use recursion:
import re
d = ['.', ['A', 'B', 'C', 'D', 'E', 'F']], ['A', ['G', 'H']], ['A\\G', []], ['A\\H', ['K', 'L']], ['A\\G\\K', []], ['A\\G\\L', []], ['B', ['I', 'J']], ['B\\I', []], ['B\\J', []], ['C', []], ['D', []], ['E', []], ['F', ['M']], ['F\\M', []]
new_d = {re.findall('.$', a)[0]:b for a, b in d}
def _tree(_start):
    if not new_d[_start]:
        return _start
    _c = [_tree(i) for i in new_d[_start]]
    return [_start, *(_c if any(not isinstance(i, str) for i in _c) else [_c])]
print(_tree('.'))
Output:
['.', ['A', 'G', ['H', ['K', 'L']]], ['B', ['I', 'J']], 'C', 'D', 'E', ['F', ['M']]]
Edit: Python 2 version:
import re
d = ['.', ['A', 'B', 'C', 'D', 'E', 'F']], ['A', ['G', 'H']], ['A\\G', []], ['A\\H', ['K', 'L']], ['A\\G\\K', []], ['A\\G\\L', []], ['B', ['I', 'J']], ['B\\I', []], ['B\\J', []], ['C', []], ['D', []], ['E', []], ['F', ['M']], ['F\\M', []]
new_d = {re.findall('.$', a)[0]:b for a, b in d}
def _tree(_start):
    if not new_d[_start]:
        return _start
    _c = [_tree(i) for i in new_d[_start]]
    return [_start]+(_c if any(not isinstance(i, str) for i in _c) else [_c])
print(_tree('.'))
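One caveat: `re.findall('.$', a)[0]` keeps only the last character of each path, so the lookup relies on directory names being single characters and unique across the whole tree. A possible variation, sketched below, keys the dictionary on the full path instead. The names `walk`, `children_of` and `build` are made up for this illustration, the sample data is invented, and the output uses a uniform `[name, [children]]` shape rather than the exact format shown above.

```python
# Invented sample data in the same [path, subdirs] layout as the question,
# with multi-character names and a repeated name ('util').
walk = [['.', ['src', 'docs']], ['src', ['util', 'tests']],
        ['src\\util', []], ['src\\tests', ['util']],
        ['src\\tests\\util', []], ['docs', []]]

# Key on the full path so equal names in different directories do not collide.
children_of = {path: dirs for path, dirs in walk}


def build(path):
    """Return the name for a leaf, else [name, [subtrees]]."""
    name = path.rsplit('\\', 1)[-1]
    subdirs = children_of.get(path, [])
    if not subdirs:
        return name
    # Children of the root '.' are stored without a '.\\' prefix in this data.
    prefix = '' if path == '.' else path + '\\'
    return [name, [build(prefix + d) for d in subdirs]]


result = build('.')
print(result)
# ['.', [['src', ['util', ['tests', ['util']]]], 'docs']]
```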
Upvotes: 3