Reputation: 4322
I know this has been asked before and I've seen the answers but still can't figure out what is happening.
I'm trying to conditionally build folder structures based on certain metadata of files (dates and locations) and a set of conditions. For example, for testing I'm using these:
COND = ["Y", "m", "C"]
This means the folder structure should split the files first by year, then by calendar month, then by country of origin.
This is the example data I created for testing:
import datetime as dt

data = [
    ["111", dt.datetime(2019, 1, 1), "Aus", "Bri"],
    ["112", dt.datetime(2019, 1, 5), "Aus", "Bri"],
    ["113", dt.datetime(2019, 2, 10), "Aus", "Mel"],
    ["114", dt.datetime(2020, 1, 1), "Aus", "Per"],
    ["115", dt.datetime(2020, 1, 10), "Aus", "Per"],
    ["116", dt.datetime(2020, 1, 25), "Aus", "Per"],
    ["117", dt.datetime(2020, 10, 5), "My", "KL"],
    ["118", dt.datetime(2020, 11, 6), "Ru", "Led"],
    ["119", dt.datetime(2020, 12, 1), "Ru", "Mos"],
    ["120", dt.datetime(2021, 3, 5), "Aus", "Syd"],
    ["121", dt.datetime(2021, 5, 1), "Aus", "Mel"],
    ["122", dt.datetime(2021, 6, 1), "Aus", "Per"],
    ["123", dt.datetime(2021, 11, 1), "Chi", "Bei"],
    ["124", dt.datetime(2021, 11, 15), "Jp", "Tok"],
    ["125", dt.datetime(2022, 1, 1), "Aus", "Per"],
    ["126", dt.datetime(2022, 3, 1), "Aus", "Bri"],
    ["127", dt.datetime(2022, 3, 5), "Aus", "Per"],
    ["128", dt.datetime(2022, 3, 11), "My", "KL"],
    ["129", dt.datetime(2022, 5, 1), "Aus", "Syd"],
    ["130", dt.datetime(2022, 8, 8), "Aus", "Bri"],
]
And these simple functions perform filtering:
def filter_year(data: list[list[str | dt.datetime]]) -> set[int]:
    return {i[1].year for i in data}

def filter_month(data: list[list[str | dt.datetime]]) -> set[int]:
    return {i[1].month for i in data}

def filter_day(data: list[list[str | dt.datetime]]) -> set[int]:
    return {i[1].day for i in data}

def filter_country(data: list[list[str | dt.datetime]]) -> set[str]:
    return {i[2] for i in data}

def filter_city(data: list[list[str | dt.datetime]]) -> set[str]:
    return {i[3] for i in data}

# 'id' is the index of the relevant field within each data row
condition_dict = {
    "Y": {'fun': filter_year, 'id': 1},
    "m": {'fun': filter_month, 'id': 1},
    "d": {'fun': filter_day, 'id': 1},
    "C": {'fun': filter_country, 'id': 2},
    "c": {'fun': filter_city, 'id': 3},
}
I'm trying to build the structure automatically using a tree of arbitrary order. Splitting the data at each Node works correctly:
from typing import Any
from pathlib import Path
from dataclasses import dataclass, field

@dataclass
class Node:
    folder: Path
    metadata: list[list[Any]] = field(default_factory=list)
    conditions: list[str] = field(default_factory=list)

    @property
    def children(self) -> list['Node']:
        if len(self.conditions) == 0:
            return []
        current_condition = self.conditions[0]
        fun = condition_dict[current_condition]['fun']
        fnames: set[int | str] = fun(self.metadata)
        children_data = {str(n): {} for n in fnames}
        for f in fnames:
            children_data[str(f)]['folder'] = self.folder / str(f)
            children_data[str(f)]['conditions'] = self.conditions[1:]
            if current_condition == 'Y':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[1].year == f]
            elif current_condition == 'm':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[1].month == f]
            elif current_condition == 'd':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[1].day == f]
            elif current_condition == 'C':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[2] == f]
            elif current_condition == 'c':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[3] == f]
        return [Node(**i) for i in children_data.values()]
Now I'm trying to traverse the tree, for which I used a modified version of the answer here (Traverse Non-Binary Tree):
@dataclass
class Tree:
    def traverse(self, root: Node):
        r = root.children
        if not r or len(root.conditions) == 0:
            print('The end of subtree:', root.folder)
        else:
            for child in r:
                print('\n'.join(str(i.folder) for i in r))
                if isinstance(child, Node):
                    for x in self.traverse(child):
                        print(str(x.folder))
                else:
                    print(child)
But when I try it with my data, after a few correct outputs I always run into the error 'NoneType' object is not iterable:
n = Node(folder=Path('/home'), metadata=data, conditions=COND)
tree = Tree()
tree.traverse(n)
Output:
/home/2019
/home/2020
/home/2021
/home/2022
/home/2019/1
/home/2019/2
/home/2019/1/Aus
The end of subtree: /home/2019/1/Aus
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in <cell line: 4>()
1 n = Node(folder=Path('/home'), metadata=data, conditions=COND)
3 tree = Tree()
----> 4 tree.traverse(n)
/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in Tree.traverse(self, root)
45 print('\n'.join(str(i.folder) for i in r))
46 if isinstance(child, Node):
---> 47 for x in self.traverse(child):
48 print(str(x.folder))
49 else:
/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in Tree.traverse(self, root)
45 print('\n'.join(str(i.folder) for i in r))
46 if isinstance(child, Node):
---> 47 for x in self.traverse(child):
48 print(str(x.folder))
49 else:
/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in Tree.traverse(self, root)
45 print('\n'.join(str(i.folder) for i in r))
46 if isinstance(child, Node):
---> 47 for x in self.traverse(child):
48 print(str(x.folder))
49 else:
TypeError: 'NoneType' object is not iterable
I don't understand why this is happening as I believe I guarded against NoneType. For some reason I'm only getting to the end of one subtree but not traversing the others. What am I doing wrong here?
Upvotes: 0
Views: 130
Reputation: 350272
I didn't really follow the whole story, but the error you get on this line is expected:
for x in self.traverse(child):
The thing is that self.traverse doesn't have a return statement, so this recursive call returns None, and for x in None makes no sense.
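To see this in isolation, here is a minimal sketch that reproduces the same error outside the tree code. The function name traverse_without_return is made up for illustration and is not part of the question's code:

```python
def traverse_without_return(items):
    # Prints each item but has no return statement,
    # so calling it evaluates to None.
    for item in items:
        print(item)


result = traverse_without_return(["a", "b"])
print(result is None)

try:
    # Same shape as `for x in self.traverse(child)` in the question.
    for x in result:
        pass
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The guard in the question checks root.children, not the value that traverse itself returns, which is why it does not protect the for loop.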
I think you actually don't want to get some x values from that recursive call, since that recursive call takes care of its own business. There is no need to print again what is already printed by that recursive call.
There is a second issue here:
for child in r:
    print('\n'.join(str(i.folder) for i in r))
Here, for each child in r, you iterate r again in the print call. That will just print duplicates. You need to print only the current child from r. And that makes the else block below it obsolete: when you have just printed child.folder, it seems unnecessary to print child again.
So correcting both issues, the following at least runs without error:
@dataclass
class Tree:
    def traverse(self, root: Node):
        r = root.children
        if not r or len(root.conditions) == 0:
            print('The end of subtree:', root.folder)
        else:
            for child in r:
                print(str(child.folder))
                if isinstance(child, Node):
                    self.traverse(child)
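If you did want to iterate over what the recursive call produces, which is what the original for x in self.traverse(child) suggests, one option is to turn the traversal into a generator. The following is only a sketch under assumptions: SimpleNode and walk are hypothetical names, and SimpleNode stores its children directly instead of computing them from metadata and conditions like the question's Node does.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Iterator


@dataclass
class SimpleNode:
    # Hypothetical stand-in for the question's Node: children are stored
    # directly rather than derived from metadata and conditions.
    folder: PurePosixPath
    children: list["SimpleNode"] = field(default_factory=list)


def walk(root: SimpleNode) -> Iterator[SimpleNode]:
    """Yield every descendant of root in depth-first pre-order."""
    for child in root.children:
        yield child
        # `yield from` forwards everything the recursive call yields,
        # so the caller always receives an iterable, never None.
        yield from walk(child)


root = SimpleNode(PurePosixPath("/home"), [
    SimpleNode(PurePosixPath("/home/2019"), [
        SimpleNode(PurePosixPath("/home/2019/1")),
        SimpleNode(PurePosixPath("/home/2019/2")),
    ]),
    SimpleNode(PurePosixPath("/home/2020")),
])

visited = [str(node.folder) for node in walk(root)]
print("\n".join(visited))
```

With this shape the caller decides what to do with each node (print it, create the folder, and so on), while the traversal itself only yields.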
Upvotes: 1