Reputation: 2329
I'm developing an application with a hierarchical data structure in django-rest-framework
using django-treebeard
. My (simplified) main model looks like this
class Task(MP_Node):
name = models.CharField(_('name'), max_length=64)
started = models.BooleanField(default=True)
What I'm currently trying to achieve is a list view of all root nodes which shows extra fields (such as whether all children have started). To do this I specified a view:
class TaskViewSet(viewsets.ViewSet):
def retrieve(self, request, pk=None):
queryset = Task.get_tree().filter(depth=1, job__isnull=True)
operation = get_object_or_404(queryset, pk=pk)
serializer = TaskSerializer(operation)
return Response(serializer.data)
and serializer
class TaskSerializer(serializers.ModelSerializer):
are_children_started = serializers.SerializerMethodField()
def get_are_children_started(self, obj):
return all(task.started for task in Task.get_tree(obj))
This all works and I get the expected results. However, I run into a N+1 query problem where for each root task I need to fetch all children separately. Normally this would be solvable using prefetch_related
but as I use the Materialized Path structure from django-treebeard
there are no Django relationships between the task models, so prefetch_related
doesn't know what to do out of the box. I've tried to use custom Prefetch objects but as this still requires a Django relation path I could not get it to work.
My current idea is to extend the Task model with a foreign key pointing to its root node like such:
root_node = models.ForeignKey('self', null=True,
related_name='descendant_tasks',
verbose_name=_('root task')
)
in order to make the MP relationship explicit so it can be queried. However, this does feel like a bit of a non-dry method of doing it so I wonder whether anyone has another suggestion on how to tackle it.
Upvotes: 3
Views: 3259
Reputation: 5644
If you want to avoid using a Foreign Key you can iterate over the queryset and re-create the tree structure in memory.
In my case I wanted to have a template tag (much like django-mptt's recursetree
templatetag) to show multiple levels of nested pages with only one database query. Basically copying mptt.utils.get_cached_trees
I ended up with this:
def get_cached_trees(queryset: QuerySet) -> list:
"""Return top-most pages / roots.
Each page will have its children stored in `_cached_children` attribute
and its parent in `_cached_parent`. This avoids having to query the database.
"""
top_nodes: list = []
path: list = []
for obj in queryset:
obj._cached_children = []
if obj.depth == queryset[0].depth:
add_top_node(obj, top_nodes, path)
else:
while not is_child_of(obj, parent := path[-1]):
path.pop()
add_child(parent, obj)
if obj.numchild:
path.append(obj)
return top_nodes
def add_top_node(obj: MP_Node, top_nodes: list, path: list) -> None:
top_nodes.append(obj)
path.clear()
def add_child(parent: MP_Node, obj: MP_Node) -> None:
obj._cached_parent = parent
parent._cached_children.append(obj)
def is_child_of(child: MP_Node, parent: MP_Node) -> bool:
"""Return whether `child` is a sub page of `parent` without database query.
`_get_children_path_interval` is an internal method of MP_Node.
"""
start, end = parent._get_children_path_interval(parent.path)
return start < child.path < end
It can be used like this to avoid the dreaded N+1 query problem:
for page in get_cached_trees(queryset):
for child in page._cached_children:
...
Upvotes: 1
Reputation: 2329
In the end I did end up with adding a foreign key to each task pointing to its root node like such:
root_node = models.ForeignKey('self', null=True,
related_name='descendant_tasks',
verbose_name=_('root task')
)
I updated my save method on my Task model to make sure I always point to the correct root node
def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
try:
self.root_task = self.get_root()
except ObjectDoesNotExist:
self.root_task = None
return super(Task, self).save(force_insert=False, force_update=False, using=None,
update_fields=None
)
and this allows me to simply prefetch all descendants using prefetch_related('descendants')
.
Whenever I need to have the descendants in a nested fashion I use the following function to nest the flattened list of descendants again
def build_nested(tasks):
def get_basepath(path, depth):
return path[0:depth * Task.steplen]
container, link = [], {}
for task in sorted(tasks, key=attrgetter('depth')):
depth = int(len(task.path) / Task.steplen)
try:
parent_path = get_basepath(task.path, depth - 1)
parent_obj = link[parent_path]
if not hasattr(parent_obj, 'sub_tasks'):
parent_obj.sub_tasks = []
parent_obj.sub_tasks.append(task)
except KeyError: # Append it as root task if no parent exists
container.append(task)
link[task.path] = task
return container
Upvotes: 3