hw-135

Reputation: 172

Query nested JSON document in MongoDB collection using Python

I have a MongoDB collection containing multiple documents. A document looks like this:

{
    'name': 'sys',
    'type': 'system',
    'path': 'sys',
    'children': [{
        'name': 'folder1',
        'type': 'folder',
        'path': 'sys/folder1',
        'children': [{
            'name': 'folder2',
            'type': 'folder',
            'path': 'sys/folder1/folder2',
            'children': [{
                'name': 'textf1.txt',
                'type': 'file',
                'path': 'sys/folder1/folder2/textf1.txt',
                'children': ['abc', 'def']
            }, {
                'name': 'textf2.txt',
                'type': 'file',
                'path': 'sys/folder1/folder2/textf2.txt',
                'children': ['a', 'b', 'c']
            }]
        }, {
            'name': 'text1.txt',
            'type': 'file',
            'path': 'sys/folder1/text1.txt',
            'children': ['aaa', 'bbb', 'ccc']
        }]
    }],
    '_id': ObjectId('5d1211ead866fc19ccdf0c77')
}

Other documents in the collection have a similar structure. How can I query the collection so that it returns only the part of a document whose path matches sys/folder1/text1.txt?

My desired output would be:

{
   'name': 'text1.txt',
   'type': 'file',
   'path': 'sys/folder1/text1.txt',
   'children': ['aaa', 'bbb', 'ccc']
 }

EDIT: What I have come up with so far is this. My Flask endpoint:

import json
from flask import request

# Resource, mongo and JSONEncoder come from the surrounding app (not shown)
class ExecuteQuery(Resource):
    def get(self, collection_name):
        # Use the JSON request body as one filter document. Unpacking several
        # dicts into find() would pass the second one as the projection.
        query = dict(request.json)
        cursor = mongo.db[collection_name].find(query)  # Execute query
        result_list = []  # List to store query results
        for document in cursor:
            encoded_data = JSONEncoder().encode(document)  # Encode the document as a JSON string
            result_list.append(json.loads(encoded_data))  # Decode back into a JSON-safe dict
        return result_list  # Return query result to client

My client side:

request = {"name": "sys"}
response = requests.get(url, json=request, headers=headers) 
print(response.text)

This gives me the entire document, but I cannot extract a specific part of it by matching the path.

Upvotes: 1

Views: 1256

Answers (1)

Markus Rother

Reputation: 434

I don't think MongoDB supports recursive or deep queries within a document (nor a recursive $unwind). What it does provide, however, is recursive queries across documents that reference one another, i.e. aggregating elements from a graph ($graphLookup).
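If restructuring the data is not an option, one workaround is to do the deep search in the application: fetch the root document and walk it recursively in Python. The following is a minimal sketch, assuming every nested node is a dict with 'path' and 'children' keys as in the question. find_by_path is a hypothetical helper name, and the sample document is abbreviated from the question.

```python
def find_by_path(node, target_path):
    """Depth-first search for the sub-document whose 'path' equals target_path."""
    if not isinstance(node, dict):
        return None  # leaf entries such as 'aaa' are plain strings, not nodes
    if node.get('path') == target_path:
        return node
    for child in node.get('children', []):
        found = find_by_path(child, target_path)
        if found is not None:
            return found
    return None


# Abbreviated sample; in the real app this would be the document returned by
# something like mongo.db[collection_name].find_one({'name': 'sys'}).
doc = {
    'name': 'sys', 'type': 'system', 'path': 'sys',
    'children': [{
        'name': 'folder1', 'type': 'folder', 'path': 'sys/folder1',
        'children': [
            {'name': 'folder2', 'type': 'folder',
             'path': 'sys/folder1/folder2', 'children': []},
            {'name': 'text1.txt', 'type': 'file',
             'path': 'sys/folder1/text1.txt',
             'children': ['aaa', 'bbb', 'ccc']},
        ],
    }],
}

match = find_by_path(doc, 'sys/folder1/text1.txt')
print(match)
```

This still transfers the whole document from the server, so it only makes sense for trees of moderate size.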

This answer explains pretty well what you need to do to query a tree.

Although it does not directly address your problem, you may want to reevaluate your data structure. It certainly is intuitive, but updates can be painful, and so can queries for nested elements, as you just noticed.

Since $graphLookup allows you to create a view equal to your current document, I cannot think of any advantage the explicitly nested structure has over one document per path. There will be a slight performance loss when reading or writing the entire tree, but with proper indexing it should be OK.
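To illustrate the one-document-per-path idea, here is a rough sketch that flattens a nested tree into flat documents and builds a $graphLookup pipeline to collect a node's descendants again. The 'parent' field, the 'nodes' collection name, and the flatten helper are assumptions made for this example, not something from the question. The tree is abbreviated.

```python
def flatten(node, parent=None):
    """Yield one flat document per tree node, linked to its parent by path."""
    children = node.get('children', [])
    yield {
        'name': node['name'],
        'type': node['type'],
        'path': node['path'],
        'parent': parent,  # assumed field: path of the parent, None for the root
        # keep only plain (non-dict) entries, e.g. the strings stored under a file
        'children': [c for c in children if not isinstance(c, dict)],
    }
    for child in children:
        if isinstance(child, dict):
            yield from flatten(child, node['path'])


tree = {
    'name': 'sys', 'type': 'system', 'path': 'sys',
    'children': [{
        'name': 'folder1', 'type': 'folder', 'path': 'sys/folder1',
        'children': [{
            'name': 'text1.txt', 'type': 'file',
            'path': 'sys/folder1/text1.txt',
            'children': ['aaa', 'bbb', 'ccc'],
        }],
    }],
}

flat_docs = list(flatten(tree))

# With pymongo, and 'nodes' as an assumed collection name, this could become:
#   db.nodes.insert_many(flat_docs)
#   db.nodes.create_index('path')
#   db.nodes.create_index('parent')
#   db.nodes.find_one({'path': 'sys/folder1/text1.txt'})  # direct lookup by path

# Aggregation pipeline that gathers every descendant of the root node 'sys'.
pipeline = [
    {'$match': {'path': 'sys'}},
    {'$graphLookup': {
        'from': 'nodes',             # collection to search recursively
        'startWith': '$path',        # begin with the root's own path
        'connectFromField': 'path',  # follow each found node's path ...
        'connectToField': 'parent',  # ... to the nodes naming it as parent
        'as': 'descendants',
    }},
]
# descendants = list(db.nodes.aggregate(pipeline))

for d in flat_docs:
    print(d['path'], '<-', d['parent'])
```

With this layout the lookup from the question becomes a plain equality match on 'path', and $graphLookup is only needed when the whole subtree has to be reassembled.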

Upvotes: 2
