filippo.burne
filippo.burne

Reputation: 21

Mongo DB aggregation pipeline: convert string to document/object

Have a field of type "String" that contain representation of an object/document

" {"a":35,b:[1,2,3,4]}"

I know is a strange construct but i can't change it. my goal would be to extract for example the value of "a". As the document represented by the string are nested and repeated a regex doesnt fit. So how can i convert in a mongo db aggregation/query this String to object so that i can process it in a following aggregation step? (could extract string with python make a dict and extract infos, but i'd like to stay inside the aggregation pipeline and so having better performance)

Upvotes: 1

Views: 9854

Answers (3)

mark d drake
mark d drake

Reputation: 1505

In 4.4 this works

db.target.aggregate([{$project: { 
                         X: "$AD_GRAPHIC",
                         Y : {
                           $function : {
                               body: function(jsonString) {
                                 return JSON.parse(jsonString)
                               },
                               args: [ "$AD_GRAPHIC"],
                               lang: "js"
                           }
                         }
                     }
                    }])

Basically use the $function operator to invoke the JSON parser. (assumes you have enabled Javascript)

Results

{ "_id" : ObjectId("60093dc8f2c829000e38a8d0"), "X" : "{\"alias\":\"MEDIA_DIR\",\"path\":\"modem.jpg\"}", "Y" : { "alias" : "MEDIA_DIR", "path" : "modem.jpg" } }
{ "_id" : ObjectId("60093dc8f2c829000e38a8d1"), "X" : "{\"alias\":\"MEDIA_DIR\",\"path\":\"monitor.jpg\"}", "Y" : { "alias" : "MEDIA_DIR", "path" : "monitor.jpg" } }
{ "_id" : ObjectId("60093dc8f2c829000e38a8d2"), "X" : "{\"alias\":\"MEDIA_DIR\",\"path\":\"mousepad.jpg\"}", "Y" : { "alias" : "MEDIA_DIR", "path" : "mousepad.jpg" } }
{ "_id" : ObjectId("60093dc8f2c829000e38a8d3"), "X" : "{\"alias\":\"MEDIA_DIR\",\"path\":\"keyboard.jpg\"}", "Y" : { "alias" : "MEDIA_DIR", "path" : "keyboard.jpg" } }
>

Upvotes: 11

Kevin Smith
Kevin Smith

Reputation: 14436

There's no native way in the MongoDB engine to parse a blob of JSON from a field. However, I'd recommend just doing it client-side in your language of choice and then if required save it back.

Alternatively, if your data is too big and still needs to aggregate it you could use regex and project out the required fields from the JSON to then use them later to filter etc...

For example if we insert the following document:

> db.test.insertOne({ name: 'test', blob: '{"a":35,b:[1,2,3,4]}' })
{
        "acknowledged" : true,
        "insertedId" : ObjectId("5ed9fe21b5d91941c9e85cdb")
}

We can then just project out the array with some regex:

db.test.aggregate([
   { $addFields: { b: { $regexFind: { input: "$blob", regex: /\[(((\d+,*))+)\]/ }  } } },
   { $addFields: { b: { $split: [ { $arrayElemAt: [ "$b.captures", 0 ] }, "," ] } } }
]);

{
        "_id" : ObjectId("5ed9fe21b5d91941c9e85cdb"),
        "name" : "test",
        "blob" : "{\"a\":35,b:[1,2,3,4]}",
        "b" : [
                "1",
                "2",
                "3",
                "4"
        ]
}

This means we can do some filtering, sorting and any of the other aggregation stages.

Upvotes: 0

Konrad Klockgether
Konrad Klockgether

Reputation: 405

You could just use JSON.parse()

For example

db.getCollection('system').find({
    a: JSON.parse('{"a":35,b:[1,2,3,4]}').a
})

Upvotes: -2

Related Questions