PlsWork

Reputation: 2168

Query nested BSON documents with mongo c++ driver

I have a bsoncxx::document::view bsonObjView and a std::vector<std::string> path that holds the keys leading to the value we are searching for in the BSON document (the first key is at the top level, the second key at depth 1, the third key at depth 2, etc.).

I'm trying to write a function that, given a path, will search the BSON document:

bsoncxx::document::element deepFieldAccess(bsoncxx::document::view bsonObj, const std::vector<std::string>& path) {

    assert (!path.empty());

    // for each key, find the corresponding value at the current depth, next keys will search the value (document) we found at the current depth
    for (auto const& currKey: path) {

        // get value for currKey
        bsonObj = bsonObj.find(currKey);

    }

    // for every key in the path we found a value at the appropriate level, we return the final value we found with the final key
    return bsonObj;
}

How can I make this function work? What type should bsonObj be to allow such searches within a loop? Also, how can I check whether a value for currKey has been found?

Also, is there a built-in bsoncxx way to do this?

Here is an example JSON document followed by some paths that point to values inside it. The final solution should return the corresponding value when given the path:

{
  "shopper": {
    "Id": "4973860941232342",
    "Context": {
      "CollapseOrderItems": false,
      "IsTest": false
    }
  },
  "SelfIdentifiersData": {
    "SelfIdentifierData": [
      {
        "SelfIdentifierType": {
          "SelfIdentifierType": "111"
        }
      },
      {
        "SelfIdentifierType": {
          "SelfIdentifierType": "2222"
        }
      }
    ]
  }
}

Example paths:

The path [ shopper -> Id -> targetValue ] points to the string "4973860941232342".

The path [ SelfIdentifiersData -> SelfIdentifierData -> array_idx: 0 -> targetValue ] points to the object { "SelfIdentifierType": { "SelfIdentifierType": "111" } }.

The path [ SelfIdentifiersData -> SelfIdentifierData -> array_idx: 0 -> SelfIdentifierType -> targetValue ] points to the object { "SelfIdentifierType": "111" }.

The path [ SelfIdentifiersData -> SelfIdentifierData -> array_idx: 0 -> SelfIdentifierType -> SelfIdentifierType -> targetValue ] points to the string "111".

Note that the paths are of the type std::vector<std::string> path. So the final solution should return the value that the path points to. It should work for arbitrary depths, and also for paths that point TO array elements (second example path) and THROUGH array elements (last 2 example paths). We assume that the key for an array element at index i is "i".

Update: Currently, the approach suggested by @acm fails for paths with array indices (paths without array indices work fine). Here is all the code to reproduce the issue:

#include <iostream>
#include <string>
#include <vector>

#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>



std::string turnQueryResultIntoString3(bsoncxx::document::element queryResult) {

    // check if no result for this query was found
    if (!queryResult) {
        return "[NO QUERY RESULT]";
    }

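    // workaround: wrap the value in a temporary document so to_json can print it,
    // then cut the wrapper key and the closing brace off the resulting string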
    bsoncxx::builder::basic::document basic_builder{};
    basic_builder.append(bsoncxx::builder::basic::kvp("Your Query Result is the following value ", queryResult.get_value()));

    std::string rawResult = bsoncxx::to_json(basic_builder.view());
    std::string frontPartRemoved = rawResult.substr(rawResult.find(":") + 2);
    std::string backPartRemoved = frontPartRemoved.substr(0, frontPartRemoved.size() - 2);

    return backPartRemoved;
}

// TODO this currently fails for paths with array indices
bsoncxx::document::element deepFieldAccess3(bsoncxx::document::view bsonObj, const std::vector<std::string>& path) {

    if (path.empty())
        return {};

    auto keysIter = path.begin();
    const auto keysEnd = path.end();

    std::string currKey = *keysIter;    // for debug purposes
    std::cout << "Current key: " << currKey;

    auto currElement = bsonObj[*(keysIter++)];

    std::string currElementAsString = turnQueryResultIntoString3(currElement);  // for debug purposes
    std::cout << "    Query result for this key: " << currElementAsString << std::endl;


    while (currElement && (keysIter != keysEnd)) {
        currKey = *keysIter;
        std::cout << "Current key: " << currKey;

        currElement = currElement[*(keysIter++)];

        currElementAsString = turnQueryResultIntoString3(currElement);
        std::cout << "    Query result for this key: " << currElementAsString << std::endl;
    }

    return currElement;
}

// execute this function to see that queries with array indices fail
void reproduceIssue() {

    std::string testJson = "{\n"
                           "  \"shopper\": {\n"
                           "    \"Id\": \"4973860941232342\",\n"
                           "    \"Context\": {\n"
                           "      \"CollapseOrderItems\": false,\n"
                           "      \"IsTest\": false\n"
                           "    }\n"
                           "  },\n"
                           "  \"SelfIdentifiersData\": {\n"
                           "    \"SelfIdentifierData\": [\n"
                           "      {\n"
                           "        \"SelfIdentifierType\": {\n"
                           "          \"SelfIdentifierType\": \"111\"\n"
                           "        }\n"
                           "      },\n"
                           "      {\n"
                           "        \"SelfIdentifierType\": {\n"
                           "          \"SelfIdentifierType\": \"2222\"\n"
                           "        }\n"
                           "      }\n"
                           "    ]\n"
                           "  }\n"
                           "}";

    // create bson object
    bsoncxx::document::value bsonObj = bsoncxx::from_json(testJson);
    bsoncxx::document::view bsonObjView = bsonObj.view();

    // example query which contains an array index, this fails. Expected query result is "111"
    std::vector<std::string> currQuery = {"SelfIdentifiersData", "SelfIdentifierData", "0", "SelfIdentifierType", "SelfIdentifierType"};

    // an example query without array indices, this works. Expected query result is "false"
    //std::vector<std::string> currQuery = {"shopper", "Context", "CollapseOrderItems"};

    bsoncxx::document::element queryResult = deepFieldAccess3(bsonObjView, currQuery);

    std::cout << "\n\nGiven query and its result: [ ";
    for (auto i: currQuery)
        std::cout << i << ' ';

    std::cout << "] -> " << turnQueryResultIntoString3(queryResult) << std::endl;
}

Upvotes: 1

Views: 1048

Answers (1)

acm

Reputation: 12727

There is no built-in way to do this, so you will need to write a helper function like the one you outline above.

I believe the issue you are encountering is that the argument to the function is a bsoncxx::document::view, but the return value of view::find is a bsoncxx::document::element. So you need to account for the change of type somewhere in the loop.

I think I would write the function this way:

bsoncxx::document::element deepFieldAccess(bsoncxx::document::view bsonObj, const std::vector<std::string>& path) {

    if (path.empty())
        return {};

    auto keysIter = path.begin();
    const auto keysEnd = path.end();

    auto currElement = bsonObj[*(keysIter++)];
    while (currElement && (keysIter != keysEnd))
        currElement = currElement[*(keysIter++)];

    return currElement;
}

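Note that this will return an invalid bsoncxx::document::element if any part of the path is not found, or if the path attempts to traverse into a value that is not actually a BSON document. As the update to the question reports, paths that pass through an array index also come back invalid with this version, so array elements need separate handling.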
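
For the array-index paths described in the question's update, one option is to branch on the kind of value at each step: look a key up by name when the current value is a document, and parse the key as a position when it is an array. The sketch below shows only that traversal logic on a hypothetical Node type. Node, parseIndex, makeString, makeDocument, and makeArray are invented for illustration and are not part of bsoncxx, so mapping this back onto the driver's element and array types is an assumption left to the reader.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a BSON value; NOT a bsoncxx type.
struct Node {
    enum class Kind { String, Document, Array };
    Kind kind = Kind::String;
    std::string text;                                  // payload when kind == String
    std::vector<std::pair<std::string, Node>> fields;  // members when kind == Document
    std::vector<Node> items;                           // elements when kind == Array
};

// Small builders (also hypothetical) so that test data is easy to write down.
Node makeString(std::string s) {
    Node n;
    n.kind = Node::Kind::String;
    n.text = std::move(s);
    return n;
}

Node makeDocument(std::vector<std::pair<std::string, Node>> fields) {
    Node n;
    n.kind = Node::Kind::Document;
    n.fields = std::move(fields);
    return n;
}

Node makeArray(std::vector<Node> items) {
    Node n;
    n.kind = Node::Kind::Array;
    n.items = std::move(items);
    return n;
}

// Parses a key such as "0" as an array index. Returns false for an empty key,
// for any non-digit character, and for keys longer than 9 digits (overflow guard).
bool parseIndex(const std::string& key, std::size_t& index) {
    if (key.empty() || key.size() > 9)
        return false;
    std::size_t result = 0;
    for (char c : key) {
        if (c < '0' || c > '9')
            return false;
        result = result * 10 + static_cast<std::size_t>(c - '0');
    }
    index = result;
    return true;
}

// Follows the path one key at a time. Returns nullptr if the path is empty,
// a key is missing, an index is out of range, or the path tries to descend
// into a string value.
const Node* deepFieldAccess(const Node& root, const std::vector<std::string>& path) {
    if (path.empty())
        return nullptr;

    const Node* curr = &root;
    for (const std::string& key : path) {
        const Node* next = nullptr;
        if (curr->kind == Node::Kind::Document) {
            // Documents: linear search for the first field with this name.
            for (const auto& field : curr->fields) {
                if (field.first == key) {
                    next = &field.second;
                    break;
                }
            }
        } else if (curr->kind == Node::Kind::Array) {
            // Arrays: the key must be a decimal position inside the array.
            std::size_t index = 0;
            if (parseIndex(key, index) && index < curr->items.size())
                next = &curr->items[index];
        }
        if (next == nullptr)
            return nullptr;
        curr = next;
    }
    return curr;
}
```

With the example document from the question rebuilt out of these helpers, the path { "SelfIdentifiersData", "SelfIdentifierData", "0", "SelfIdentifierType", "SelfIdentifierType" } resolves to the string node holding "111", while a missing key or an out-of-range index gives nullptr.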

Upvotes: 1
