virupaksha

Reputation: 373

Extract sub object in array of a document with pymongo

I have multiple documents and each document has a set of tweets. I can find the document by name as follows:

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client['sample_app']
s = db['s']
s.find(
    {
        "name": "temp16"
    }
)

When I run the above query I get the following data:

{"_id": {"$oid": "5e57db66c6bb04eb902589a2"}, "name": "temp16", "tweets": [{"tweet_id": "1234762637361086465", "tweet_text": "Had an extensive review regarding preparedness on the COVID-19 Novel Coronavirus. Different ministries & states are working together, from screening people arriving in India to providing prompt medical attention.", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E"]}, {"tweet_text": "There is no need to panic. We need to work together, take small yet important measures to ensure self-protection.", "tweet_id": "1234762662413660165", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E", "F"]}]}

My intention is to get the tweet with id "1234762662413660165" in this document alone. So I try the following:

s.find(
    {
        "name": "temp16",
        "tweets": {"tweet_id": "1234762662413660165"}
    }
)

However, I get None.

What am I doing wrong?

Upvotes: 1

Views: 191

Answers (2)

Dĵ ΝιΓΞΗΛψΚ

Reputation: 5669

Here are two ways of doing it using aggregation pipelines:

db.collection.aggregate([
    { $match: { name: 'temp16' } },
    { $unwind: '$tweets' },
    { $match: { 'tweets.tweet_id': '1234762662413660165' } },
    { $replaceWith: '$tweets' }
])

db.collection.aggregate([
    { $match: { name: 'temp16' } },
    {
        $replaceWith: {
            $arrayElemAt: [
                {
                    $filter: {
                        input: "$tweets",
                        as: "tweet",
                        cond: { $eq: ["$$tweet.tweet_id", '1234762662413660165'] }
                    }
                },
                0
            ]
        }
    }
])

The first one is short and sweet, but it has the added overhead of unwinding the array and creating a separate document in memory for each element.
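Since the question uses pymongo, here is a rough sketch of how the second pipeline might look in Python. The helper names build_tweet_pipeline and find_tweet are made up for illustration, and the collection argument is assumed to be a pymongo Collection. Two assumptions worth flagging: $replaceWith needs MongoDB 4.2 or later, and the extra "tweets.tweet_id" condition in the first $match is my addition so that the later stage always has an element to promote (the stage raises an error when its expression does not resolve to a document).

```python
def build_tweet_pipeline(name, tweet_id):
    """Build an aggregation pipeline that yields only the matching tweet.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return [
        # Keep only documents with this name that contain the tweet at all.
        # The "tweets.tweet_id" condition is an addition to the original
        # pipeline, so that $replaceWith never receives a missing value.
        {"$match": {"name": name, "tweets.tweet_id": tweet_id}},
        {
            # Promote the first array element that passes the filter
            # to be the whole output document.
            "$replaceWith": {
                "$arrayElemAt": [
                    {
                        "$filter": {
                            "input": "$tweets",
                            "as": "tweet",
                            "cond": {"$eq": ["$$tweet.tweet_id", tweet_id]},
                        }
                    },
                    0,
                ]
            }
        },
    ]


def find_tweet(collection, name, tweet_id):
    """Run the pipeline on a pymongo Collection and return the results as a list."""
    return list(collection.aggregate(build_tweet_pipeline(name, tweet_id)))
```

With the collection from the question, a call could look like find_tweet(s, "temp16", "1234762662413660165"). Each item in the returned list is the tweet sub-document itself rather than the enclosing document.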

Upvotes: 0

Belly Buster

Reputation: 8814

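You need to use $elemMatch. A condition of the form 'tweets': {"tweet_id": "..."} only matches an array element that is exactly equal to that sub-document, with no other fields, so nothing is returned for your data.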

import pymongo
db = pymongo.MongoClient()['mydatabase']
db.mycollection.insert_one({"name": "temp16", "tweets": [{"tweet_id": "1234762637361086465", "tweet_text": "Had an ...", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E"]}, {"tweet_text": "There is ...", "tweet_id": "1234762662413660165", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E", "F"]}]})

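# $elemMatch selects documents whose "tweets" array has at least one matching element.
# Each result is the whole document, including all of its tweets.
docs = db.mycollection.find({"name": "temp16", 'tweets': {'$elemMatch': {"tweet_id": "1234762662413660165"}}})

for doc in docs:
    print(doc)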
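Note that the query above returns the whole matching document, with every tweet in it. If the goal is to get only the one tweet back, one option is to use $elemMatch in the projection as well. The following is a minimal sketch, not the only way to do it: build_tweet_query and find_one_tweet are hypothetical helper names, and collection is assumed to be a pymongo Collection such as db.mycollection.

```python
def build_tweet_query(name, tweet_id):
    """Return a (filter, projection) pair that selects a single tweet.

    Hypothetical helper: the name and signature are illustrative only.
    """
    match_tweet = {"$elemMatch": {"tweet_id": tweet_id}}
    # Filter: the document must have this name and contain the tweet.
    query_filter = {"name": name, "tweets": match_tweet}
    # Projection: $elemMatch keeps only the first matching array element.
    projection = {"_id": 0, "tweets": match_tweet}
    return query_filter, projection


def find_one_tweet(collection, name, tweet_id):
    """Return the matching tweet sub-document, or None if nothing matches."""
    query_filter, projection = build_tweet_query(name, tweet_id)
    doc = collection.find_one(query_filter, projection)
    return doc["tweets"][0] if doc else None
```

A call such as find_one_tweet(db.mycollection, "temp16", "1234762662413660165") would then give back just the tweet's fields instead of the enclosing document.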

Upvotes: 1
