Amirhossein Dolatkhah

Reputation: 61

Search for keys of a document in MongoDB and Golang

I'm using Go and its official MongoDB driver, and I want to save documents with the following structure:

type BlacklistRecord struct {
    ID         string         `bson:"_id" json:"id"`
    Type       string         `bson:"type" json:"type"`
    Value      string         `bson:"value" json:"value"`
    Source     map[string]int `bson:"source" json:"source"`
    LastUpdate string         `bson:"lastUpdate" json:"lastUpdate"`
}

This is a sample of what is saved in the database:

{
    _id: '1b836f704c884d28',
    type: 'url',
    value: 'smtp.clarinda.bluehornet.com',
    source: {
        'https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt': 1
    },
    lastUpdate: '2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025'
}

What I want to do is search for documents in which at least one source contains a given substring (case insensitive). The source field is itself a map: each key is a source URL and each value is the number of repeats in that source URL. I have tried a lot but haven't gotten far. I know I can use:

key := bson.M{
    "$regex": primitive.Regex{
        Pattern: ".*" + value + ".*", Options: "i",
    },
}

This only works for the value of a key. What about searching on the key itself? For example, if someone gives me "hosTfiLes", I should return the records whose source field contains a key matching this expression (case insensitive). Thank you for your help.

Upvotes: 2

Views: 782

Answers (1)

vague

Reputation: 432

I'm not sure whether this works directly with find and $regex. It's better to try it in the mongo shell first and then implement it in Go. Sample data:

/* 1 */
{
    "_id" : "1b836f704c884d28",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.com",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt" : 1.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}

/* 2 */
{
    "_id" : "1b836f704c884d29",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.org",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv" : 1.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}

/* 3 */
{
    "_id" : "1b836f704c884d30",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.org",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv" : 1.0,
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.html" : 2.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}

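For instance, if we are searching for sources that end with .csv, record 2 matches on its only source and record 3 matches on one of its two sources. The following aggregation pipeline gives the expected result: $objectToArray turns the source map into an array of {k, v} pairs, and $match keeps a document when at least one k matches the regex.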

db.getCollection('blacklist').aggregate([ 
    { 
        $addFields: { 
            doc: { $objectToArray: "$source" } 
        } 
    }, 
    { 
        $match: {
            "doc.k": {$regex: '\\.csv$'},
        } 
    },
    {
        $project: {"doc":0},
    }
])
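
To make the two stages concrete, here is a minimal in-memory Go sketch of what they compute. It is not driver code: `kv`, `objectToArray`, and `sourceMatches` are hypothetical helpers that only imitate `$objectToArray` and the `$match` on `doc.k`, and the pattern escapes the dot so that it matches a literal `.`.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// kv mirrors one element produced by $objectToArray: {k: <key>, v: <value>}.
type kv struct {
	K string
	V int
}

// objectToArray turns a map into a slice of key/value pairs, sorted by key
// so the result is deterministic (Go map iteration order is not).
func objectToArray(source map[string]int) []kv {
	pairs := make([]kv, 0, len(source))
	for k, v := range source {
		pairs = append(pairs, kv{K: k, V: v})
	}
	sort.Slice(pairs, func(i, j int) bool { return pairs[i].K < pairs[j].K })
	return pairs
}

// sourceMatches reports whether at least one key of source matches pattern,
// which is how a $match on "doc.k" treats an array of pairs.
func sourceMatches(source map[string]int, pattern string) bool {
	re := regexp.MustCompile(pattern)
	for _, p := range objectToArray(source) {
		if re.MatchString(p.K) {
			return true
		}
	}
	return false
}

func main() {
	record1 := map[string]int{
		"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt": 1,
	}
	record3 := map[string]int{
		"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv":  1,
		"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.html": 2,
	}
	fmt.Println(sourceMatches(record1, `\.csv$`)) // false
	fmt.Println(sourceMatches(record3, `\.csv$`)) // true
}
```

Record 3 is kept even though only one of its two keys ends with .csv, which is the "at least one source" behavior the question asks for.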

Now to implement the same in Go, the code snippet:

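pipeline := mongo.Pipeline{
    // Turn the source map into an array of {k, v} pairs stored in doc.
    {{
        Key: "$addFields",
        Value: bson.M{
            "doc": bson.M{"$objectToArray": "$source"},
        },
    }},
    // Keep documents where at least one key ends with a literal ".csv".
    {{
        Key: "$match",
        Value: bson.M{
            "doc.k": bson.M{
                "$regex": `\.csv$`,
            },
        },
    }},
    // Drop the temporary doc field from the output.
    {{
        Key:   "$project",
        Value: bson.M{"doc": 0},
    }},
}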

cursor, err := collection.Aggregate(ctx, pipeline)
if err != nil {
    log.Fatal(err)
}

var result []BlacklistRecord
if err = cursor.All(ctx, &result); err != nil {
    log.Fatal(err)
}
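
For the case in the question (a case-insensitive substring such as "hosTfiLes"), the `$match` value could plausibly become `bson.M{"$regex": pattern, "$options": "i"}`. The sketch below shows only how such a pattern might be built from user input. It assumes the input should be treated as literal text, and that the backslash escaping done by Go's `regexp.QuoteMeta` is also accepted by MongoDB's regex engine. `substringPattern` and `keyContains` are hypothetical names, and `keyContains` is a local stand-in for the server-side match, not driver code.

```go
package main

import (
	"fmt"
	"regexp"
)

// substringPattern escapes regex metacharacters so that user input is
// matched as a literal substring. No ".*" wrappers are needed because an
// unanchored regex already matches anywhere in the string.
func substringPattern(value string) string {
	return regexp.QuoteMeta(value)
}

// keyContains checks locally whether any key of source contains value,
// ignoring case. "(?i)" plays the role of the "i" option here.
func keyContains(source map[string]int, value string) bool {
	re := regexp.MustCompile("(?i)" + substringPattern(value))
	for k := range source {
		if re.MatchString(k) {
			return true
		}
	}
	return false
}

func main() {
	source := map[string]int{
		"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt": 1,
	}
	fmt.Println(substringPattern("frogeye.fr"))   // frogeye\.fr
	fmt.Println(keyContains(source, "hosTfiLes")) // true
	fmt.Println(keyContains(source, "frogeyeXfr")) // false
}
```

Escaping matters because an unescaped `.` in user input would match any character, so "frogeyeXfr" would otherwise be reported as a hit.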

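Because the final $project stage removes doc, the results decode into the original BlacklistRecord as it is. If you drop that stage to keep the array of keys in the result, add a field for it to the struct; you can exclude that field from JSON.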

type BlacklistRecord struct {
    ID         string         `bson:"_id" json:"id"`
    Type       string         `bson:"type" json:"type"`
    Value      string         `bson:"value" json:"value"`
    Source     map[string]int `bson:"source" json:"source"`
    LastUpdate string         `bson:"lastUpdate" json:"lastUpdate"`
    Doc        []KV           `bson:"doc" json:"-"` // excluded from JSON output
}

type KV struct {
    Key string `bson:"k"`
    // The value (v) field is omitted because only the keys are needed.
}
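
A quick way to check how encoding/json treats such an extra field: a field with no json tag is still marshaled under its Go field name, while a `json:"-"` tag leaves it out. This is a minimal sketch; the struct names `untagged` and `excluded` and the helper `toJSON` are hypothetical and exist only for the comparison.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KV holds one key from the doc array; the value is not needed.
type KV struct {
	Key string `bson:"k"`
}

// untagged has no json tag on Doc, so encoding/json emits it as "Doc".
type untagged struct {
	ID  string `bson:"_id" json:"id"`
	Doc []KV   `bson:"doc"`
}

// excluded uses json:"-" so Doc never appears in the JSON output.
type excluded struct {
	ID  string `bson:"_id" json:"id"`
	Doc []KV   `bson:"doc" json:"-"`
}

// toJSON marshals v and returns the result as a string.
func toJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(toJSON(untagged{ID: "1b836f704c884d28"})) // {"id":"1b836f704c884d28","Doc":null}
	fmt.Println(toJSON(excluded{ID: "1b836f704c884d28"})) // {"id":"1b836f704c884d28"}
}
```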

Code snippet on Go Playground. Update the credentials and host:port to match your server if you run it locally.

References:

  1. Usage of aggregate for this: https://www.mongodb.com/community/forums/t/how-do-i-specify-a-document-keys-value-as-regex-expression-to-find-a-document-in-mongodb/4934/2
  2. Using $project for filtering: https://www.codegrepper.com/code-examples/whatever/mongodb+aggregate+remove+field

Upvotes: 1
