Reputation: 61
I'm using Go and the official MongoDB driver, and I want to save documents with the following structure:
type BlacklistRecord struct {
	ID         string         `bson:"_id" json:"id"`
	Type       string         `bson:"type" json:"type"`
	Value      string         `bson:"value" json:"value"`
	Source     map[string]int `bson:"source" json:"source"`
	LastUpdate string         `bson:"lastUpdate" json:"lastUpdate"`
}
This is a sample of what is saved in the database:
{
    _id: '1b836f704c884d28',
    type: 'url',
    value: 'smtp.clarinda.bluehornet.com',
    source: {
        'https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt': 1
    },
    lastUpdate: '2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025'
}
What I want to do is search for documents in which at least one source contains a given substring (case insensitive). The source field is itself a map whose keys are source URLs and whose values are the number of occurrences in that source URL. I have tried a lot but couldn't get far. I know I can use:
key := bson.M{
	"$regex": primitive.Regex{
		Pattern: ".*" + value + ".*", Options: "i",
	},
}
This only works for the value of a key. What about searching on the key itself? For example, if someone gives me "hosTfiLes", I should return the records whose source field contains a key matching this expression (case insensitive). Thank you for your help.
Upvotes: 2
Views: 782
Reputation: 432
I'm not sure whether this works directly with find and $regex. It's better to try it in the mongo shell first, then implement it in Go. Sample data:
/* 1 */
{
    "_id" : "1b836f704c884d28",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.com",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt" : 1.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}
/* 2 */
{
    "_id" : "1b836f704c884d29",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.org",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv" : 1.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}
/* 3 */
{
    "_id" : "1b836f704c884d30",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.org",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv" : 1.0,
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.html" : 2.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}
For instance, if we are searching for sources that end with .csv, record 2 has one source that matches and record 3 has one of its two sources matching. The following aggregate pipeline gives the expected result.
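db.getCollection('blacklist').aggregate([
    {
        // Turn the source map into an array of {k, v} pairs.
        $addFields: {
            doc: { $objectToArray: "$source" }
        }
    },
    {
        // Match on the keys; the dot is escaped so it is matched literally.
        $match: {
            "doc.k": { $regex: '\\.csv$' }
        }
    },
    {
        // Drop the helper field from the result.
        $project: { "doc": 0 }
    }
])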
Now to implement the same in Go, the code snippet:
pipeline := mongo.Pipeline{
	// Turn the source map into an array of {k, v} pairs.
	{{
		Key: "$addFields",
		Value: bson.M{
			"doc": bson.M{"$objectToArray": "$source"},
		},
	}},
	// Match on the keys; the dot is escaped so it is matched literally.
	{{
		Key: "$match",
		Value: bson.M{
			"doc.k": bson.M{
				"$regex": `\.csv$`,
			},
		},
	}},
	// Drop the helper field from the result.
	{{
		Key:   "$project",
		Value: bson.M{"doc": 0},
	}},
}
cursor, err := collection.Aggregate(ctx, pipeline)
if err != nil {
	log.Fatal(err)
}
var result []BlacklistRecord
if err = cursor.All(ctx, &result); err != nil {
	log.Fatal(err)
}
However, for this you need to introduce a new field in the struct, which you can exclude from the JSON output.
type BlacklistRecord struct {
	ID         string         `bson:"_id" json:"id"`
	Type       string         `bson:"type" json:"type"`
	Value      string         `bson:"value" json:"value"`
	Source     map[string]int `bson:"source" json:"source"`
	LastUpdate string         `bson:"lastUpdate" json:"lastUpdate"`
	Doc        []KV           `bson:"doc,omitempty" json:"-"` // excluded from JSON output
}
type KV struct {
	Key string `bson:"k"`
	// The value field is omitted here.
}
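One detail worth checking is how a helper field is kept out of the JSON output: encoding/json includes every exported field unless it is tagged json:"-". The following is a minimal sketch using only the standard library; the types record and kvEntry and the helper toJSON are hypothetical, trimmed-down stand-ins for the structs above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kvEntry stands in for one {k, v} pair; only the key is kept.
type kvEntry struct {
	Key string `bson:"k"`
}

// record is a trimmed-down stand-in for BlacklistRecord.
type record struct {
	ID  string    `bson:"_id" json:"id"`
	Doc []kvEntry `bson:"doc" json:"-"` // json:"-" hides the field from JSON
}

// toJSON marshals r and returns the JSON text, or "" on error.
func toJSON(r record) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	r := record{ID: "1b836f704c884d28", Doc: []kvEntry{{Key: "x"}}}
	fmt.Println(toJSON(r)) // {"id":"1b836f704c884d28"}
}
```

Without the json:"-" tag, the same value would be serialized with an extra "Doc" member.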
Code snippet on Go Playground. If you try it locally, update the credentials and host:port to match your server.
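The question asks for a case-insensitive substring match on user input rather than a fixed suffix. A minimal sketch of preparing such a pattern is below. The helpers substringPattern and matchesLocally are hypothetical; regexp.QuoteMeta escapes for Go's RE2 syntax, and it is an assumption here that the escaped result is also read literally by the server's regex engine. The escaped pattern could then be used as the Pattern of a primitive.Regex with Options "i" (as in the question) in the $match stage on "doc.k".

```go
package main

import (
	"fmt"
	"regexp"
)

// substringPattern escapes user input so that characters such as "."
// are matched literally instead of being treated as regex syntax.
func substringPattern(input string) string {
	return regexp.QuoteMeta(input)
}

// matchesLocally mimics the case-insensitive "i" option on the client,
// which is handy for checking a pattern before sending it to the server.
func matchesLocally(input, key string) bool {
	re := regexp.MustCompile("(?i)" + substringPattern(input))
	return re.MatchString(key)
}

func main() {
	key := "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt"

	fmt.Println(substringPattern("hosts.txt"))    // hosts\.txt
	fmt.Println(matchesLocally("hosTfiLes", key)) // true
	fmt.Println(matchesLocally("hostsXtxt", key)) // false
}
```

No leading or trailing ".*" is needed, since an unanchored regex already matches anywhere in the key.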
References:
aggregate for this: https://www.mongodb.com/community/forums/t/how-do-i-specify-a-document-keys-value-as-regex-expression-to-find-a-document-in-mongodb/4934/2
$project for filtering: https://www.codegrepper.com/code-examples/whatever/mongodb+aggregate+remove+field
Upvotes: 1