Reputation: 36
In my database I have images like this:
{
"id": "image-1",
"type": "image",
"href": "..."
}
And books which include images:
{
"id": "book-1",
"type": "book",
"images": [
"image-1",
"image-33"
]
}
This is a many-to-many relationship because the same image can appear in more than one book, modeled using the List of Keys pattern. I can use a view to traverse the relation in the other direction and find the books in which a given image appears.
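For reference, the reverse lookup mentioned above can be sketched as a map function that emits one row per image reference. This is only a sketch, not necessarily the view actually in use. It assumes documents are identified by CouchDB's `_id`. Inside a design document `emit` is a global, but here it is passed in as a parameter so the function can be exercised outside the database. `booksByImage`, `runMap` and the sample documents are hypothetical.

```javascript
// Map function: one row per image reference, keyed by image id.
// In a real design document `emit` is a global provided by CouchDB;
// it is a parameter here so the sketch can run standalone.
function booksByImage(doc, emit) {
  if (doc.type === "book" && Array.isArray(doc.images)) {
    for (var i = 0; i < doc.images.length; i++) {
      emit(doc.images[i], null);
    }
  }
}

// Hypothetical local harness that mimics how a view collects rows.
function runMap(mapFn, docs) {
  var rows = [];
  docs.forEach(function (doc) {
    mapFn(doc, function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  return rows;
}

// Made-up sample data for illustration.
var sampleDocs = [
  { _id: "image-1", type: "image", href: "..." },
  { _id: "book-1", type: "book", images: ["image-1", "image-33"] },
  { _id: "book-2", type: "book", images: ["image-1"] }
];

// Local equivalent of querying the view with ?key="image-1".
var booksForImage1 = runMap(booksByImage, sampleDocs)
  .filter(function (row) { return row.key === "image-1"; })
  .map(function (row) { return row.id; });
```

Each row's `id` is the book that emitted it, so filtering on the image key yields the books containing that image.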
My question is: How can I find the images that do not appear in any book?
Unreferenced images take up space and may carry other costs, so they are good candidates for deletion.
My first thought was to use a view that counts the number of references to each image:
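"map": function (doc) {
  if (doc.type == "book") {
    // One row per image referenced by the book.
    for (var i = 0; i < doc.images.length; i++) {
      emit(doc.images[i], null);
    }
  }
  else if (doc.type == "image") {
    // Self-reference, so that every image appears in the view.
    emit(doc._id, null);
  }
},
"reduce": "_count"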
If I say that an image references itself then I can be sure that all images exist in the view and I just need to select those that are only referenced by themselves, i.e. where the count is 1. But that's where I get stuck because the view mechanism only seems to let me filter against the key, not the value (the count).
I also considered creating an index against the book.images member and then searching for the images that don't appear in that index, but I can't find any examples along those lines.
I am aware that both my potential solutions require a linear search over all images, but I'm OK with that because this is a housekeeping operation that runs occasionally and is not time-critical. To that extent my first option works well enough; it just means that I have to pull the whole view back to the client and search there, when I would prefer to apply the filter on the server.
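The client-side step described above can be kept small. A minimal sketch, assuming the counting view is queried with `group=true` so that each row carries an image id as `key` and its reference count as `value`. `findUnreferencedImages` is a hypothetical helper name and the sample response is made up for illustration.

```javascript
// Returns the ids whose only reference is the image's own self-emit.
function findUnreferencedImages(viewResponse) {
  return viewResponse.rows
    .filter(function (row) { return row.value === 1; })
    .map(function (row) { return row.key; });
}

// Illustrative shape of a reduced view response queried with group=true.
var sampleResponse = {
  rows: [
    { key: "image-1", value: 3 },
    { key: "image-2", value: 1 },
    { key: "image-33", value: 2 }
  ]
};

var orphans = findUnreferencedImages(sampleResponse);
```

One caveat with this approach: a count of 1 would also result from a book that references an image id for which no image document exists, so it may be worth confirming that the document exists before deleting anything.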
I am also aware that if I were to change the model and reverse the relation so that an image contained a list of books:
{
"id": "image-1",
"type": "image",
"href": "...",
"books": [
"book-1",
"book-12"
]
}
Then I could still find the images in a book using a view, but I could also index the image.books member and quickly find those where the length is 0. However, this puts a burden on the application, because the application edits books: whenever a user modifies the images in a book, the application must also modify the affected image documents. That introduces consistency problems, because two users working with different books can end up modifying the same image.
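To make the reversed model concrete, here is a sketch of a view that keys each image by the number of books listing it, so that a query with `key=0` would return the unused ones. It assumes the `books` array shown above and CouchDB's `_id`. `emit` is passed in as a parameter only so the function can run outside the database, and `imagesByBookCount`, `collectRows` and the sample documents are hypothetical.

```javascript
// Map function for the reversed model: key = number of books using the image.
function imagesByBookCount(doc, emit) {
  if (doc.type === "image") {
    // Treat a missing `books` member as an empty list.
    var books = Array.isArray(doc.books) ? doc.books : [];
    emit(books.length, null);
  }
}

// Hypothetical local harness that mimics how a view collects rows.
function collectRows(mapFn, docs) {
  var rows = [];
  docs.forEach(function (doc) {
    mapFn(doc, function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  return rows;
}

// Made-up sample data for illustration.
var reversedDocs = [
  { _id: "image-1", type: "image", href: "...", books: ["book-1", "book-12"] },
  { _id: "image-2", type: "image", href: "...", books: [] }
];

// Local equivalent of querying the view with ?key=0.
var unusedImageIds = collectRows(imagesByBookCount, reversedDocs)
  .filter(function (row) { return row.key === 0; })
  .map(function (row) { return row.id; });
```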
Upvotes: 0
Views: 77
Reputation: 3501
An alternative way to do this would be to use a faceted search and return only the counts. It will still return every image, so the images with a count of 1 still have to be picked out on the client. The payload would look like this:
{
"total_rows": 4,
"bookmark": "g2o",
"rows": [ ],
"counts": {
"image": {
"image-1": 3.0,
"image-2": 1.0,
"image-33": 1.0
}
}
}
Design doc would look like this:
{
"_id": "_design/bookSearch",
"_rev": "xxx",
"views": {},
"language": "javascript",
"indexes": {
"search": {
"index": "function (doc) {\n if (doc.type == \"image\") {\n index(\"image\", doc.id, {\"facet\": true});\n }\n else if (doc.type == \"book\") {\n for (var i=0; i<doc.images.length; i++) {\n index(\"image\", doc.images[i], {\"facet\": true});\n }\n }\n}\n",
"analyzer": "standard"
}
}
}
Function looks like this:
function (doc) {
if (doc.type == "image") {
index("image", doc.id, {"facet": true});
}
else if (doc.type == "book") {
for (var i=0; i<doc.images.length; i++) {
index("image", doc.images[i], {"facet": true});
}
}
}
Query would look like this:
https://xxx.cloudant.com/db/_design/bookSearch/_search/search?q=*:*&limit=0&counts=["image"]
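As a rough sketch of how a client might use this, the snippet below builds the query URL with its parameter values percent-encoded and then picks the ids with a count of 1 out of a payload shaped like the one above. `buildFacetCountUrl` and `unreferencedFromCounts` are hypothetical helper names, and the base URL is the placeholder from the query above.

```javascript
// Build the search URL with the query-string values percent-encoded.
function buildFacetCountUrl(baseUrl, field) {
  var params = [
    "q=" + encodeURIComponent("*:*"),
    "limit=0",
    "counts=" + encodeURIComponent(JSON.stringify([field]))
  ];
  return baseUrl + "?" + params.join("&");
}

// Pick the facet values that were indexed exactly once.
function unreferencedFromCounts(searchResponse, field) {
  var counts = (searchResponse.counts && searchResponse.counts[field]) || {};
  return Object.keys(counts).filter(function (id) {
    return counts[id] === 1;
  });
}

var facetUrl = buildFacetCountUrl(
  "https://xxx.cloudant.com/db/_design/bookSearch/_search/search",
  "image"
);

// Payload copied from the example response above.
var facetPayload = {
  total_rows: 4,
  bookmark: "g2o",
  rows: [],
  counts: { image: { "image-1": 3.0, "image-2": 1.0, "image-33": 1.0 } }
};

var facetOrphans = unreferencedFromCounts(facetPayload, "image");
```

As with the view-based approach, the filtering on the count still happens on the client; only the shape of the response differs.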
Not sure if there is any real advantage to doing it this way though...
Upvotes: 0