Reputation: 1944
I have an index 'tweets' and 2 types, 'active' and 'inactive'. When I create a document, I use the code below (for node.js) to create it in tweets/active.
When the tweet is deleted, I don't want to delete the document completely. Instead I want to "move" it to the type 'inactive' so I can preserve the document, along with its _id etc., for internal use.
How do I change the document type? Any ideas?
client.create({
  index: 'tweets',
  type: 'active',
  body: jsonData
}, function (error, response) {
  if (error)
    return callback("ERROR");
  if (response)
    return callback(response._id);
});
Upvotes: 0
Views: 42
Reputation: 22332
You cannot really move a document. In an odd way, you can, but it's not really the expected approach and it definitely has quirks:
curl -XPOST localhost:9200/tweets/active/tweet-to-move/_update -d '{
"doc" : {
"_type" : "inactive"
}
}'
The above update takes advantage of the fact that your type is really just a top-level metadata field of the document (_type). Doing so is all sorts of wrong, not least because it modifies the _source. All documents in the same index are stored together on the same shard(s), which is why that sort of works (note: it ends up in both types in 1.2.2).
While you definitely do not want to use the above example, you should do something similar.
Instead of creating two separate types -- since they live on the same index and are otherwise identical anyway -- use only a single type with an active (or, conversely, an inactive) field, or create two separate indices (which may yield better performance over time as the number of inactive tweets grows).
curl -XPUT localhost:9200/tweets -d '{
"mappings" : {
"tweet" : {
"properties" : {
"user" : {
"type" : "string",
"index" : "not_analyzed"
},
"message" : {
"type" : "string"
},
"inactive" : {
"type" : "boolean"
}
}
}
}
}'
Now, getting back to your split types, you can use aliases to accomplish the same thing, but with the appearance that they have been moved/removed. Aliases can be added dynamically or when the index is created:
curl -XPUT localhost:9200/tweets -d '{
"mappings" : {
"tweet" : {
...
}
},
"aliases" : {
"active" : {
"filter" : {
"bool" : {
"must_not" : {
"term" : { "inactive" : true }
}
}
}
},
"inactive" : {
"filter" : {
"term" : { "inactive" : true }
}
}
}
}'
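For the Node.js side, the same index definition can be sketched as below. This is only an illustration: it assumes `client` is the legacy elasticsearch.js client from the question (which exposes `client.indices.create`), and the helper names `buildStatusAliases` and `createTweetsIndex` are made up for this example.

```javascript
// Builds the "aliases" section for a boolean flag field.
// Both aliases share one term filter; "active" simply negates it.
function buildStatusAliases(flagField) {
  const flagIsTrue = { term: { [flagField]: true } };
  return {
    active: { filter: { bool: { must_not: flagIsTrue } } },
    inactive: { filter: flagIsTrue }
  };
}

// Hypothetical wrapper: `client` is assumed to be a legacy elasticsearch.js
// client, and `tweetMapping` the mapping body for the single "tweet" type.
function createTweetsIndex(client, tweetMapping, callback) {
  client.indices.create({
    index: 'tweets',
    body: {
      mappings: { tweet: tweetMapping },
      aliases: buildStatusAliases('inactive')
    }
  }, callback);
}
```

Keeping the alias body in its own function makes it easy to check the JSON that will be sent without a running cluster.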
With the aliases set up, you can now "move" the document by updating the inactive field (no movement is really happening; the document stays on the same index and even the same shard).
Once the mapping is created (a necessary step for filtered aliases, which were new in 1.4), you can start inserting active-by-default documents as you see fit:
curl -XPUT localhost:9200/tweets/tweet/12345 -d '{
"user" : "kimchy",
"message" : "Trying out Elasticsearch Aliases!"
}'
When you decide that a tweet is inactive, simply update it:
curl -XPOST localhost:9200/tweets/tweet/12345/_update -d '{
"doc" : {
"inactive" : true
}
}'
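From Node.js, the same partial update might look like the sketch below. It assumes the legacy elasticsearch.js client used in the question, whose `update` method takes a `doc` in the request body; `deactivateTweet` is a hypothetical helper name, and the callback follows the usual Node `(error, result)` convention rather than the question's single-argument style.

```javascript
// Marks a tweet as inactive instead of deleting it.
// `client` is assumed to be a legacy elasticsearch.js client instance.
function deactivateTweet(client, tweetId, callback) {
  client.update({
    index: 'tweets',
    type: 'tweet',
    id: tweetId,
    body: { doc: { inactive: true } } // partial update: only this field changes
  }, function (error, response) {
    if (error) return callback(error);
    return callback(null, response._id); // the _id is preserved
  });
}
```

Because the client is passed in, the function can be exercised with a stub object in tests.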
To search for active documents, you can just use the alias:
# Assumes there is only one type defined (otherwise it searches all of them):
curl -XGET localhost:9200/active/_search -d '{
"query" : { "match_all" : { } }
}'
# Searches only active tweets
curl -XGET localhost:9200/active/tweet/_search -d '{
"query" : { "match_all" : { } }
}'
And inactive documents:
curl -XGET localhost:9200/inactive/_search -d '{
"query" : { "match_all" : { } }
}'
curl -XGET localhost:9200/inactive/tweet/_search -d '{
"query" : { "match_all" : { } }
}'
Note: If you want to search for both, skip the aliases and query the index directly:
curl -XGET localhost:9200/tweets/_search -d '{
"query" : { "match_all" : { } }
}'
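The three search targets above (the active alias, the inactive alias, and the raw index) can be wrapped in one Node.js helper. This is a sketch under the assumption that `client` is the legacy elasticsearch.js client; `searchTweets` and the scope names are invented for the example.

```javascript
// Maps a logical scope to the alias or index that should be queried.
const SEARCH_TARGETS = { active: 'active', inactive: 'inactive', all: 'tweets' };

// Runs `query` (an Elasticsearch query DSL object) against the chosen scope
// and hands the raw hits to the callback.
function searchTweets(client, scope, query, callback) {
  if (!Object.prototype.hasOwnProperty.call(SEARCH_TARGETS, scope)) {
    return callback(new Error('Unknown scope: ' + scope));
  }
  client.search({
    index: SEARCH_TARGETS[scope], // an alias is addressed like an index
    type: 'tweet',
    body: { query: query }
  }, function (error, response) {
    if (error) return callback(error);
    return callback(null, response.hits.hits);
  });
}
```

A call such as `searchTweets(client, 'active', { match_all: {} }, cb)` then corresponds to the first curl request against the active alias.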
With all of that said, there are two minor downsides to this approach:
1. It requires a filter to find active/inactive documents. The filter is cached on first use, so it is very quick, but it is an extra step that the solution to #2 can remove.
It's perhaps useful to note that the same filter is used for both aliases above, so it only needs to be cached once (then it's inverted on demand).
2. All documents live on the same index and therefore the same shards. As time goes on, you will most likely have a lot of useless, inactive documents cluttering the shards. If this is actually a problem, then you can either start deleting old, inactive documents or use two indices, where a "move" means indexing the document into the second index and then deleting it from the first; using two indices also means that you can drop the filters. Interestingly, you could combine the two by keeping recently inactive documents in the same index and moving them to another index after a long time, then updating the inactive alias to include both the filtered index and the old one.
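If you do go with two indices, a "move" in Node.js could be sketched as get, index, then delete. The index name `tweets_inactive` and the helper name `moveToInactiveIndex` are assumptions for illustration, and `client` is again assumed to be the legacy elasticsearch.js client. Note that the three steps are not atomic: a failure after the index step leaves the document in both indices.

```javascript
// Copies a tweet into a separate index under the same _id, then removes the
// original. The delete only runs if the copy succeeded.
function moveToInactiveIndex(client, tweetId, callback) {
  const source = { index: 'tweets', type: 'tweet', id: tweetId };
  client.get(source, function (getError, found) {
    if (getError) return callback(getError);
    client.index({
      index: 'tweets_inactive', // hypothetical name for the second index
      type: 'tweet',
      id: tweetId,              // keep the original _id
      body: found._source
    }, function (indexError) {
      if (indexError) return callback(indexError);
      client.delete(source, function (deleteError) {
        if (deleteError) return callback(deleteError);
        return callback(null, tweetId);
      });
    });
  });
}
```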
Upvotes: 0
Reputation: 228
You cannot change the type of a document (not that I know of, at least).
Why don't you abstract the ID? Keep the technical _id for technical use and give your document a nice functional ID to use in your app. You can then delete your active document and create your inactive one, keeping the functional ID.
Or even better, add an active/inactive flag to your document so you just flag it as deleted, and create an alias "active" that filters out the inactive documents. That way you can request your active documents in a super nice way.
Doc for the aliases -> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html
Upvotes: 2