Reputation: 6018
I have index named test
which can be associated to n
number of documents types named sub_test_1
to sub_text_n
. But all will have same mapping.
Is there any way to make an index such all document types have same mapping for their documents? I.e. test\sub_text1\_mapping
should be same as test\sub_text2\_mapping
.
Otherwise if I have like 1000
document types, I will we having 1000 mappings of the same type referring to each document types.
UPDATE:
PUT /test_index/
{
"settings": {
"index.store.type": "default",
"index": {
"number_of_shards": 5,
"number_of_replicas": 1,
"refresh_interval": "60s"
},
"analysis": {
"filter": {
"porter_stemmer_en_EN": {
"type": "stemmer",
"name": "porter"
},
"default_stop_name_en_EN": {
"type": "stop",
"name": "_english_"
},
"snowball_stop_words_en_EN": {
"type": "stop",
"stopwords_path": "snowball.stop"
},
"smart_stop_words_en_EN": {
"type": "stop",
"stopwords_path": "smart.stop"
},
"shingle_filter_en_EN": {
"type": "shingle",
"min_shingle_size": "2",
"max_shingle_size": "2",
"output_unigrams": true
}
}
}
}
}
Intended mapping:
{
"sub_text" : {
"properties" : {
"_id" : {
"include_in_all" : false,
"type" : "string",
"store" : true,
"index" : "not_analyzed"
},
"alternate_id" : {
"include_in_all" : false,
"type" : "string",
"store" : true,
"index" : "not_analyzed"
},
"text" : {
"type" : "multi_field",
"fields" : {
"text" : {
"type" : "string",
"store" : true,
"index" : "analyzed",
},
"pdf": {
"type" : "attachment",
"fields" : {
"pdf" : {
"type" : "string",
"store" : true,
"index" : "analyzed",
}
}
}
}
}
}
}
}
I want this mapping to be an individual mapping for all sub_text
s I create so that I can change it for one sub_text
without affecting others e.g. I may want to add two custom analyzers to sub_text1
and three analyzers to sub_text3
, rest others will stay same.
UPDATE:
PUT /my-index/document_set/_mapping
{
"properties": {
"type": {
"type": "string",
"index": "not_analyzed"
},
"doc_id": {
"type": "string",
"index": "not_analyzed"
},
"plain_text": {
"type": "string",
"store": true,
"index": "analyzed"
},
"pdf_text": {
"type": "attachment",
"fields": {
"pdf_text": {
"type": "string",
"store": true,
"index": "analyzed"
}
}
}
}
}
POST /my-index/document_set/1
{
"type": "d1",
"doc_id": "1",
"plain_text": "simple text for doc1."
}
POST /my-index/document_set/2
{
"type": "d1",
"doc_id": "2",
"pdf_text": "cGRmIHRleHQgaXMgaGVyZS4="
}
POST /my-index/document_set/3
{
"type": "d2",
"doc_id": "3",
"plain_text": "simple text for doc3 in d2."
}
POST /my-index/document_set/4
{
"type": "d2",
"doc_id": "4",
"pdf_text": "cGRmIHRleHQgaXMgaGVyZSBpbiBkMi4="
}
GET /my-index/document_set/_search
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"type" : "d1"
}
}
}
}
}
This gives me the documents related to type "d1". How to add analyzers only to document of type "d1"?
Upvotes: 0
Views: 2518
Reputation: 22332
Do not do this.
Otherwise if I have like 1000 document types, I will we having 1000 mappings of the same type referring to each document types.
You're exactly right. For every additional _type
with an identical mapping you are needlessly adding to the size of your index's mapping. They will not be merged, nor will any compression save you.
A much better solution is to simply create a shared _type
and to create a field that represents the intended type. This completely avoids having wasted mappings and all of the negatives associated with it, including an unnecessary increase for your cluster state's size.
From there, you can imitate what Elasticsearch is doing for you and filter on your custom type without ballooning your mappings.
$ curl -XPUT localhost:9200/my-index -d '{
"mappings" : {
"my-type" : {
"properties" : {
"type" : {
"type" : "string",
"index" : "not_analyzed"
},
# ... whatever other mappings exist ...
}
}
}
}'
Then, for any search against sub_text1
(etc.), then you can do a term
(for one) or terms
(for more than one) filter to imitate the _type
filter that would happen for you.
$ curl -XGET localhost:9200/my-index/my-type/_search -d '{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"type" : "sub_text1"
}
}
}
}
}'
This is doing the same thing as the _type
filter and you can create _alias
es that contain the filter if you want to have the higher level search capability without exposing client-level logic to the filtering.
Upvotes: 0
Reputation: 685
At the moment a possible solution is to use index templates or dynamic mapping. However they do not allow wildcard type matching so you would have to use the _default_
root type to apply the mappings to all types in the index and thus it would be up to you to ensure that all your types can be applied to the same dynamic mapping. This template example may work for you:
curl -XPUT localhost:9200/_template/template_1 -d '
{
"template" : "test",
"mappings" : {
"_default_" : {
"dynamic": true,
"properties": {
"field1": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
'
Upvotes: 1