Reputation: 93
I want to fetch documents present in multiple types (type1 AND type2 AND type3...) in Elasticsearch 5.0. I know that searching across multiple types is possible, either by listing the types in the URL (type1,type2) or by filtering on the _type field. But all of these conditions are OR (type1 OR type2). How do I achieve the AND condition?
Here are two documents in my index:
{
"_index":"cust_58e8700034fa4e368590fb1396e2641c",
"_type":"unique-fp-domains",
"_id":"n_d4dbba7309a94503b25eca735078f17c_258b3ad1a11aba282f35908662bdc5432d68fd96bf3ca90013dcdd5764331399",
"_version":2,
"_score":1,
"_source":{
"mg_timestamp":1579866709096,
"violated-directive":"connect-src",
"fp-hash":"258b3ad1a11aba282f35908662bdc5432d68fd96bf3ca90013dcdd5764331399",
"time":1579866709096,
"scan-id":"n_d4dbba7309a94503b25eca735078f17c",
"blocked-uri":"play.sundaysky.com"
}
}
{
"_index":"cust_58e8700034fa4e368590fb1396e2641c",
"_type":"tag-alexa-top1k-using-csp-tld-domain",
"_id":"AW_XY4P4kmprPQ28bTUb",
"_version":1,
"_score":1,
"_source":{
"tagged-domain":"sundaysky.com",
"tag-guidance":"FP",
"additional-tag-metadata-isbase64-encoded":"eyJ0b3RhbC1hbGV4YS1tYXRjaGVzIjoyMzh9",
"project-id":2,
"fp-hash":"258b3ad1a11aba282f35908662bdc5432d68fd96bf3ca90013dcdd5764331399",
"scan-id":"n_d4dbba7309a94503b25eca735078f17c"
}
}
I want to fetch the documents from the same index from the given 2 types with "scan-id":"n_d4dbba7309a94503b25eca735078f17c"
I tried this,
{
"size": 100,
"query": {
"bool": {
"must": [
{
"bool": {
"filter": [
{
"term": {
"_type": {
"value": "tag-alexa-top1k-using-csp-tld-domain"
}
}
},
{
"term": {
"scan-id": {
"value": "n_d4dbba7309a94503b25eca735078f17c"
}
}
}
]
}
},
{
"bool": {
"filter": [
{
"term": {
"_type": {
"value": "unique-fp-domains"
}
}
},
{
"term": {
"scan-id": {
"value": "n_d4dbba7309a94503b25eca735078f17c"
}
}
}
]
}
}
]
}
}
}
But it doesn't work.
Upvotes: 3
Views: 439
Reputation: 6066
Elasticsearch is not good at joining different collections of documents, but in your case you might be able to solve the problem with a parent-child relationship.
A one-to-many relationship can be modeled with parent-child. Let's suppose that the type unique-fp-domains is the "parent" type and that its scan-id field is a unique identifier. Let's also suppose that tag-alexa-top1k-using-csp-tld-domain is the "child" type and that every document of type tag-alexa-top1k-using-csp-tld-domain refers to exactly one document in unique-fp-domains.
Then we should create the Elasticsearch mapping in the following way:
PUT /cust_58
{
"mappings": {
"unique-fp-domains": {},
"tag-alexa-top1k-using-csp-tld-domain": {
"_parent": {
"type": "unique-fp-domains"
}
}
}
}
And insert the documents like this:
# "parent"
PUT /cust_58/unique-fp-domains/n_d4dbba7309a94503b25eca735078f17c
{
"mg_timestamp": 1579866709096,
"violated-directive": "connect-src",
"fp-hash": "258b3ad1a11aba282f35908662bdc5432d68fd96bf3ca90013dcdd5764331399",
"time": 1579866709096,
"scan-id": "n_d4dbba7309a94503b25eca735078f17c",
"blocked-uri": "play.sundaysky.com"
}
# "child"
POST /cust_58/tag-alexa-top1k-using-csp-tld-domain?parent=n_d4dbba7309a94503b25eca735078f17c
{
"tagged-domain": "sundaysky.com",
"tag-guidance": "FP",
"additional-tag-metadata-isbase64-encoded": "eyJ0b3RhbC1hbGV4YS1tYXRjaGVzIjoyMzh9",
"project-id": 2,
"fp-hash": "258b3ad1a11aba282f35908662bdc5432d68fd96bf3ca90013dcdd5764331399",
"scan-id": "n_d4dbba7309a94503b25eca735078f17c"
}
Now we can query for parent documents that have at least one child associated with them. This is effectively a join on the parent ID, which we forced to be the scan-id by setting the _id of the parent document manually.
The query uses has_child and looks like this:
POST /cust_58/unique-fp-domains/_search
{
"query": {
"has_child": {
"type": "tag-alexa-top1k-using-csp-tld-domain",
"query": {
"match_all": {}
},
"inner_hits": {}
}
}
}
Note that we use inner_hits to tell Elasticsearch to return the matching "child" documents as well.
The output would look like:
"hits": [
{
"_index": "cust_58",
"_type": "unique-fp-domains",
"_id": "n_d4dbba7309a94503b25eca735078f17c",
"_score": 1.0,
"_source": {
"mg_timestamp": 1579866709096,
"violated-directive": "connect-src",
...
},
"inner_hits": {
"tag-alexa-top1k-using-csp-tld-domain": {
"hits": {
"total": 1,
"max_score": 1.0,
"hits": [
{
"_type": "tag-alexa-top1k-using-csp-tld-domain",
"_id": "AW_xhfnnIzWDkoWd1czA",
"_score": 1.0,
"_routing": "n_d4dbba7309a94503b25eca735078f17c",
"_parent": "n_d4dbba7309a94503b25eca735078f17c",
"_source": {
"tagged-domain": "sundaysky.com",
...
}
}
]
}
}
}
}
]
A word of caution about parent-child: if you care about query performance, you should not use this query.
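In Elasticsearch 6, support for multiple mapping types per index was removed, and the _parent mapping went with it. The good news is that the join datatype, which replaces it, is already available in the later Elasticsearch 5 releases (5.6, as far as I know).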
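For reference, a join-field mapping equivalent to the _parent mapping above might look like the sketch below. It is the request body for PUT /cust_58 in the Elasticsearch 6.x form. The single mapping type name doc and the field name relation are placeholders I made up, and mapping scan-id as keyword is my assumption, not something taken from the question.

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "scan-id": { "type": "keyword" },
        "relation": {
          "type": "join",
          "relations": {
            "unique-fp-domains": "tag-alexa-top1k-using-csp-tld-domain"
          }
        }
      }
    }
  }
}
```

With a mapping like this, a parent document would carry "relation": "unique-fp-domains", while a child would carry "relation": {"name": "tag-alexa-top1k-using-csp-tld-domain", "parent": "n_d4dbba7309a94503b25eca735078f17c"} and has to be indexed with routing set to the parent ID. The has_child query keeps the same shape.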
In general, Elasticsearch is not very good at managing relations between objects, but there are a few ways to deal with them.
Hope that helps!
Upvotes: 1
Reputation: 103
"query": {
"query_string" : {
"query" : "_type:(unique-fp-domains OR tag-alexa-top1k-using-csp-tld-domain) AND scan-id:n_d4dbba7309a94503b25eca735078f17c"
}
}
Upvotes: 0
Reputation: 31
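I think this query will solve your problem. Both types go into a single terms clause (separate must clauses on _type would require one document to have both types at once, which never matches), and scan-id is matched with a term filter:
"query": {
"bool": {
"filter": [
{
"terms": {
"_type": [
"tag-alexa-top1k-using-csp-tld-domain",
"unique-fp-domains"
]
}
},
{
"term": {
"scan-id": "n_d4dbba7309a94503b25eca735078f17c"
}
}
]
}
}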
Upvotes: 1
Reputation: 1
You could use msearch (the Multi Search API), which combines multiple searches in a single request. You can find more information in the documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html
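As a rough sketch, the body for POST /_msearch could look like the following. It is newline-delimited JSON: each search is a header line followed by a query line, and the body must end with a newline. The index and type names are taken from the question; using the "type" key in the header and a term query on scan-id are my assumptions for a 5.x cluster where scan-id is not an analyzed field.

```json
{"index": "cust_58e8700034fa4e368590fb1396e2641c", "type": "unique-fp-domains"}
{"query": {"term": {"scan-id": "n_d4dbba7309a94503b25eca735078f17c"}}}
{"index": "cust_58e8700034fa4e368590fb1396e2641c", "type": "tag-alexa-top1k-using-csp-tld-domain"}
{"query": {"term": {"scan-id": "n_d4dbba7309a94503b25eca735078f17c"}}}
```

The results come back in a responses array, one entry per search in the same order, so the hits for each type stay separate and you can check on the client side that both searches returned something.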
Upvotes: -1