Jainendra

Reputation: 25153

Querying many to many mapping in ElasticSearch

I have two types in my ElasticSearch index.

Product- Stores all products information

{
   "ProductId":"P1",
   "Name":"Refrigerator"
}

Owner- Stores all products with owner X (CSV)

{
   "OwnerId":"o-id1",
   "Products":"P1,P2,P3,...,Pn"
}

Note:

  1. One product may have multiple owners.

  2. One Owner can handle multiple products

To retrieve information about all products that belong to a particular owner, I first query the Owner type to get all the product ids, then I query the Product type and pass in those ids using a term query. This is very slow because the number of products can be very high (100,000). I would also like to avoid running two queries.

Is there a better way I can model these two types so that the queries could be faster?

Upvotes: 2

Views: 557

Answers (1)

Alkis Kalogeris

Reputation: 17745

According to this discussion, https://discuss.elastic.co/t/how-to-handle-many-to-many-relationships/47864, and many other resources out there, your use case is better suited to a traditional SQL solution. If you really must use ES for this, then duplication is the solution. Based on your use case, I believe the way to go is an index that uses the id of the owner as the document id and has a field (or a nested field) that contains all of that owner's products.
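To make the duplication idea concrete, here is a minimal sketch in plain Python that builds one document per owner from the two shapes shown in the question. The function name `build_owner_documents` and the `search_fields` parameter are made up for illustration, and nothing here talks to a cluster; it only shows what the denormalized documents could look like.

```python
from typing import Dict, Iterable, List, Sequence


def build_owner_documents(
    products: Iterable[dict],
    owners: Iterable[dict],
    search_fields: Sequence[str] = ("ProductId", "Name"),  # assumption: fields you search on
) -> Dict[str, dict]:
    """Return {owner_id: document}, each document embedding copies of its products."""
    products_by_id = {p["ProductId"]: p for p in products}
    docs: Dict[str, dict] = {}
    for owner in owners:
        # The question stores product ids as a CSV string, so split it first.
        product_ids: List[str] = [
            pid.strip() for pid in owner["Products"].split(",") if pid.strip()
        ]
        docs[owner["OwnerId"]] = {
            "Products": [
                # Copy only the chosen fields of each known product.
                {f: products_by_id[pid][f] for f in search_fields if f in products_by_id[pid]}
                for pid in product_ids
                if pid in products_by_id  # skip ids with no matching product
            ]
        }
    return docs
```

With documents shaped like this, "all products of owner X" becomes a single lookup by document id instead of two queries. Whether one document holding 100,000 embedded products is practical is something you would have to measure on your own data.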

The thing is, do you really need to keep all the fields of the product here? You could duplicate only the fields you search on. In general, ES is not suited to be a primary storage solution. In ES you keep only the fields that you search against (in a form where duplication cannot be avoided, and is in fact welcome for its performance benefits), and you keep a primary storage solution (traditional SQL) as the place you go when you want to retrieve all the fields for presentation. Of course, you have to keep it synced with ES.

If you can't have that, meaning that you need to store all your data in ES, then duplication is again the answer, but you can make some optimisations that will reduce the size of your index. For example, do not analyse the fields that you don't search against or that you only search by exact match (use the keyword type), and disable the _all field if you are using a version where it is enabled by default.
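As a rough illustration of those optimisations, the sketch below builds a mapping body as a plain Python dict. The helper name `build_product_mapping` and the `Description` field in the usage are hypothetical; `keyword`, `text` and `"index": false` are standard mapping options in the versions I know of, while the `_all` setting only exists on older versions, so it is left behind a flag.

```python
from typing import Iterable


def build_product_mapping(
    full_text: Iterable[str],
    exact_match: Iterable[str],
    stored_only: Iterable[str],
    disable_all: bool = False,  # only meaningful on versions that still have _all
) -> dict:
    """Build a mapping body that analyses only the fields that need it."""
    properties = {}
    for field in full_text:
        properties[field] = {"type": "text"}  # analysed, for full-text search
    for field in exact_match:
        properties[field] = {"type": "keyword"}  # not analysed, exact match only
    for field in stored_only:
        # Kept in _source for display, but not searchable.
        properties[field] = {"type": "keyword", "index": False}
    mapping = {"properties": properties}
    if disable_all:
        mapping["_all"] = {"enabled": False}
    return mapping
```

For the question's data this could be called as `build_product_mapping(["Name"], ["ProductId"], ["Description"])`; which fields go in which group depends entirely on how you search.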

Another possible solution is the parent-child functionality: https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child.html. Check it out and see if you can make it work for you. This is the way to model a one-to-many relationship, so I believe that, again with some duplication, you can achieve what you want. Read https://discuss.elastic.co/t/can-we-give-parent-child-relation-ship-between-different-indexes/25872/2 to see why this is a half measure (you can't have the parent in one index and the children in another).
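A sketch of how the parent-child variant might look, again as plain Python that only builds the data and the request body. It assumes a parent type named `owner` (a made-up name) and one child product document per owner/product pair, which is where the duplication comes in. The query body uses `has_parent` with `parent_type` as I understand it from the linked guide; newer versions model this with a join field instead, so check the docs for your version.

```python
from typing import Dict, List, Tuple


def child_product_docs(owner: dict, products_by_id: Dict[str, dict]) -> List[Tuple[str, dict]]:
    """Return (parent_id, child_document) pairs: one product copy per owner."""
    pairs = []
    for pid in owner["Products"].split(","):
        pid = pid.strip()
        if pid in products_by_id:
            # Each child is a copy of the product, to be indexed under this owner.
            pairs.append((owner["OwnerId"], dict(products_by_id[pid])))
    return pairs


def products_of_owner_query(owner_id: str, parent_type: str = "owner") -> dict:
    """Build a search body that returns the children of one owner document."""
    return {
        "query": {
            "has_parent": {
                "parent_type": parent_type,  # assumption: name of the parent type
                "query": {"term": {"OwnerId": owner_id}},
            }
        }
    }
```

A product owned by three owners ends up as three child documents here, so updates to a product have to touch every copy.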

Upvotes: 3
