immaculate.pine

Reputation: 151

How to organize a search of nested objects with ElasticSearch?

I'm trying to set up search in my project with ElasticSearch, but I can't figure out one thing.

To simplify the context, assume that there are two models: Users and their Messages. I want to provide two types of search:

Messages by text (it is easy)

How it is supposed to work: the user enters "notes about the meeting" and gets a list of messages containing this text.

Messages are stored in ElasticSearch like this:

{
  "id" : "1",
  "user_id" : "101",
  "text": "hello"
}

So finding messages by text is not a problem.
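For reference, the message search could be sketched roughly like this. It only builds the request body as a plain dict; the function name and the page/page_size parameters are made up for illustration, and the field name text comes from the document above.

```python
def message_search_body(text, page=0, page_size=10):
    """Build a paginated full-text search body for message documents.

    Assumption: messages have an analyzed `text` field, as in the
    sample document above. `page` is zero-based.
    """
    return {
        "from": page * page_size,  # offset of the first hit on this page
        "size": page_size,         # number of hits per page
        "query": {"match": {"text": text}},
    }


body = message_search_body("notes about the meeting", page=2)
print(body)
```

The resulting dict is what would be sent as the body of a search request against the messages index.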

Users by text (problem)

How it is supposed to work: the user enters "notes about the meeting" and gets a list of users who wrote messages containing this text.

I have a few ideas for how to organize it, but I don't really like any of them.

Idea 1

Find all the matching messages, extract their user_ids, and then run an SQL query like this:

SELECT * FROM users WHERE id IN ('101', '102', '103')

This is the most obvious way, but it raises a question: how do I paginate properly? The messages are paginated, but the users are not.
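A small sketch of Idea 1 shows where the pagination breaks. The helper names are hypothetical, and the %s placeholder style is an assumption that depends on the SQL driver in use.

```python
def distinct_user_ids(message_hits):
    """Collect user_ids from one page of message hits, keeping first-seen order."""
    seen = set()
    ordered = []
    for hit in message_hits:
        uid = hit["user_id"]
        if uid not in seen:
            seen.add(uid)
            ordered.append(uid)
    return ordered


def users_query(user_ids):
    """Return a parameterized SQL string and its parameters.

    Assumption: the driver uses %s placeholders; others use ? or :name.
    """
    if not user_ids:
        raise ValueError("no user ids to look up")
    placeholders = ", ".join(["%s"] * len(user_ids))
    return f"SELECT * FROM users WHERE id IN ({placeholders})", list(user_ids)


# One page of three message hits, two of them written by the same user.
page_of_messages = [
    {"id": "1", "user_id": "101", "text": "notes about the meeting"},
    {"id": "2", "user_id": "101", "text": "more notes about the meeting"},
    {"id": "3", "user_id": "102", "text": "meeting notes"},
]
ids = distinct_user_ids(page_of_messages)
sql, params = users_query(ids)
print(sql, params)
```

Three messages collapse into two users here, so a page of N messages does not give a page of N users.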

Idea 2

Store users in ElasticSearch with their messages as nested objects:

{ 
  "id" : "101",
  "name" : "Bob",
  "messages" : [
    { "id" : "1", "text" : "hello" },
    { "id" : "2", "text" : "howdy?" },
    { "id" : "3", "text" : "bye" }
  ]
}
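With a document like this, the search might be expressed as a nested query. This is only a sketch: it assumes the messages field is mapped with type nested, and the function name and paging parameters are invented for illustration.

```python
def users_by_nested_message_body(text, page=0, page_size=10):
    """Build a search body that matches users through their nested messages.

    Assumption: `messages` is mapped as a `nested` field on the user
    document, so each message is matched as its own hidden sub-document.
    """
    return {
        "from": page * page_size,
        "size": page_size,
        "query": {
            "nested": {
                "path": "messages",
                "query": {"match": {"messages.text": text}},
            }
        },
    }


nested_body = users_by_nested_message_body("notes about the meeting")
print(nested_body)
```

Since the hits are user documents, from and size paginate users directly.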

Now I can find users with a single ElasticSearch query. But there are a few disadvantages, too.

Could you suggest the best and most commonly used way to solve this problem?

Upvotes: 0

Views: 123

Answers (1)

Prabin Meitei

Reputation: 2000

As you have indicated, it can be solved by using nested objects, but a better approach would be to use a parent-child relation.

The issues you may face with nested objects can be solved by using a parent-child relationship (consider reading the whole section, especially this) together with has_child or has_parent queries, depending on your needs.

This removes the need to index the whole object. But you will need to take memory into consideration, as Elasticsearch stores child document IDs in memory (as of now).
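As a rough sketch, a has_child search for users could be built like this. It assumes the messages are indexed as children of users under a child type (or relation name) called message; that name, the function name and the paging parameters are assumptions, not something taken from the question.

```python
def users_by_child_message_body(text, child_type="message", page=0, page_size=10):
    """Build a has_child search body that returns parent (user) documents.

    Assumption: messages are indexed as children of users, and
    `child_type` is the name the child documents were mapped under.
    """
    return {
        "from": page * page_size,
        "size": page_size,
        "query": {
            "has_child": {
                "type": child_type,
                "query": {"match": {"text": text}},
            }
        },
    }


child_body = users_by_child_message_body("notes about the meeting", page=1, page_size=20)
print(child_body)
```

The hits are the parent user documents, so from and size paginate users, and a new message can be indexed as its own child document without touching the user document.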

Upvotes: 1
