NeuralCode

Reputation: 195

Elasticsearch question, should I have duplicate data along 2 different indices? Not sure how to set up the data

Edit: 3 different indices. Sorry about the title :c

I am trying to grasp elasticsearch as fast as I can but I think I've confused myself majorly here. How should I set this data up?

I have 3 major searches:

 1: Search by pokemon name. Eg: Show all Charizard in the system.

 2: Search by trainer name. Eg: Show all of John Doe's pokemon/checkins at the pokecenter.

 3: Search by checkins at the pokecenter.

Should each of these be in its own separate index? I come primarily from an SQL background, so my instinct is to have separate tables for all of these. But that isn't how elasticsearch works... so I am really confused here.

Should I have a separate index for each pokemon?

And then another separate index for each trainer?

And then another separate index for each checkin at the pokecenter?


Query return examples

1: Search by pokemon name.

{
     "1": {
            "id": 9239329,
            "pokeId": 6,
            "name": "Charizard",
            "trainerId": 2932
     }
}

2: Search by trainer name

{
     "1": {
            "id": 2932,
            "name": "John Doe",
            "pokemon": [
                        9239329
            ]
     }
}

3: Search by checkins at the pokecenter.

{
     "1": {
            "id": 3232,
            "date": "11/11/1111",
            "pokemon": [
                        9239329
            ],
            "trainerId": 2932
     }
}

But if I have a separate index for EACH of these... while that would be fast, wouldn't that just be crazy horrendous data duplication?

Upvotes: 0

Views: 74

Answers (1)

Shachaf.Gortler

Reputation: 5745

It depends on the scope of the project:

  1. The ideal way is to have each one as its own separate index. This allows you to scale them differently if needed, move them to another cluster, and give each one different replica settings.

  2. The quick way is to have the checkins as an index, with the trainer as a nested object and, under that, the pokemon as a nested object. Note: nested queries are slower, and writing the queries to return exactly what you want is a little trickier.
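To make the second option more concrete, here is a rough sketch of what the mapping and two of the searches could look like, written as plain Python dictionaries that mirror the JSON request bodies. The index name `checkins`, the field names, the field types, and the helper functions `trainer_query` and `pokemon_query` are all assumptions based on the examples in the question, not something taken from a real schema. No Elasticsearch client is used here; the dictionaries would be passed as request bodies by whichever client you use.

```python
import json

# Hypothetical index name, chosen for illustration only.
CHECKINS_INDEX = "checkins"

# Assumed mapping for option 2: one document per check-in, with the trainer
# as a nested object and the trainer's pokemon nested inside the trainer.
CHECKINS_MAPPING = {
    "mappings": {
        "properties": {
            "id": {"type": "long"},
            "date": {"type": "date"},
            "trainer": {
                "type": "nested",
                "properties": {
                    "id": {"type": "long"},
                    "name": {"type": "text"},
                    "pokemon": {
                        "type": "nested",
                        "properties": {
                            "id": {"type": "long"},
                            "pokeId": {"type": "integer"},
                            "name": {"type": "text"},
                        },
                    },
                },
            },
        }
    }
}


def trainer_query(trainer_name):
    """Build a search body matching check-ins by trainer name (hypothetical helper)."""
    return {
        "query": {
            "nested": {
                "path": "trainer",
                "query": {"match": {"trainer.name": trainer_name}},
            }
        }
    }


def pokemon_query(pokemon_name):
    """Build a search body matching check-ins by pokemon name (hypothetical helper).

    The pokemon live two nested levels down, so the nested query is wrapped twice.
    """
    return {
        "query": {
            "nested": {
                "path": "trainer",
                "query": {
                    "nested": {
                        "path": "trainer.pokemon",
                        "query": {"match": {"trainer.pokemon.name": pokemon_name}},
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the request body for "show all Charizard in the system".
    print(json.dumps(pokemon_query("Charizard"), indent=2))
```

The double wrapping in `pokemon_query` is a good illustration of why the queries get trickier with this layout: every extra nested level adds another `nested` clause, and the search returns whole check-in documents rather than individual pokemon.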

Upvotes: 1
