Torben

Reputation: 1290

Hierarchical/parent-child mapping with Elasticsearch

I have data that looks like this:

{
   "id": 321,
   "name": "Parent 1",
   "childs":[
      {
         "id": 3211,
         "name": "Child 1",
         "data": "Some data"
      },
      {
         "id": 3212,
         "name": "Child 2"
      },
      {
         "id": 3213,
         "name": "Child 3"
      }
   ]
}

Now I want to query Elasticsearch for all children that have no "data" field, to get a result like this:

[
   {
      "id":321,
      "childs":[
         3212,
         3213
      ]
   }
]

I read about nested objects and parent-child relations in the documentation. I think I need a parent-child relation, with an _id for each child, and a query that returns only the _ids rather than the sources.

Can anybody help me with this?

Thank you

Upvotes: 0

Views: 1504

Answers (1)

Andrei Stefan

Reputation: 52368

You definitely need parent-child. Here's the mapping:

POST /my_index
{
  "mappings": {
    "parents": {
      "properties": {
        "name": {
          "type": "string"
        }
      }
    },
    "children": {
      "_parent": {
        "type": "parents"
      },
      "properties": {
        "id": {
          "type": "integer"
        },
        "name": {
          "type": "string"
        },
        "data": {
          "type": "string"
        }
      }
    }
  }
}

Sample data:

POST /my_index/parents/_bulk
{"index":{"_id": 1}}
{"name":"Parent 1"}
{"index":{"_id": 2}}
{"name":"Parent 2"}
{"index":{"_id": 3}}
{"name":"Parent 3"}

POST /my_index/children/_bulk
{"index":{"_id": 1, "parent": 1}}
{"name":"Child 1","data":"Some data"}
{"index":{"_id": 2, "parent": 1}}
{"name":"Child 2"}
{"index":{"_id": 3, "parent": 1}}
{"name":"Child 3"}
{"index":{"_id": 4, "parent": 2}}
{"name":"Child 4","data":"Some data 4"}
{"index":{"_id": 5, "parent": 2}}
{"name":"Child 5","data":"Some data 5"}
{"index":{"_id": 6, "parent": 2}}
{"name":"Child 6"}
{"index":{"_id": 7, "parent": 3}}
{"name":"Child 7","data":"Some data 7"}
{"index":{"_id": 8, "parent": 3}}
{"name":"Child 8","data":"Some data 8"}
{"index":{"_id": 9, "parent": 3}}
{"name":"Child 9","data":"Some data 9"}

And the query itself:

GET my_index/children/_search
{
  "size": 0, 
  "query": {
    "filtered": {
      "filter": {
        "missing": {
          "field": "data"
        }
      }
    }
  },
  "aggs": {
    "missing_data": {
      "terms": {
        "field": "_parent"
      },
      "aggs": {
        "top_children": {
          "top_hits": {
            "_source": "_parent"
          }
        }
      }
    }
  }
}

The exact format you want is not possible, but the above query returns the ID of each parent and, for each parent, the IDs of its children that have no "data" field.
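To get closer to the requested shape, the aggregation response can be reshaped on the client. Here is a minimal Python sketch, not part of the original answer: the function name `group_children_by_parent` and the abbreviated `sample_response` are hypothetical, and it assumes the response follows the usual aggregations / buckets / top_hits layout. Depending on the Elasticsearch version, the bucket key may be the bare parent ID or a "type#id" string, so the sketch strips any prefix up to the first "#".

```python
def group_children_by_parent(response):
    """Reshape the 'missing_data' terms aggregation into a list of
    {"id": <parent id>, "childs": [<child ids>]} dictionaries."""
    result = []
    for bucket in response["aggregations"]["missing_data"]["buckets"]:
        # Assumption: the key is either "1" or "parents#1"; keep the part
        # after the first "#". Parent IDs that contain "#" would need
        # different handling.
        parent_id = str(bucket["key"]).split("#", 1)[-1]
        # Each top_hits entry carries the child's own _id.
        child_ids = [hit["_id"] for hit in bucket["top_children"]["hits"]["hits"]]
        result.append({"id": parent_id, "childs": child_ids})
    return result


# Hypothetical, abbreviated response for the sample data above
# (only the fields the function reads are included).
sample_response = {
    "aggregations": {
        "missing_data": {
            "buckets": [
                {
                    "key": "1",
                    "doc_count": 2,
                    "top_children": {"hits": {"hits": [{"_id": "2"}, {"_id": "3"}]}},
                },
                {
                    "key": "2",
                    "doc_count": 1,
                    "top_children": {"hits": {"hits": [{"_id": "6"}]}},
                },
            ]
        }
    }
}

print(group_children_by_parent(sample_response))
```

Note that the IDs come back as strings, since _id is a string. Also, by default top_hits returns only three hits per bucket and the terms aggregation returns only ten buckets, so a "size" would have to be set on both if parents can have more matching children or if there are more parents.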

Upvotes: 2
