Dmitry

Reputation: 1

How to enrich multiple values in Elasticsearch using an ingest pipeline?

I have an Elasticsearch document with a field that holds multiple values, and I want to enrich each of those values using an ingest pipeline. Here is an example of my document structure:

{
  "http.rule.id": ["b41912851a064912b2a589f3a21d0c57", "82045c5fd30045d893272fd8b74e93d6"]
}

There is also an index that holds the enrichment data:

{
  "_index": "enrich-content",
  "_id": "b41912851a064912b2a589f3a21d0c57",
  "_score": 1,
  "_source": {
    "description": "description1",
    "name": "name1",
    "location": "location1",
    "id": "b41912851a064912b2a589f3a21d0c57"
  }
},
{
  "_index": "enrich-content",
  "_id": "82045c5fd30045d893272fd8b74e93d6",
  "_score": 1,
  "_source": {
    "description": "description2",
    "name": "name2",
    "location": "location2",
    "id": "82045c5fd30045d893272fd8b74e93d6"
  }
},
{
  "_index": "enrich-content",
  "_id": "eda384884ff545ae957bfccf47aaba1f",
  "_score": 1,
  "_source": {
    "description": "description3",
    "name": "name3",
    "location": "location3",
    "id": "eda384884ff545ae957bfccf47aaba1f"
  }
}

I have two ingest pipelines for enrichment:

Enrichment pipeline 1:

{
  "enrich": {
    "field": "http.rule.id",
    "policy_name": "policy_enrich",
    "target_field": "http.description",
    "ignore_missing": true,
    "ignore_failure": true
  }
}

Enrichment pipeline 2:

{
  "foreach": {
    "field": "http.rule.id",
    "processor": {
      "enrich": {
        "field": "_ingest._value",
        "policy_name": "cf-firewall-content",
        "target_field": "http.description",
        "ignore_missing": true,
        "ignore_failure": true
      }
    },
    "ignore_failure": true
  }
}

However, both pipelines only enrich by the last ID in the array (82045c5fd30045d893272fd8b74e93d6).

What I want to achieve is to enrich all the IDs and have a result like this: http.description: ["description1", "description2"]

Can someone please help me modify my Elasticsearch ingest pipeline configuration to achieve this?

Upvotes: 0

Views: 604

Answers (1)

Val

Reputation: 217464

You can use the max_matches setting of the enrich processor to match all elements of your array. It defaults to 1, which is why only one element is matched. Set it to at least the number of ids you expect per document (the documented maximum is 128):

{
  "enrich": {
    "field": "http.rule.id",
    "policy_name": "source-policy",
    "target_field": "http.description",
    "max_matches": 2,
    "ignore_missing": true,
    "ignore_failure": true
  }
}

The http.description target field then holds an array with one element per matching id:

{
  "http" : {
    "rule" : {
      "id" : [
        "b41912851a064912b2a589f3a21d0c57",
        "82045c5fd30045d893272fd8b74e93d6"
      ]
    },
    "description" : [
      {
        "description" : "description1",
        "id" : "b41912851a064912b2a589f3a21d0c57"
      },
      {
        "description" : "description2",
        "id" : "82045c5fd30045d893272fd8b74e93d6"
      }
    ]
  }
}
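
One way to check this before indexing anything is the simulate pipeline API. The following is only a sketch of a request body for POST _ingest/pipeline/_simulate. It assumes the enrich policy named source-policy already exists and has been executed, and that the ids sit in a nested http.rule.id field as in the output above:

```json
{
  "pipeline": {
    "processors": [
      {
        "enrich": {
          "field": "http.rule.id",
          "policy_name": "source-policy",
          "target_field": "http.description",
          "max_matches": 2
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "http": {
          "rule": {
            "id": [
              "b41912851a064912b2a589f3a21d0c57",
              "82045c5fd30045d893272fd8b74e93d6"
            ]
          }
        }
      }
    }
  ]
}
```

ignore_failure is left out here on purpose so that any lookup error shows up in the simulate response. If your documents store the key as a literal dotted name ("http.rule.id" as a single field), a dot_expander processor in front of the enrich processor may be needed, depending on your Elasticsearch version.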

If you need a different shape, such as a plain list of description strings, you can add a script processor after the enrich processor to reshape the array.
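
For instance, to get the http.description: ["description1", "description2"] shape asked for in the question, a Painless script along these lines could follow the enrich processor. Treat it as a sketch rather than a tested configuration. It assumes every match carries a description field, as in the output above:

```json
{
  "script": {
    "lang": "painless",
    "description": "Replace the array of matched objects with a plain array of description strings",
    "if": "ctx.http?.description instanceof List",
    "source": "def out = []; for (def m : ctx.http.description) { out.add(m['description']); } ctx.http.description = out;"
  }
}
```

The if condition skips documents where nothing was matched. The descriptions come out in the order the enrich lookup returned the matches, which may not be the order of the ids in http.rule.id.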

Upvotes: 1
