pcejrowski

Reputation: 622

Druid with Kafka Ingestion: filtering data

Is it possible to filter data by dimension value during ingestion from Kafka to Druid?

For example, given a dimension named version that can take the values v1, v2, and v3, I would like to load only the rows where version is v2.

I realize this can be done with Spark, Flink, or Kafka Streams, but maybe there is an out-of-the-box solution.

Upvotes: 2

Views: 1115

Answers (2)

LifeQuery

Reputation: 3282

You can do this with transformSpec during ingestion.
http://druid.io/docs/latest/ingestion/transform-spec.html

Per the documentation:

Transform specs allow Druid to filter and transform input data during ingestion.

Any of Druid's query filters can be used as the filter in a transformSpec.

Example usage with NOT filter:

"transformSpec": {
  "filter": {
    "type": "and",
    "fields": [
      {
        "type": "not",
        "field": {
          "type": "selector",
          "dimension": "my_dimension",
          "value": "filter_me"
        }
      },
      {
        "type": "not",
        "field": {
          "type": "selector",
          "dimension": "my_dimension",
          "value": "filter_me_also"
        }
      }
    ]
  },
  "transforms": []
}
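
For the case in the question, keeping only rows whose version is v2, a single selector filter should be enough. This is a minimal sketch, assuming the dimension is literally named version and that the transformSpec sits inside the dataSchema of the Kafka supervisor spec:

```json
"transformSpec": {
  "filter": {
    "type": "selector",
    "dimension": "version",
    "value": "v2"
  },
  "transforms": []
}
```

Rows that do not match the filter are dropped at ingestion time. To keep several values instead of one, an "in" filter with a "values" array should work the same way.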

Upvotes: 3

Slim Bouguerra

Reputation: 359

This is not possible from the Druid side; you need to filter the data beforehand.

Upvotes: 1
