Jay

Reputation: 2244

How to exclude fields from being indexed in ElasticSearch?

I am trying to use ElasticSearch to store large sets of data. Most of the data will be searchable; however, some fields are there only so that the data is stored and returned upon request.

Here is my mapping

{
  "mappings": {
    "properties": {
      "amenities": {
        "type": "completion"
      },
      "summary": {
        "type": "text"
      },
      "street_number": {
        "type": "text"
      },
      "street_name": {
        "type": "text"
      },
      "street_suffix": {
        "type": "text"
      },
      "city": {
        "type": "text",
        "fields": {
          "raw": { 
            "type":  "keyword"
          }
      },
      "state_or_province": {
        "type": "text"
      },
      "postal_code": {
        "type": "text"
      },
      "mlsid": {
        "type": "text"
      },
      "source_id": {
        "type": "text"
      },
      "status": {
        "type": "keyword"
      },
      "type": {
        "type": "keyword"
      },
      "subtype": {
        "type": "keyword"
      },
      "year_built": {
        "type": "short"
      },
      "community": {
        "type": "keyword"
      },
      "elementary_school": {
        "type": "keyword"
      },
      "middle_school": {
        "type": "keyword"
      },
      "jr_high_school": {
        "type": "keyword"
      },
      "high_school": {
        "type": "keyword"
      },
      "area_size": {
        "type": "double"
      },
      "lot_size": {
        "type": "double"
      },
      "bathrooms": {
        "type": "double"
      },
      "bedrooms": {
        "type": "double"
      },
      "listed_at": {
        "type": "date"
      },
      "price": {
        "type": "double"
      },
      "sold_at": {
        "type": "date"
      },
      "sold_for": {
        "type": "double"
      },
      "total_photos": {
        "type": "short"
      },
      "formatted_addressLine": {
        "type": "text"
      },
      "formatted_address": {
        "type": "text"
      },
      "location": {
        "type": "geo_point"
      },
      "price_changes": {
        "type": "object"
      },
      "fields": {
        "type": "object"
      },
      "deleted_at": {
        "type": "date"
      },
      "is_available": {
        "type": "boolean"
      },
      "is_unable_to_find_coordinates": {
        "type": "boolean"
      },
      "source": {
        "type": "keyword"
      }
    }
  }
}

The fields and price_changes properties are there in case the user wants to read that info, but that info should not be searchable or indexed. The fields property holds a large list of key-value pairs, whereas price_changes holds multiple objects of the same type.

Currently, when I attempt to bulk create records, I get the error Limit of total fields [1000] has been exceeded. I am guessing this happens because every key-value pair in the fields collection is counted as a field in Elasticsearch.

How can I store the fields and price_changes objects as non-searchable data, so that they are neither indexed nor counted toward the field limit?

Upvotes: 1

Views: 1725

Answers (1)

Barkha Jain

Reputation: 768

You could use the enabled setting at the field level to keep those fields in the stored document without indexing them. See https://www.elastic.co/guide/en/elasticsearch/reference/current/enabled.html

  "price_changes": { 
    "type": "object",
    "enabled": false
  }
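
The same setting should work for the fields object from the question. A minimal sketch of both properties, assuming the rest of the mapping stays as posted and only these two entries change:

```json
{
  "mappings": {
    "properties": {
      "price_changes": {
        "type": "object",
        "enabled": false
      },
      "fields": {
        "type": "object",
        "enabled": false
      }
    }
  }
}
```

With enabled set to false, Elasticsearch skips parsing the contents of the object. The JSON is still kept in _source and returned with the document, but the keys inside it are not added to the mapping, so they should no longer count toward the total fields limit.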

NOTE: Are you able to create an index using the mapping you gave in the question? It gives me a syntax error (Duplicate key) at the "type" field. I think you are missing a closing brace for the "city" field.
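
For reference, a sketch of how the "city" property might look with the closing brace added, assuming raw was meant to be its only sub-field:

```json
"city": {
  "type": "text",
  "fields": {
    "raw": {
      "type": "keyword"
    }
  }
}
```

Without that brace, the properties that follow end up nested inside the "fields" block of "city", which would explain the duplicate "type" key.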

Upvotes: 2
