Marek

Reputation: 17

elassandra geopoint mapping fails

I cannot map a geo_point with Elassandra and the express-cassandra driver.

geopoint UDT:

manageESIndex: true,
udts: {
  geopoint: {
    lat: 'double',
    lon: 'double'
  }
}

cassandra table elastic mapping:

location: {
  type: 'frozen',
  typeDef: '<geopoint>'
}
...
es_index_mapping: {
  discover: '.*',
  properties: {
    "location": {
      "type": "geo_point"
    }
  }
}

The resulting elastic mapping is:

"location": {
    "type": "nested",
    "cql_collection": "singleton",
    "cql_udt_name": "geopoint",
    "properties": {
        "lat": {
            "type": "double",
            "cql_collection": "singleton"
        },
        "lon": {
            "type": "double",
            "cql_collection": "singleton"
        }
    }
}

As can be seen, the mapping does not produce a geo_point but a nested object with separate lat and lon double fields. A distance search does not work against that mapping. It seems that when 'discover' is used, the entries under 'properties' are ignored.
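For reference, this is a minimal sketch of the kind of distance search meant here, written as a standard Elasticsearch geo_distance filter body. The helper name geoDistanceQuery, the coordinates and the 10km radius are made up for illustration; the point is only that the query targets the location field as a whole, which requires that field to be mapped as geo_point.

```javascript
// Hypothetical helper: builds the request body for a geo_distance filter.
// The target field must be mapped as geo_point for Elasticsearch to accept it.
function geoDistanceQuery(field, lat, lon, distance) {
  return {
    query: {
      bool: {
        filter: {
          geo_distance: {
            distance: distance,              // radius, e.g. '10km'
            [field]: { lat: lat, lon: lon }  // centre of the search
          }
        }
      }
    }
  };
}

// Illustrative values only.
const body = geoDistanceQuery('location', 52.23, 21.01, '10km');
console.log(JSON.stringify(body, null, 2));
```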

Upvotes: 1

Views: 175

Answers (2)

eranga

Reputation: 557

You can find a real example of using geo_point data type with elassandra from here

Upvotes: 0

Alexis Wilke

Reputation: 20818

I ran into a similar problem with a different type/column.

The documentation says that when you use:

discover: '.*'

it discovers all the columns, but you can override individual columns with:

properties: {
  <column-name>: {
    type: '<specific-type>',
    cql_collection: 'singleton'
  }
}

In practice, the override is not applied. (Tested with Elassandra 6.2.3.8, the latest version at the time.)

I found it much easier to spell out all the columns, build my mapping that way, and never use the discover: ... field.

However, I do use discover: ... once to get the default mappings and an idea of what to put in my own definitions. The defaults are often somewhat wrong and need small tweaks. In the end, it worked great for me once I removed the discover: '.*' line.
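The workflow above could be sketched roughly as follows: take the properties that discover generated, replace the entries that came out wrong, and use the result as a mapping with no discover key. The helper buildExplicitMapping is hypothetical, and the discovered object is simply copied from the question. Whether Elassandra accepts geo_point on top of this particular UDT is an assumption that has not been verified here.

```javascript
// Hypothetical helper: merges hand-written overrides into a discovered
// "properties" object and returns a mapping without a `discover` key.
function buildExplicitMapping(discovered, overrides) {
  const properties = {};
  for (const [name, def] of Object.entries(discovered)) {
    properties[name] = Object.assign({}, def);
  }
  for (const [name, def] of Object.entries(overrides)) {
    // An override replaces the discovered definition entirely.
    properties[name] = Object.assign({}, def);
  }
  return { properties: properties };
}

// Mapping that discover produced, copied from the question.
const discovered = {
  location: {
    type: 'nested',
    cql_collection: 'singleton',
    cql_udt_name: 'geopoint',
    properties: {
      lat: { type: 'double', cql_collection: 'singleton' },
      lon: { type: 'double', cql_collection: 'singleton' }
    }
  }
};

// Assumption: the column should be indexed as geo_point instead.
const es_index_mapping = buildExplicitMapping(discovered, {
  location: { type: 'geo_point', cql_collection: 'singleton' }
});
console.log(JSON.stringify(es_index_mapping, null, 2));
```

The resulting object would go where es_index_mapping sits in the model definition, with every other column listed explicitly as well.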

One very important detail is "cql_collection": "singleton". Without it, the default is to create columns as set<> or list<> instead of plain int, text, etc. The docs say this is because that is how Elasticsearch expects the data. I would imagine that set<> or list<> columns make things more complicated and harder to read and write, but I would have to run a benchmark to see whether they actually work better or worse.
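Since the singleton setting is easy to forget on a hand-written mapping, one option is to stamp it on every field that does not already declare a collection type. This is only a sketch: withSingleton is a hypothetical helper, and the field names in the example are invented.

```javascript
// Hypothetical helper: adds cql_collection: 'singleton' to every field that
// does not set cql_collection itself, recursing into nested properties.
function withSingleton(properties) {
  const out = {};
  for (const [name, def] of Object.entries(properties)) {
    // Keys in `def` win over the default, so explicit settings are preserved.
    const copy = Object.assign({ cql_collection: 'singleton' }, def);
    if (def.properties) {
      copy.properties = withSingleton(def.properties);
    }
    out[name] = copy;
  }
  return out;
}

// Invented example columns.
const columns = withSingleton({
  name: { type: 'keyword' },
  tags: { type: 'keyword', cql_collection: 'list' }
});
console.log(JSON.stringify(columns, null, 2));
```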

Upvotes: 2
