Louisrr
Louisrr

Reputation: 155

How do I run geospatial queries at scale with NoSQL?

I am preparing to build an Android/iOS app that will require me to make complex polygon and containment geospatial queries. I like Apache Cassandra's no single point of failure, fault tolerance and data center awareness. Cassandra does not have direct support for geospatial queries (that I am aware of) but MongoDB and Couchbase Server do. MongoDB has scaling issues and I'm not sure if Couchbase would be a better alternative than Cassandra with Solr or Elasticsearch.

Would I be making a mistake by going with Datastax Enterprise (DSE), Cassandra and Elasticsearch over Couchbase Server? Will there be a noticeable difference in load times for web pages with the Cassandra/ES back end vs. Couchbase?

Upvotes: 11

Views: 9440

Answers (4)

Mnemaudsyne
Mnemaudsyne

Reputation: 513

Aerospike just released Server Community Edition 3.7.0, which includes Geospatial Indexes as a feature.

Aerospike can now store GeoJSON objects and execute various queries, allowing an application to track rapidly changing Geospatial objects or simply ask the question of “what’s near me”. Internally, we use Google’s S2 library and Geo Hashing to encode and index these points and regions. The following types of queries are supported:

  • Points within a Region
  • Points within a Radius
  • Regions a Point is in

This can be combined with a User-Defined Function (UDF) to filter the results – i.e., to further refine the results to only include Bars, Restaurants or Places of Worship near you – even ones that are currently open or have availability. Additionally, finding the Region a point is in allows, for example, an advertiser to figure out campaign regions that the mobile user is in – and therefore place a geospatially targeted advertisement. Internally, the same storage mechanisms are used, which enables highly concurrent reads and writes to the Geospatial data or other data held on the record. Geospatial data is a lot of fun to play around with, so we have included a set of examples based on Open Street Map and Yelp Dataset Challenge data.

Geospatial is an Experimental feature in the 3.7.0 release. It’s meant for developers to try out and provide feedback. We think the APIs are good, but in an experimental feature, based on the feedback from the community, Aerospike may choose to modify these APIs by the time this feature is GA. It’s not intended for Production usage right now (though we know some developers will go directly to Production ...)

Upvotes: 6

Alvin R
Alvin R

Reputation: 49

Aerospike provides a proven highly scalable NoSQL solution. Geospatial query has recently been added, and an Early Adopter release has just been announced. You might want to check that out.

Upvotes: 3

Chris Hinshaw
Chris Hinshaw

Reputation: 7275

Redis is probably one of the best alternatives. At the current time you would need to use Redis Unstable 3.2. The performance is oustanding. I have been using this with the lettuce java client and have seen incredible results. The larger the radius will decrease performance.

http://redis.io/commands/geohash

Upvotes: 2

rs_atl
rs_atl

Reputation: 8985

You are asking quite a few questions, as has been pointed out. The provided link offers one potential answer to how generic geospatial operations could be implemented using Cassandra. I'll offer one possible answer using straightforward out-of-the-box Cassandra constructs.

  1. Using geohashes (or quad trees), or something similar, create an index of geohashes and their associated polygons. The specific relationship and level(s) of precision are dependent on your data set and use case.

  2. To determine which polygons intersect with a given point or polygon, first compute its geohash(es), then look those geohashes up in the index. For general proximity, this may be sufficient. Either way, this narrows the potential intersection points down to a manageable set.

Upvotes: 0

Related Questions