Reputation: 153
I have data with multiple dimensions, each of which is a string. For example, a Person is described by position, id, email, etc.
I want to use one such multi-dimensional datum as a key into my NoSQL database. I don't need to do any complex querying, just periodic full table scans (the table will be small). What are some ways / best practices for formatting this data as a key?
I have considered colon delimiting (e.g. position:id:email), but the resulting keys are hard to read and inflexible. I've also considered hashing this colon-delimited string. Is there a good hash function for this kind of thing? Or are there any completely different suggestions?
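To make the two options concrete, here is a rough sketch of what I have in mind. The helper names `make_key` and `hash_key` are made up for illustration, and SHA-256 is only one possible choice of hash, not something I have settled on. The escaping step is my assumption about how to keep a colon inside a field from being confused with the delimiter.

```python
import hashlib


def make_key(*fields: str) -> str:
    """Join fields with ':' after escaping backslashes and colons in each field.

    Escaping keeps ("a:b", "c") and ("a", "b:c") from producing the same key.
    """
    escaped = (f.replace("\\", "\\\\").replace(":", "\\:") for f in fields)
    return ":".join(escaped)


def hash_key(*fields: str) -> str:
    """Return a fixed-length hex digest of the delimited key (SHA-256 assumed)."""
    return hashlib.sha256(make_key(*fields).encode("utf-8")).hexdigest()


# Hypothetical Person attributes: position, id, email
readable = make_key("engineer", "42", "alice@example.com")
opaque = hash_key("engineer", "42", "alice@example.com")
```

The delimited form stays readable during a full table scan, while the hashed form has a fixed length but cannot be decoded back into its fields.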
Thanks in advance!
Upvotes: 0
Views: 270
Reputation: 11
Storing multi-dimensional data under a one-dimensional key is a challenging task in key-value stores / NoSQL databases. Projects like MD-HBase or GeoMesa do exist; they place the multi-dimensional data into an n-dimensional space and use a space-filling curve to encode the location of the data into a one-dimensional key. However, most of these projects are limited to 2-dimensional spatial data and cannot handle string attributes.
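To illustrate the space-filling curve idea, here is a minimal sketch of a Z-order (Morton) encoding, which interleaves the bits of each coordinate into a single integer. This is only one kind of space-filling curve and is not taken from the code of MD-HBase or GeoMesa; the function name `z_order_key` and the 16-bit default are assumptions made for the example. It also shows the limitation mentioned above: the inputs must be non-negative integers, not strings.

```python
from typing import Sequence


def z_order_key(coords: Sequence[int], bits: int = 16) -> int:
    """Interleave the lowest `bits` bits of each coordinate into one integer.

    Bit `b` of dimension `d` ends up at position `b * n + d` of the key,
    where `n` is the number of dimensions.
    """
    n = len(coords)
    key = 0
    for bit in range(bits):
        for dim, value in enumerate(coords):
            key |= ((value >> bit) & 1) << (bit * n + dim)
    return key


# Two-dimensional example: the point (2, 3) becomes a single integer key.
key = z_order_key((2, 3))
```

Points that are close together in the n-dimensional space tend to receive nearby keys, which is what makes range scans over the one-dimensional key useful for spatial queries.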
Shameless plug: I have started a new open-source project called BBoxDB. BBoxDB is a distributed storage manager that is capable of handling multi-dimensional data. In BBoxDB, a bounding box is used to describe the location of multi-dimensional data in the n-dimensional space. You could map the string attributes of your Person entity to a point in the n-dimensional space and use this point as the bounding box for your data. Then BBoxDB can run queries on your data (e.g., full table scans or scans that are restricted to some dimensions). The project is at an early stage, but maybe it is interesting for you.
Upvotes: 1