ryan
ryan

Reputation: 91

How to query dynamoDb by one to many fields in no specific order?

I'm trying to think of the best way to design a table for my database when I need a search screen.

Let's say I have a search screen of lastName, ssn, firstName, address.

I want a user to be able to search one to many of these fields, in every possible combination. Could be lastName only, ssn and address, etc. Some of these fields could be null in the database.

It seems this is very complicated without a relational database.

  1. I could either create a GSI for each field, query all four of them, and filter/combine results programatically.
  2. I could create a composite key, but that gets messy when you're searching in any possible order
  3. I could scan the whole table and filter things out myself

I can have tens of thousands of records at any given time. Each record contains a field with a very large JSON file in it that represents every UI field I have in my SPA, so each individual result could be a bit large. Although I suppose I don't need the JSON file returned on that query; just fields needed to populate a results table so the user can click a row, and then we look up JSON from there.

What would be the most performant, preferably inexpensive way to accomplish my needs using DynamoDB? Is there a simpler way other than my 3 proposed solutions? I previously used MongoDB queries in a DocumentDB, and could use AggregationOperations to do the search and Project a response. But, documentDB does not support an active-active global configuration, so I don't want to use it anymore for failover purposes.

Upvotes: 0

Views: 80

Answers (1)

M R
M R

Reputation: 1

Given there are 4 fields there are 16 possible query combinations. However, many of these combinations are redundant because there are fields with low cardinality. For example, any search that includes SSN will return one result or none. So it's sufficient to have one index by SSN to support all such queries. A query by SSN only will run on the index. A query with SSN+First Name will return the record with that SSN (if exists) and the application can further check if the returned First Name from the record matches the user-specified input.

Similar logic applies to queries by Address. Since Address is a low cardinality field, it's sufficient to have an index by Address. Any queries involving Address then can be post-processed by application to filter by other criteria.

This leaves us with First Name and Last name, which should support combinations.

Here's how a DynamoDB schema might look like. The schema requires only 3 additional indexes.

Read-optimized schema
| :table-cnt | :table |     :pk |    :sk | :entity |
|------------+--------+---------+--------+---------|
|       1000 |   MAIN |     ssn |        |  Person |
|       1000 |   GSI1 |  f.name |        |  Person |
|       1000 |   GSI2 |  l.name | f.name |  Person |
|       1000 |   GSI3 | address |        |  Person |

Queries    
|     :query | :query-tbl | :query-cost |
|------------+------------+-------------|
|     by ssn |       MAIN |           1 |
|  by f.name |       GSI1 |         100 |
|  by l.name |       GSI2 |          10 |
|  f.+l.name |       GSI2 |           5 |
| by address |       GSI3 |           5 |

Note: This schema is autogenerated by a tool (not a generative AI). I made certain assumptions about data distribution that may not apply to your specific case, so consider your data characteristics when adapting this schema..

Upvotes: 0

Related Questions