tiga05
tiga05

Reputation: 13

Best way to do begins_with query on partition key and sort key on dynamodb

I'm currently diving into DynamoDB and I'm struggling with a conceptual problem. The problem, simplified in SQL terms, is as follows: I have a table with two columns. The first column contains guaranteed unique entries and should therefore ideally be used for the partition key. I want to perform a begins_with search on the two columns. The rows that meet the condition should be returned to me as a list.

In SQL language, the query would be something like this:

Search string: AAB

SELECT * FROM my_table WHERE column1 LIKE 'AAB%' OR column2 LIKE 'AAB%';

How should I structure my DynamoDB table (partition key? sort key? possibly additional GSI?) to implement this query (and thats the only query I will do on this table!) efficiently and without scanning?"

My current research told me I can not execute begins_with queries on the partition key. So this key is pretty useless for my usecase. And it also looks like I can not do a begins_with operation on the sort key without giving a pk. So it looks like I need to specify two GSIs and set column1 and column2 as GSIPK on each of this indexes, to be able to do begins_with operations on that. And even then it looks like I need to execute two queries. One for the column1 and one for the column2.

In my opinion, it seems like I'm using DynamoDB incorrectly or at least in a way its not made for.

Is there a better way to achieve that?

Upvotes: 0

Views: 648

Answers (1)

tiga05
tiga05

Reputation: 13

For now I solved the problem in the following way: In my application, I need to search not only for this entity with around 2.000 entries, I also need to search for other entities of other types. But its always the same pattern: "Try to find something which begins_with the following: $searchTerm"

So I decided to create a GSI "SearchIndex" specifically for this kind of search operations. the pk (called pk_type) here is the type of the entity. For example if you search for cities, you would choose "city" as pk_type. The sort key (sk_type) would be the search term, f.e. a concrete city like "Berlin". To fit the "OR column2 LIKE" term (like written in my question), I decided to insert another identical entry of the entity except with the string from column2 in the sort key. One single request on the pk_type + sk_type will search on both columns now. They share the same main pk of the table but a different main sk. So they have some kind of relation and can be retrieved/deleted/updated via the same pk.

Or to write it in specific Typescript AWS CDK code:

  var table = new TableV2(this, 'MYTABLE', {
  partitionKey: { name: 'pk', type: AttributeType.STRING },
  sortKey: { name: 'sk', type: AttributeType.STRING },
  globalSecondaryIndexes: [
    {
      indexName: 'SearchIndex',
      partitionKey: { name: 'pk_type', type: AttributeType.STRING },
      sortKey: { name: 'sk_type', type: AttributeType.STRING },
    }
  ]
});

Another option would have been to use the main pk/sk for the search operation. But I think this will limit the way I can integrate other query patterns in the future and with this approach I am flexible to add any entry to a group of searchable entities at any time.

Upvotes: 0

Related Questions