Harts

Reputation: 4093

DynamoDB How to design and query multiple fields

I have an item like this

{
  "date": "2019-10-05",
  "id": "2",
  "serviceId": "1",
  "time": {
    "endTime": "1300",
    "startTime": "1330"
  }
}

Right now my design is:

primary key --> id
Global secondary index --> partition key : serviceId
                       --> sort key : date

With this design,

* I can query the id
* I can query serviceId and range of date

I'd like to be able to query such that I can retrieve all items where

* serviceId = 1 AND
* date = "yyyy-mm-dd" AND
* time = {
           "endTime": "1300",
           "startTime": "1330"
         }

I'd still like to be able to run the two existing queries (query by id, and query by serviceId and a range of dates).

Is there a way to do this? One approach I was considering is to create a new field that combines all the data and use it as an index, e.g. combinedField: "1_yyyy-mm-dd_1300_1330",

make that the partition key of a global secondary index, and query on it.

I'm just not sure whether this is the right way to do it, or whether there is a better or best-practice approach.

Thank you

Upvotes: 3

Views: 8285

Answers (2)

Harish Nair

Reputation: 189

You could either use FilterExpression or composite sort keys.

FilterExpression

Here you retrieve the items from the GSI you described by specifying 'serviceId' and 'date' in the key condition, and then specifying time.startTime and time.endTime in the 'FilterExpression'. Sample Python code using boto3:

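from boto3.dynamodb.conditions import Attr, Key

# Query the GSI (the index name below is a placeholder) and filter on the nested time map.
response = table.query(
    IndexName='serviceId-date-index',
    KeyConditionExpression=Key('serviceId').eq('1') & Key('date').eq('2019-10-05'),
    FilterExpression=Attr('time.endTime').eq('1300') & Attr('time.startTime').eq('1330')
)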

The drawback with this method is that every item matching the key condition (partition key and sort key) is read first, and only then are the results filtered. So you are charged for everything the key condition matches, not for what the filter returns.

e.g. if 1000 items have 'serviceId' as 1 and 'date' as '2019-10-05' but only 10 items have 'time.startTime' as 1330, you will still be charged for reading the 1000 items even though only 10 items are returned after the FilterExpression is applied.
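One way to observe this in practice is to compare the `Count` and `ScannedCount` fields of a Query response: `ScannedCount` is the number of items evaluated before the filter, and `Count` is the number left afterwards. A minimal sketch, where `filter_efficiency` is a hypothetical helper name and the response dict is a hand-written stand-in rather than real output:

```python
def filter_efficiency(response):
    """Return the fraction of evaluated items that survived the FilterExpression."""
    scanned = response["ScannedCount"]  # items matched by the key condition, before filtering
    returned = response["Count"]        # items left after the filter was applied
    return returned / scanned if scanned else 1.0


# Stand-in for a response matching the 1000-vs-10 example above (not real output).
sample_response = {"Count": 10, "ScannedCount": 1000}
print(filter_efficiency(sample_response))
```

A ratio far below 1 suggests the filter is discarding most of what was read, which is usually the point at which a composite sort key becomes worth considering.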

Composite Sort Key

I believe this is the method you mentioned in the question. Here you will need to make an attribute as

'yyyy-mm-dd_startTime_endTime'

and use this as the sort key in your GSI. Now your items will look like this:

{
    "date": "2019-10-05",
    "id": "2",
    "serviceId": "1",
    "time": {
        "endTime": "1300",
        "startTime": "1330"
    },
    "date_time": "2019-10-05_1330_1300"
}
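The composite attribute has to be written by the application whenever an item is stored. A minimal sketch of deriving it from the existing fields, assuming the `date_startTime_endTime` layout shown above; `with_date_time` is a hypothetical helper name:

```python
def with_date_time(item, sep="_"):
    """Return a copy of the item with the composite 'date_time' attribute added."""
    parts = [item["date"], item["time"]["startTime"], item["time"]["endTime"]]
    return {**item, "date_time": sep.join(parts)}


item = {
    "date": "2019-10-05",
    "id": "2",
    "serviceId": "1",
    "time": {"endTime": "1300", "startTime": "1330"},
}
print(with_date_time(item)["date_time"])  # 2019-10-05_1330_1300
```

The returned dict could then be passed to a put call; the original item is left unmodified.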

Your GSI will have 'serviceId' as partition key and 'date_time' as sort key. You can then query a date range as:

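# The upper bound carries a maximal time suffix so that items on 2019-10-05 itself are included.
response = table.query(
    IndexName='serviceId-date_time-index',  # placeholder GSI name
    KeyConditionExpression=Key('serviceId').eq('1') & Key('date_time').between('2019-07-05', '2019-10-05_9999_9999')
)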
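One detail to watch with range queries on the composite key: sort keys are compared as strings, so a bare date such as '2019-10-05' sorts before every key on that date ('2019-10-05' < '2019-10-05_1330_1300'), and using it as the upper bound would leave the last day out. A sketch of computing bounds that cover whole days, assuming times are four-digit strings; `date_range_bounds` and `in_range` are hypothetical names, and `in_range` only mimics the inclusive comparison locally:

```python
def date_range_bounds(start_date, end_date, sep="_"):
    """Return (low, high) sort-key bounds covering whole days, inclusive."""
    low = start_date                        # sorts before any key on start_date
    high = f"{end_date}{sep}9999{sep}9999"  # sorts at or after any key on end_date
    return low, high


def in_range(key, low, high):
    """Mimic an inclusive BETWEEN comparison on strings."""
    return low <= key <= high


low, high = date_range_bounds("2019-07-05", "2019-10-05")
keys = [
    "2019-07-04_0900_1000",
    "2019-07-05_0900_1000",
    "2019-10-05_1330_1300",
    "2019-10-06_0900_1000",
]
print([k for k in keys if in_range(k, low, high)])
```

Only the two middle keys fall inside the bounds; the same `low` and `high` values could be passed to `between` on the 'date_time' sort key.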

For the query where date, start and end time are specified, you can query as:

response = table.query(
    IndexName='serviceId-date_time-index',  # placeholder GSI name
    KeyConditionExpression=Key('serviceId').eq('1') & Key('date_time').eq('2019-10-05_1330_1300')
)

This approach won't work if you need a range of dates together with a specific start and end time, i.e. you can't use the key condition alone to find items in a date range that also have a particular start and end time. In that case you would have to use a FilterExpression.

Upvotes: 6

Itay Maman

Reputation: 30723

Yes, the solution you suggested (add a new field that combines the fields and define a GSI on it) is the standard way to achieve that. Make sure the character you use for concatenation is reserved as a separator, i.e., it cannot appear in any of the individual fields you combine.
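The separator constraint can be enforced at write time rather than trusted. A minimal sketch, where `join_key` is a hypothetical helper and '_' is assumed as the separator to match the question's example:

```python
def join_key(parts, sep="_"):
    """Join string parts into one key, rejecting parts that contain the separator."""
    for part in parts:
        if sep in part:
            raise ValueError(f"separator {sep!r} appears in field value {part!r}")
    return sep.join(parts)


print(join_key(["1", "2019-10-05", "1330", "1300"]))  # 1_2019-10-05_1330_1300
```

Failing loudly here avoids ambiguous keys: without the check, two different field combinations could concatenate to the same string.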

Upvotes: 0
