Apricot

Reputation: 3011

DynamoDB: Scan vs Query using Python

I have a table in DynamoDB with the following key attributes:

clientId : Primary partition Key
timeId : Sort Key

clientId distinguishes the records of different clients, and timeId is an epoch timestamp linked to a specific clientId. Sample rows from the table look like this:

clientId             timeId              Bucket         dateColn
0000000028037c08     1544282940.0495     MyAWSBucket    1544282940
0000000028037c08     1544283640.119842   MyAWSBucket    1544283640

I am using the following code to fetch the records:

import argparse

import boto3
from boto3.dynamodb.conditions import Key

ap = argparse.ArgumentParser()
ap.add_argument("-c","--clientId",required=True,help="name of the client")
ap.add_argument("-st","--startDate",required=True,help="start date to filter")
ap.add_argument("-et","--endDate",required=True,help="end date to filter")
args = vars(ap.parse_args())

dynamodb = boto3.resource('dynamodb', region_name='us-west-1')

table = dynamodb.Table('MyAwsBucket-index')

response = table.query(
    KeyConditionExpression=Key('clientId').eq(args["clientId"]) and Key('timeId').between(args['startDate'], args['endDate'])
)

Essentially, I am trying to filter the table first by clientId and then by two timestamps: a start time and an end time. I can fetch all the records for a client, without the timestamps, using the following:

KeyConditionExpression=Key('clientId').eq(args["clientId"])

However, when I include the start and end times, I get the following error:

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element: clientId

How do I resolve this and use both clientId and the start and end times? I read that I could use scan, but I also read somewhere that scan doesn't fetch records quickly. Since my table has millions of rows, I am not sure if I should use scan. Can someone help?

Also, my start and end time inputs are integers, like the values in dateColn, whereas timeId holds floats. I am not sure whether that is causing any errors.

Upvotes: 1

Views: 1642

Answers (2)

Matt

Reputation: 11

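An obvious issue with your query is that you are using and instead of &. In Python, the expression A and B evaluates to B whenever A is truthy, so using and discards the clientId condition and sends only the timeId condition. That is why the error reports clientId as missing.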
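The difference between and and & can be reproduced without touching DynamoDB. The sketch below uses a hypothetical Cond class as a stand-in for a boto3 condition object (it is not part of boto3); it only assumes that the real condition objects overload & and are truthy, as ordinary Python objects are by default.

```python
class Cond:
    """Hypothetical stand-in for a boto3 condition object."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        # `&` calls this method and builds a combined condition.
        return Cond(f"({self.text} AND {other.text})")


partition = Cond("clientId = :c")
sort_range = Cond("timeId BETWEEN :s AND :e")

# `and` cannot be overloaded: because `partition` is truthy, Python
# simply returns the right-hand operand and discards the left one.
dropped = partition and sort_range

# `&` goes through __and__, so both conditions survive.
combined = partition & sort_range

print(dropped.text)
print(combined.text)
```

Applied to the question's code, the fix would be to write Key('clientId').eq(args["clientId"]) & Key('timeId').between(args['startDate'], args['endDate']).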

Upvotes: 1

Simrandeep Singh

Reputation: 547

I read that I could use scan, but I also read somewhere that scan doesn't fetch records quickly. Since my table has millions of rows, I am not sure if I should use scan.

A DynamoDB scan is a very expensive operation because it reads every item in the table, consuming a large share of the provisioned throughput. Avoid scan wherever a query can do the job.
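A query, by contrast, reads only the items under one partition key, but a single call returns at most 1 MB of data, so a wide time range may span several pages. Here is a minimal sketch of following those pages; query_all is a hypothetical helper name, and table is assumed to be a boto3 Table resource (or anything with a compatible query method).

```python
def query_all(table, **query_kwargs):
    """Collect every item matching a query, following pagination.

    `table` is assumed to behave like a boto3 Table resource: its
    query() returns a dict with "Items" and, when more pages remain,
    "LastEvaluatedKey".
    """
    items = []
    while True:
        response = table.query(**query_kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the next page where the previous one stopped.
        query_kwargs["ExclusiveStartKey"] = last_key
```

It would be called with the same keyword arguments as table.query, for example query_all(table, KeyConditionExpression=...).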

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element: clientId

This error means that the query does not specify a value for the partition key clientId. This is a bit confusing, because the value you pass may well be non-empty. One possibility is that the partition key expects a number while args["clientId"] is a string, which would not be accepted. Please refer to the documentation for how to specify the intended data type of the arguments.
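On the data-type point, command-line arguments always arrive as strings, and boto3 expects Decimal rather than float for Number attributes. The sketch below builds the query arguments with explicit conversions. It assumes clientId is stored as a String and timeId as a Number, which the question does not confirm, and build_query_kwargs is a hypothetical helper name.

```python
from decimal import Decimal


def build_query_kwargs(client_id, start, end):
    """Build keyword arguments for Table.query() using a string expression.

    Assumes clientId is stored as a String and timeId as a Number;
    adjust the conversions if the table schema differs.
    """
    return {
        "KeyConditionExpression": "clientId = :c AND timeId BETWEEN :s AND :e",
        "ExpressionAttributeValues": {
            ":c": str(client_id),
            # Command-line arguments arrive as strings; boto3 expects
            # Decimal (not float) for DynamoDB Number attributes.
            ":s": Decimal(str(start)),
            ":e": Decimal(str(end)),
        },
    }


kwargs = build_query_kwargs("0000000028037c08", "1544282940", "1544283641")
print(kwargs["ExpressionAttributeValues"][":s"])
```

The result would be passed as table.query(**kwargs). BETWEEN is inclusive on both ends, so an integer end bound of 1544283640 would not match a timeId of 1544283640.119842; the example uses 1544283641 for that reason.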

Upvotes: 2
