Reputation: 183
We're using AWS AppSync with attached DynamoDB data sources. We've run into a really perplexing situation when attempting to filter queries before returning batch results to our clients. The goal is to filter our results based on substrings contained in the key being filtered.
Our DynamoDB table has a composite key that looks like this:
nameGroup: String // partition key; the first letter of the sort key value
name: String // sort key; the full name of the object
Attributes:
locationID: String // a three-character string
officialName: String // a more formal name
... etc.
For example:
nameGroup: A
name: Australia
locationID: AUS
officialName: Australia
... etc.
And here you'll find our request resolver:
{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "nameGroup-locationID-index",
    "query" : {
        ## Query based off of first letter of supplied String **
        "expression" : "nameGroup = :nameGroup",
        "expressionValues" : {
            ":nameGroup" : $util.dynamodb.toDynamoDBJson(${ctx.args.filter.substring(0,1)})
        }
    },
    "filter" : {
        ## Filter query list with 'contains' expression **
        "expression" : "contains(#name, :name)",
        "expressionNames" : {
            "#name" : "name"
        },
        "expressionValues" : {
            ":name" : $util.dynamodb.toDynamoDBJson(${ctx.args.filter})
        }
    },
    ## Add 'limit' and 'nextToken' arguments to implement pagination **
    "limit": $util.defaultIfNull(${ctx.args.count}, 3),
    "nextToken": $util.toJson($util.defaultIfNullOrBlank(${ctx.args.nextToken}, null))
}
And our response resolver:
{
    ## Change default return field (items) to appropriate PaginatedCountries field **
    "countryRefs": $util.toJson($ctx.result.items),
    "nextToken": $util.toJson($util.defaultIfNullOrBlank($context.result.nextToken, null))
}
When we query with something like this:
getCountryList(filter: $filter) {
    countryRefs {
        name
        locationID
        officialName
    }
    nextToken
}
where the filter var changes in value as the user inputs characters (e.g., $filter = A, then $filter = Au, then $filter = Aus, etc.), we get very strange returns. In almost all cases, we seem to get something like this:
{
    "data": {
        "getCountryList": {
            "countryRefs": [],
            "nextToken": "eyJ2ZXJzaW9uIjoxLCJ0b2..." // a very long string token
        }
    }
}
Oddly enough, if we use nextToken we'll find the results we're looking for in either the second or third page of results:
{
    "data": {
        "getCountryList": {
            "countryRefs": [
                {
                    "locationID": "AUS",
                    "name": "Australia"
                },
                {
                    "locationID": "AUT",
                    "name": "Austria"
                }
            ],
            "nextToken": "eyJ2ZXJzaW9uIjoxLCJ0b2..." // another very long string token
        }
    }
}
We've spent way too many hours thinking this was an issue with our filter expression (could contains be the issue, or maybe begins_with is the problem?). What we've noticed, though, is that if we change the limit (either the default 3 or a client-provided count) to something larger than the number of elements our query would return before the filter expression is applied to the results, the issue doesn't seem to exist.
For example, using filter: 'Au', if we set the default limit to 200 instead of 3, we get exactly what we should be getting (there are only two country names beginning with 'Au')!
My question is this: why is the limit apparently returning what I'm going to call arrays with "hidden" values? My guess is that the total return size is being returned with a bunch of empty values except for indices where filter matches are found. Either way, why aren't we getting returns in the expected way? Why is limit filtering more than just the number of returns here, i.e., the way returns are actually structured?
Any help would be greatly appreciated!
Reputation: 5751
That's expected behavior. In DynamoDB, a filter expression is applied after the Query has read its items, but before the results are returned. So the items your query read for a given page may all be irrelevant from the filter expression's perspective; they are filtered out, and you get back an empty list plus a next token for retrieving more results. After following a couple of next tokens, you reach the page that contains the relevant results, which you can display to the user.
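If you want to keep the small limit, one way to hide the empty pages from the user is to keep following the token until enough matches have arrived. The following is only a rough sketch of that loop: `fetch_page` is a hypothetical callable standing in for whatever client code runs getCountryList and returns the countryRefs list together with the nextToken; it is not an AppSync or DynamoDB API, and `wanted` and `max_pages` are made-up parameter names.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical page shape: (countryRefs, nextToken).
Page = Tuple[List[Dict], Optional[str]]


def collect_matches(
    fetch_page: Callable[[Optional[str]], Page],
    wanted: int,
    max_pages: int = 50,
) -> Tuple[List[Dict], Optional[str]]:
    """Follow nextToken until at least `wanted` items arrive or pages run out.

    Returns everything collected so far plus the token to resume from
    (None when there is nothing left to read).
    """
    collected: List[Dict] = []
    token: Optional[str] = None
    for _ in range(max_pages):  # safety cap so a long partition cannot loop forever
        items, token = fetch_page(token)
        collected.extend(items)  # a page may legitimately be empty
        if len(collected) >= wanted or token is None:
            break
    return collected, token
```

An empty countryRefs with a non-null nextToken just means "nothing matched in the items read so far", so the loop treats it as a reason to continue rather than as the end of the results.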
From the DynamoDB perspective, a single Query operation (the same is true of a Scan) reads up to the maximum number of items set by the limit parameter (3 in your case) or a maximum of 1 MB of data, and only then applies the FilterExpression to what it read. The limit therefore caps the number of items evaluated, not the number of matching items returned. That's why you get the relevant results when you set it to 200: the query probably reads all the existing countries in that partition in a single page, and the filter expression is applied to all of them.
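To make the ordering concrete, here is a small in-memory simulation of "read up to limit items, then filter". It does not call DynamoDB; `query_page`, the integer token, and the sample rows are all invented for illustration, and the token handling is simplified compared to a real LastEvaluatedKey.

```python
from typing import Callable, Dict, List, Optional, Tuple


def query_page(
    partition: List[Dict],
    limit: int,
    matches: Callable[[Dict], bool],
    start: int = 0,
) -> Tuple[List[Dict], Optional[int]]:
    """Mimic one Query page: read up to `limit` items, then filter them."""
    evaluated = partition[start:start + limit]            # limit caps the items read
    kept = [item for item in evaluated if matches(item)]  # the filter runs afterwards
    next_start = start + limit
    # Simplified token: the index to resume from, or None when nothing is left.
    next_token = next_start if next_start < len(partition) else None
    return kept, next_token


# Made-up rows for nameGroup "A", ordered by locationID as the index would be.
group_a = [
    {"locationID": "AFG", "name": "Afghanistan"},
    {"locationID": "AGO", "name": "Angola"},
    {"locationID": "ALB", "name": "Albania"},
    {"locationID": "AND", "name": "Andorra"},
    {"locationID": "ARG", "name": "Argentina"},
    {"locationID": "ARM", "name": "Armenia"},
    {"locationID": "AUS", "name": "Australia"},
    {"locationID": "AUT", "name": "Austria"},
    {"locationID": "AZE", "name": "Azerbaijan"},
]


def contains_au(item: Dict) -> bool:
    return "Au" in item["name"]


# limit=3: pages one and two read three non-matching rows each and come back empty.
first_page, token = query_page(group_a, 3, contains_au)
# limit=200: the whole partition is read in one page, so both matches appear at once.
all_at_once, _ = query_page(group_a, 200, contains_au)
```

With these sample rows, the first two pages at limit 3 are empty and the third holds Australia and Austria, which is the same shape as the responses in the question.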