Reputation: 476
I am attempting to filter a paginated scan request to a DynamoDB table.
A traditional scan filter on the resource-level Table would use something like the following:
response = table.scan(
    FilterExpression=Attr("Date").eq(date)
)
However, attempting to pass the same kind of filter to the scan paginator:
import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.client('dynamodb')
dynamores = boto3.resource('dynamodb')
table = dynamores.Table('table')
pagination_config = {"MaxItems": 2, "PageSize": 2}
paginator = dynamodb.get_paginator('scan')
response_iterator = paginator.paginate(
    TableName=table.table_name,
    PaginationConfig=pagination_config,
    FilterExpression=Attr("Email").contains(user) &
    Attr("Category").contains(types)
)
results in a parameter validation error:
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter FilterExpression, value: <boto3.dynamodb.conditions.And object at 0x7f550f4b1940>, type: <class 'boto3.dynamodb.conditions.And'>, valid types: <class 'str'>
I have tried filtering on different keys without success and the response remains the same, even when filtering on boolean or integer columns. Is there a way to filter the scan using the operation_parameters argument?
I have also attempted JMESPath filtering without success, as follows:
response_iterator = paginator.paginate(
    TableName=table.table_name,
    PaginationConfig=pagination_config
)
filtered_iterator = response_iterator.search("Contents[?contains(Category, 'File Status')]")
This yields no results despite there being an entry matching these criteria in the table.
Upvotes: 1
Views: 2020
Reputation: 476
I identified two possible solutions, which I will leave here for anyone attempting to filter or search paginated DynamoDB responses, as the Boto3 documentation is frankly useless.
Option 1:
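Drill down into the response using JMESPath, written against the DynamoDB-specific shape of the response object (items live under "Items", and each attribute value is wrapped in its type, such as "S" or "N"):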
response object = {
    "Items": [
        {
            "Date": {
                "N": "54675846"
            },
            "Message": {
                "S": "Message02"
            },
            "_id": {
                "S": "4567"
            },
            "Category": {
                "S": "Validation Error"
            },
            "Email": {
                "S": "[email protected]"
            }
        }
    ]
}
filtered_iterator = response_iterator.search("Items[?Category.S == 'Validation Error'].{_id: _id, Category: Category, Date: Date, IsRead: IsRead, Message: Message, Email: Email}")
You can then iterate over filtered_iterator with a standard for loop; it yields the matching items rather than whole pages.
Option 2:
Use a list comprehension on each page of the paginated response and filter at that point. Note that this is the more expensive option, as it still scans and returns up to the maximum number of items in your paginator configuration:
for page in response_iterator:
    page["Items"] = [x for x in page["Items"] if '{search_term}' in x['Category']['S']]
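For anyone who wants to try Option 2 without touching a real table, here is a rough, self-contained sketch. It assumes the pages have the same shape as the response object shown under Option 1; `filter_pages` and `sample_pages` are names made up for illustration, and the hand-written list stands in for `response_iterator`.

```python
def filter_pages(pages, search_term):
    """Yield items whose Category string contains search_term.

    `pages` is any iterable of scan responses (for example a paginator's
    response iterator). The helper name is hypothetical.
    """
    for page in pages:
        for item in page.get("Items", []):
            # Low-level responses wrap each value in its type, e.g. {"S": "..."}.
            # .get() avoids a KeyError for items that have no Category attribute.
            if search_term in item.get("Category", {}).get("S", ""):
                yield item


# Hand-written stand-in for the pages a scan paginator would return.
sample_pages = [
    {"Items": [{"_id": {"S": "4567"}, "Category": {"S": "Validation Error"}}]},
    {"Items": [{"_id": {"S": "4568"}, "Category": {"S": "File Status"}}]},
]

matches = list(filter_pages(sample_pages, "File"))
```

The filtering still happens on the client, so every scanned item is read and transferred before it is discarded.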
Option 3:
By far the simplest solution is the one identified by Lee Harrigan.
Upvotes: 0
Reputation: 19793
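You are treating the paginator like the Resource client, which it is not. The paginator comes from the low-level client, so FilterExpression must be a string rather than a condition object built with Attr, as the error message says. You need to be more low level, like this example: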
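A minimal sketch of what the low-level form might look like, assuming the Email and Category attributes from the question are both strings. The helper names `build_scan_params` and `scan_filtered`, and the placeholders `#email`, `#category`, `:user` and `:category`, are my own choices for illustration, not anything taken from the original post.

```python
def build_scan_params(table_name, user, category, page_size=2, max_items=2):
    """Build keyword arguments for the low-level client's scan paginator."""
    return {
        "TableName": table_name,
        # The low-level client only accepts a string expression here.
        "FilterExpression": "contains(#email, :user) AND contains(#category, :category)",
        # Name placeholders sidestep clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#email": "Email", "#category": "Category"},
        # Values must be given in DynamoDB's typed form ("S" = string).
        "ExpressionAttributeValues": {
            ":user": {"S": user},
            ":category": {"S": category},
        },
        "PaginationConfig": {"MaxItems": max_items, "PageSize": page_size},
    }


def scan_filtered(table_name, user, category):
    """Yield filtered items page by page (needs AWS credentials and a real table)."""
    import boto3  # imported lazily so build_scan_params works without boto3

    client = boto3.client("dynamodb")
    paginator = client.get_paginator("scan")
    for page in paginator.paginate(**build_scan_params(table_name, user, category)):
        yield from page["Items"]
```

Keep in mind that DynamoDB applies the filter after it has read each page, so with a small PageSize a page can come back with fewer items than requested, or with none at all.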
Upvotes: 2