Reputation: 326
We're using Debezium via AWS MSK Serverless and MSK Connect to monitor the binlog of an RDS Aurora MySQL instance. For 99% of our data this works fine, but very occasionally Debezium fails with:
[Worker-0ee2268e7e753a81a] org.apache.kafka.common.errors.RecordTooLargeException: The message is 12472229 bytes when serialized which is larger than 8388608, which is the value of the max.request.size configuration.
MSK Serverless has a maximum message size of 8 MB, so the only way we can work around this is to update the Debezium offsets to skip the offending binlog event, which is time-consuming and problematic.
When googling this error I find lots of advice on how to increase the broker's maximum message size, which won't work for us here since it is fixed by AWS.
We have already tried:
event.processing.failure.handling.mode=warn
connector.client.config.override.policy=ALL
producer.override.max.request.size=8388608
Is there a way to ignore or resize oversized binlog events when using Debezium/Kafka Connect?
Upvotes: 2
Views: 555
Reputation: 4948
One generic way to approach this could be a Single Message Transform (SMT), filtering on the raw byte size of the record. You can hand-roll one yourself that drops oversized records; see the sketch below. There are also some preset filters available (not size-based, but if you can extract some field from the source payload that identifies a large message, that can be a start), along the lines of: https://docs.confluent.io/platform/current/connect/transforms/filter-ak.html
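A minimal sketch of such a hand-rolled SMT, assuming Kafka Connect's Java transform API; the class and package names (com.example.transforms.DropOversized) and the max.bytes property are hypothetical, and the JsonConverter is only used to estimate the serialized size, so treat the threshold as approximate if your connector uses a different converter:

package com.example.transforms;  // hypothetical package/class

import java.util.Map;

import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.json.JsonConverter;
import org.apache.kafka.connect.transforms.Transformation;

/** Drops any record whose serialized value is larger than max.bytes. */
public class DropOversized<R extends ConnectRecord<R>> implements Transformation<R> {

    public static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define("max.bytes", ConfigDef.Type.INT, 8388608,
                    ConfigDef.Importance.MEDIUM, "Records with a larger serialized value are dropped");

    private int maxBytes;
    private JsonConverter sizeEstimator;

    @Override
    public void configure(Map<String, ?> props) {
        maxBytes = new AbstractConfig(CONFIG_DEF, props).getInt("max.bytes");
        // JSON serialization is used only to estimate record size, so the limit is approximate.
        sizeEstimator = new JsonConverter();
        sizeEstimator.configure(Map.of("schemas.enable", "false"), false);
    }

    @Override
    public R apply(R record) {
        if (record.value() == null) {
            return record;  // pass tombstones through untouched
        }
        byte[] bytes = sizeEstimator.fromConnectData(record.topic(), record.valueSchema(), record.value());
        // Returning null tells Kafka Connect to skip this record entirely.
        return (bytes != null && bytes.length > maxBytes) ? null : record;
    }

    @Override
    public ConfigDef config() {
        return CONFIG_DEF;
    }

    @Override
    public void close() {
        // nothing to release
    }
}

Packaged as a plugin and registered on the connector, it would be wired up with something like (names hypothetical):

transforms=dropOversized
transforms.dropOversized.type=com.example.transforms.DropOversized
transforms.dropOversized.max.bytes=8388608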
Upvotes: 0