Zane

Reputation: 11

MailChimp Batch operations API batches are getting stuck

Context of the problem

We use the Mailchimp Batch Operations API, which lets you run multiple Mailchimp API calls for an organisation in batches of up to 10,000 operations. Mailchimp processes all of the operations and then notifies your webhooks when it is finished.

We use these batches to get email marketing activity (sends, opens, and clicks) as part of a data sync for about 25 different Mailchimp customers who want to sync that activity to a different application. Mailchimp has a developer guide detailing how to use batch operations here: https://mailchimp.com/developer/marketing/guides/run-async-requests-batch-endpoint/
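For readers unfamiliar with the request shape, here is a minimal sketch of how a batch body for this kind of sync might be built. It assumes one GET per contact against the member activity path and that the body is then sent to POST /3.0/batches. The helper names, the list ID `abc123`, the example emails, and the `operation_id` format are all hypothetical and not taken from our actual code.

```python
import hashlib
import json


def subscriber_hash(email: str) -> str:
    """MD5 hex digest of the lowercased email, used as the member ID in paths."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def build_activity_batch(list_id: str, emails: list[str]) -> dict:
    """Build a batch request body with one GET operation per contact."""
    operations = []
    for email in emails:
        member = subscriber_hash(email)
        operations.append({
            "method": "GET",
            "path": f"/lists/{list_id}/members/{member}/activity",
            # operation_id is free-form; this format is just an illustration.
            "operation_id": f"{list_id}:{member}",
        })
    return {"operations": operations}


# Hypothetical list ID and emails, for illustration only.
payload = build_activity_batch("abc123", ["Ann@example.com", "bob@example.com"])
print(json.dumps(payload, indent=2))
```

The function only builds the payload, so it can be checked locally without calling Mailchimp.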

Relevant to this discussion is their table of possible statuses for batch operations:

state          Description
pending        Processing on the batch operation has not started.
preprocessing  The batch request is being broken up into smaller operations to speed up processing.
started        Processing has started.
finalizing     Processing is complete, but the results are being compiled and saved.
finished       Processing is done. You can now retrieve the results from the URL in response_body_url.

The Problem

About a month ago, we started seeing batches for some of our customers get stuck in the preprocessing status and never complete. (I'm pretty sure Mailchimp deletes any outstanding batches 14 days after they were created.) The issue does not surface consistently: affected customers use various Mailchimp usX API regions and are on various tiers of the product. It also does not correlate with the size of the customer's subscriber base.

We have found no information on how and when batch operations might get throttled by Mailchimp. Has anyone seen similar issues using batch operations? Have you found a way to get more reliable processing?

Attempted fix 1

We used to create batches of 10,000 if there were 10,000 items to process, but we recently capped our batch size at 2,000. This seems to have sped up processing, as Mailchimp appears to work at a higher level of parallelisation on smaller batches. A batch of 10,000 operations could take 7-12 hours for Mailchimp to process, whereas a batch of 2,000 takes 2-3 hours, so we get responses much faster. However, we still sometimes get batches stuck in preprocessing.
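The capping itself is simple. A minimal sketch, assuming the operations are already built as a list of dicts (the `/example/{i}` paths below are placeholders, and `chunk_operations` is a hypothetical helper name):

```python
from typing import Iterator

MAX_OPS_PER_BATCH = 2_000  # the cap described above; adjust as needed


def chunk_operations(
    operations: list[dict], size: int = MAX_OPS_PER_BATCH
) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` operations."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(operations), size):
        yield operations[start:start + size]


# Placeholder operations, for illustration only.
ops = [{"method": "GET", "path": f"/example/{i}"} for i in range(10_000)]
batches = [{"operations": chunk} for chunk in chunk_operations(ops)]
print(len(batches))  # 5 batch bodies of 2,000 operations each
```

Each element of `batches` would then be submitted as its own batch request.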

Alternative solution 1 that does not work for us.

We wanted to stop using batch operations entirely because of how unreliable they have become. However, the only endpoint we have found for getting marketing activity is specific to an individual list and contact (the "view recent 50 member activity" endpoint), which requires one API call for each contact on each list of interest. This means we easily exceed API limits if we want to get all activity for an account.

Did Mailchimp nerf batch operations last year?

This is a graph of invocations of our webhook by Mailchimp over the past 15 months. Over the past two months you can see our invocations decreasing as we rolled out changes to poll Mailchimp for this activity every 24 hours instead of every 4 hours. This was in response to our hypothesis that Mailchimp was silently throttling some batches.

However, you can see a much larger drop in invocations in September of last year. We did not have any changes to our application or customer base around this time. We think that our issue may have begun around this time. Is anyone able to find announcements of Mailchimp making changes to these endpoints around that time?

[Figure: graph of our webhook being invoked by Mailchimp over the past 15 months]

Upvotes: 0

Views: 102

Answers (1)

Where I work, we did notice some slowness in September/October. Our batches are slightly larger than yours, and we execute 5 batches in parallel. On good days, it takes seven and a half hours to reach the finalized state.

While we are not getting stuck in preprocessing in recent executions (during this month), we do get stuck in the finalizing state. If a batch stays in this state for X hours, we delete the batch and treat it as a timed-out process. We do the same in the preprocessing state if a batch stays there too long. That's all we can do so far to make sure we don't end up with a never-ending process.
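A minimal sketch of that timeout rule, not our actual code. It assumes you poll each batch's status yourself and track client-side when a batch was first seen in its current state. The class name, the 6-hour threshold, and the batch ID `b1` are arbitrary placeholders (the real X is whatever suits your workload). When `observe` returns True, the caller would delete the batch and resubmit.

```python
from datetime import datetime, timedelta, timezone

TERMINAL = {"finished"}


class BatchWatchdog:
    """Tracks how long each batch has stayed in its current status."""

    def __init__(self, max_in_state: timedelta):
        self.max_in_state = max_in_state
        self._seen = {}  # batch_id -> (status, first_seen_at)

    def observe(self, batch_id: str, status: str, now: datetime) -> bool:
        """Record a polled status; return True if the batch looks timed out."""
        if status in TERMINAL:
            self._seen.pop(batch_id, None)
            return False
        previous = self._seen.get(batch_id)
        if previous is None or previous[0] != status:
            # First sighting, or the batch moved to a new state: restart the clock.
            self._seen[batch_id] = (status, now)
            return False
        return now - previous[1] > self.max_in_state


# Placeholder threshold and batch ID, for illustration only.
watchdog = BatchWatchdog(max_in_state=timedelta(hours=6))
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(watchdog.observe("b1", "preprocessing", t0))                       # False
print(watchdog.observe("b1", "preprocessing", t0 + timedelta(hours=7)))  # True
```

Restarting the clock on every state change means a batch that is slow but still progressing is not deleted prematurely.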

We did not receive any information from them regarding endpoint updates.

When you said:

I'm pretty sure mailchimp deletes any outstanding batches 14 days after they were created

Do you have any documentation on that? I did not find anything about it in the Mailchimp documentation.

And do you know what technically happens in the preprocessing status? I am trying to work out what 'smaller operations' means here.

Sorry to present more questions than answers :)

Upvotes: 0
