Batch API rate limiting behavior is misleading and makes large migrations impractical
The current Batch API rate limiting behavior is confusing and appears inconsistent with both the documentation and the response headers.
The response headers and the endpoint design imply that a batch request counts as a single request:
- x-ratelimit-remaining decreases by only 1 per batch request
- the batch endpoint is specifically designed to combine up to 50 operations into one request
However, in practice, the backend rate limiter appears to count the inner operations inside the batch.
Example: sending 4 batches with 50 operations each results in ~200 effective operations, which quickly triggers 429 responses even though the headers still show many remaining requests.
This creates a mismatch between what the headers report and how the backend limiter actually behaves.
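To make the mismatch concrete, here is a minimal Python sketch of the two accounting models applied to the example above. `quota_used` is a hypothetical helper for illustration, not part of the API, and the per-operation model is only what the observed 429 responses suggest.

```python
def quota_used(batch_sizes, per_operation):
    """Return the quota units consumed by a sequence of batch requests.

    per_operation=False: each batch costs 1 unit (what the headers show).
    per_operation=True:  each batch costs one unit per inner operation
                         (what the backend appears to do; an assumption).
    """
    if per_operation:
        return sum(batch_sizes)
    return len(batch_sizes)


batches = [50, 50, 50, 50]  # 4 batches with 50 operations each
header_view = quota_used(batches, per_operation=False)
backend_view = quota_used(batches, per_operation=True)
print(header_view, backend_view)  # prints "4 200"
```

A client that trusts the headers believes it has spent 4 units while the backend may already have counted 200.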
As a result:
- clients cannot throttle properly based on the provided headers
- batching does not meaningfully improve throughput
- in some cases batching is actually slower than sending requests individually
Additionally, around a month ago batching behaved as documented, so this appears to be either a recent backend change or a regression.
We also received an explanation from the support bot stating that the backend limiter counts the individual operations inside each batch request. However, even under that explanation, the observed behavior is not internally consistent.
If batches are effectively counted by their inner operations, then sending 3 batch requests with 50 operations each within the same minute should already result in ~150 effective requests and therefore consistently trigger the limit.
However, in practice this is not what happens.
Sometimes we can send 4-5 such batch requests before receiving 429 responses, while at other times throttling starts earlier. The threshold appears inconsistent and unpredictable.
This makes it extremely difficult to implement reliable client-side throttling because:
- the response headers do not reflect the actual backend accounting
- the effective limits appear dynamic or opaque
- the observed behavior does not consistently match either the documented logic or the bot's explanation
At the moment the integration effectively has to rely on trial-and-error delays rather than deterministic rate-limit handling.
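Until the accounting is clarified, one conservative workaround is to throttle on inner operations instead of requests. The sketch below is a sliding-window limiter in Python under two loudly stated assumptions: the backend charges each batch by its operation count, and `max_ops` is a budget we choose ourselves (the value 100 in the usage lines is hypothetical, not a documented limit). `OperationWindowLimiter` is an illustrative name. Given the inconsistent threshold described above, this can only reduce 429 responses, not rule them out, so retry handling is still needed.

```python
import time
from collections import deque


class OperationWindowLimiter:
    """Sliding-window limiter that charges each batch by its inner operations."""

    def __init__(self, max_ops, window_seconds=60.0, clock=time.monotonic):
        self.max_ops = max_ops        # assumed budget per window, chosen by us
        self.window = window_seconds
        self.clock = clock            # injectable so the logic can be tested
        self.events = deque()         # (timestamp, operations), oldest first
        self.used = 0

    def _expire(self, now):
        # Drop batches that have left the window and release their budget.
        while self.events and now - self.events[0][0] >= self.window:
            _, ops = self.events.popleft()
            self.used -= ops

    def wait_time(self, ops):
        """Seconds to wait before a batch of `ops` operations fits the budget."""
        if ops > self.max_ops:
            raise ValueError("batch is larger than the whole budget")
        now = self.clock()
        self._expire(now)
        free = self.max_ops - self.used
        if ops <= free:
            return 0.0
        # Walk the oldest batches until enough budget would be released.
        for sent_at, sent_ops in self.events:
            free += sent_ops
            if ops <= free:
                return max(0.0, sent_at + self.window - now)
        return self.window  # defensive fallback

    def record(self, ops):
        """Register a batch that was actually sent."""
        self.events.append((self.clock(), ops))
        self.used += ops


limiter = OperationWindowLimiter(max_ops=100)  # hypothetical budget
delay = limiter.wait_time(50)  # a fresh limiter has the full budget free
limiter.record(50)             # call this after the batch has been sent
```

In a real sync loop the client would sleep for `delay` before sending each batch and lower `max_ops` whenever a 429 still slips through.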
Suggested improvements:
Make batch requests truly count as a single request against the limit
OR
Keep the current backend behavior, but make the rate limit headers reflect the actual inner operations inside batches
For example, a batch with 50 operations should reduce x-ratelimit-remaining by 50.
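If the headers did reflect inner operations, as in the second suggestion above, deterministic throttling would become straightforward. A minimal Python sketch, assuming `x-ratelimit-remaining` then counts operations and that the headers arrive as a plain dict with lowercase keys; `next_batch_size` is a hypothetical helper:

```python
def next_batch_size(headers, pending, max_batch=50):
    """Pick how many operations to put in the next batch.

    Assumes x-ratelimit-remaining counts inner operations, which is the
    proposed behavior, not the current one.
    """
    remaining = int(headers.get("x-ratelimit-remaining", 0))
    return max(0, min(pending, max_batch, remaining))


print(next_batch_size({"x-ratelimit-remaining": "30"}, pending=200))   # prints 30
print(next_batch_size({"x-ratelimit-remaining": "120"}, pending=200))  # prints 50
```

A result of 0 would tell the client to wait for the window to reset instead of guessing a delay.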
Real-world impact:
This becomes a major issue during large audience migrations and initial syncs.
For already-synced audiences this is manageable because syncing is incremental. However, when migrating stores or platforms, integrations often need to upload entire subscriber lists into new groups from scratch.
With the current behavior, syncing ~50k subscribers can take many hours, which creates a serious operational bottleneck for larger stores and migration projects.
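A rough back-of-the-envelope estimate shows why the accounting model matters so much here. The Python sketch below assumes one operation per subscriber and a hypothetical budget of 100 units per minute; that figure is a placeholder for illustration, not a documented or measured limit, and `estimated_sync_minutes` is an illustrative helper.

```python
import math


def estimated_sync_minutes(subscribers, ops_per_minute, ops_per_subscriber=1):
    """Lower bound on sync time for a given effective operations-per-minute budget."""
    total_ops = subscribers * ops_per_subscriber
    return math.ceil(total_ops / ops_per_minute)


BUDGET = 100      # hypothetical units per minute, not a documented figure
BATCH_SIZE = 50   # maximum operations per batch, as stated for the endpoint

# Every inner operation costs one unit.
per_operation = estimated_sync_minutes(50_000, ops_per_minute=BUDGET)
# Every batch costs one unit, so each unit carries up to 50 operations.
per_request = estimated_sync_minutes(50_000, ops_per_minute=BUDGET * BATCH_SIZE)

print(f"per-operation accounting: {per_operation} min (~{per_operation / 60:.1f} h)")
print(f"per-request accounting:   {per_request} min")
```

Under these placeholder numbers the same migration takes roughly 8 hours with per-operation accounting and about 10 minutes with per-request accounting, a 50x difference that comes purely from how a batch is counted.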