Why Queue-First Delivery
Back in Stock traffic is bursty.
When popular products recover inventory, synchronous delivery quickly hits timeouts and provider limits.
A single restock fans out to every subscriber of that product at once, so load scales with the subscriber count rather than with normal store traffic.
A queue-first pipeline keeps the system stable and recoverable.
It also makes failure handling much more predictable.
Recommended Job Lifecycle
Define explicit transitions:
pending → processing → sent / failed-temporary / failed-permanent → dead-letter
Only failed-temporary should be retried automatically.
Everything else should follow explicit rules.
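One way to make these transitions explicit is a small lookup table that rejects anything not listed. The sketch below is illustrative only: the state names come from the lifecycle above, but the exact edges (for example, a retry going back to pending, or a manual replay leaving dead-letter) are assumptions, and `transition` is a hypothetical helper.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    SENT = "sent"
    FAILED_TEMPORARY = "failed-temporary"
    FAILED_PERMANENT = "failed-permanent"
    DEAD_LETTER = "dead-letter"


# Assumed transition table: the article names the states, not every edge.
ALLOWED = {
    JobState.PENDING: {JobState.PROCESSING},
    JobState.PROCESSING: {
        JobState.SENT,
        JobState.FAILED_TEMPORARY,
        JobState.FAILED_PERMANENT,
    },
    # Retried automatically (back to pending) or dead-lettered once the
    # retry budget is spent.
    JobState.FAILED_TEMPORARY: {JobState.PENDING, JobState.DEAD_LETTER},
    # Treated as terminal in this sketch.
    JobState.FAILED_PERMANENT: set(),
    JobState.SENT: set(),
    # Manual replay puts the job back on the primary queue.
    JobState.DEAD_LETTER: {JobState.PENDING},
}


def transition(current: JobState, target: JobState) -> JobState:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table in one place means a worker cannot, for instance, move a sent job back to pending without someone changing the table on purpose.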
Throughput Controls
Use multiple limits simultaneously:
- global messages-per-second
- per-domain caps
- tenant/workspace caps for multi-store setups
Queue ordering can be FIFO by subscription time to preserve fairness.
Subscribers who signed up first are then notified first, which matters most when the restocked quantity is limited.
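As a rough illustration of layering several limits, the sketch below checks a global, a per-domain, and a per-tenant token bucket before each send and only consumes tokens when all three allow it. The class names, the rates, and the choice of token buckets are assumptions made for this example; a production setup would normally keep these counters in shared storage rather than in process memory.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows `rate` sends per second, with bursts of up to one second."""

    def __init__(self, rate: float, clock: Callable[[], float]):
        self.rate = rate
        self.capacity = rate
        self.clock = clock
        self.tokens = rate
        self.updated = clock()

    def available(self) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        return self.tokens >= 1.0

    def take(self) -> None:
        self.tokens -= 1.0


class SendThrottle:
    """Hypothetical throttle combining global, domain, and tenant caps."""

    def __init__(self, global_rate: float, domain_rate: float,
                 tenant_rate: float,
                 clock: Callable[[], float] = time.monotonic):
        self.clock = clock
        self.global_bucket = TokenBucket(global_rate, clock)
        self.domain_rate = domain_rate
        self.tenant_rate = tenant_rate
        self.domains: Dict[str, TokenBucket] = {}
        self.tenants: Dict[str, TokenBucket] = {}

    def try_acquire(self, domain: str, tenant: str) -> bool:
        d = self.domains.setdefault(
            domain, TokenBucket(self.domain_rate, self.clock))
        t = self.tenants.setdefault(
            tenant, TokenBucket(self.tenant_rate, self.clock))
        buckets = [self.global_bucket, d, t]
        # Consume only if every limit has room, so a blocked send
        # does not eat into the other budgets.
        if all(b.available() for b in buckets):
            for b in buckets:
                b.take()
            return True
        return False
```

A worker would call `try_acquire(recipient_domain, tenant_id)` before sending and leave the job on the queue when it returns `False`.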
Retry Strategy
Classify failures and apply different behavior:
- 429, timeout: exponential backoff
- transient provider errors: retry with jitter
- invalid recipient: mark failed-permanent immediately
When retry budget is exhausted, move to dead-letter and allow manual replay from admin UI.
This keeps the primary queue healthy while preserving recoverability.
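The classification above could be sketched as a single decision function. The failure-kind labels, the retry budget of 5 attempts, and the 2-second base delay are all assumed values for illustration, and "full jitter" (a uniform random delay up to the backoff ceiling) is one common jitter scheme rather than something the pipeline prescribes.

```python
import random
from typing import Tuple

MAX_ATTEMPTS = 5      # assumed retry budget
BASE_DELAY_S = 2.0    # assumed base delay
MAX_DELAY_S = 300.0   # assumed ceiling


def next_action(kind: str, attempt: int) -> Tuple[str, float]:
    """Return (next_state, delay_seconds) after a failed attempt.

    `kind` is a hypothetical label assigned by the caller:
    "rate_limited" (429), "timeout", "provider_transient",
    or "invalid_recipient". `attempt` is 1-based.
    """
    if kind == "invalid_recipient":
        # Never retried: no delay can fix a bad address.
        return ("failed-permanent", 0.0)
    if kind not in ("rate_limited", "timeout", "provider_transient"):
        raise ValueError(f"unknown failure kind: {kind}")
    if attempt >= MAX_ATTEMPTS:
        # Retry budget exhausted: park the job for manual replay.
        return ("dead-letter", 0.0)

    backoff = min(MAX_DELAY_S, BASE_DELAY_S * 2 ** (attempt - 1))
    if kind == "provider_transient":
        # Full jitter spreads retries so they do not arrive in lockstep.
        return ("failed-temporary", random.uniform(0.0, backoff))
    # 429 and timeouts: plain exponential backoff.
    return ("failed-temporary", backoff)
```

With these assumed values, the first three timeouts would be retried after 2, 4, and 8 seconds.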
Idempotency for Send Safety
Prevent duplicate sends with a deterministic key:
- notificationKey = variantId + subscriberId + restockWindow
- acquire/send atomically
- if already sent, mark as duplicate-suppressed
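A minimal in-memory sketch of this pattern follows. `SendGuard` and `send_once` are hypothetical names, and the lock-protected set stands in for whatever atomic primitive the real store offers (a unique database constraint or a set-if-absent operation in a key-value store would play the same role). The `:` separator assumes the IDs themselves never contain a colon.

```python
import threading
from typing import Callable, Set


def notification_key(variant_id: str, subscriber_id: str,
                     restock_window: str) -> str:
    # Deterministic: the same inputs always produce the same key.
    return f"{variant_id}:{subscriber_id}:{restock_window}"


class SendGuard:
    """In-memory stand-in for an atomic 'claim this key' operation."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed: Set[str] = set()

    def acquire(self, key: str) -> bool:
        # Check and insert happen under one lock, so two workers
        # cannot both claim the same key.
        with self._lock:
            if key in self._claimed:
                return False
            self._claimed.add(key)
            return True


def send_once(guard: SendGuard, key: str, send: Callable[[], None]) -> str:
    if not guard.acquire(key):
        return "duplicate-suppressed"
    send()
    return "sent"
```

This sketch leaves the key claimed even if `send` raises. A fuller version would need to decide whether a failed send releases the key or is resolved through the retry path.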
Template Version Traceability
Store template metadata in send logs:
- template ID
- template version
- resolved variables at send time
This makes post-incident investigation and compliance audits much easier.
And yes, those details usually matter later, not earlier.
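For illustration, a send-log entry carrying this metadata might look like the following. The field names and the JSON-lines output are assumptions chosen for the example, not a required schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class SendLogEntry:
    notification_key: str
    template_id: str
    template_version: int
    # Values as they were substituted at send time, not the live data.
    resolved_variables: Dict[str, str]
    sent_at: str  # ISO 8601 timestamp supplied by the caller


def to_log_line(entry: SendLogEntry) -> str:
    # Sorted keys keep log lines stable and easy to diff.
    return json.dumps(asdict(entry), sort_keys=True)
```

Storing the resolved variables means the exact message a subscriber received can be reconstructed even after the product title or price has changed.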
Pipeline Observability
Track these core metrics:
- queue latency (enqueue to sent)
- temporary vs permanent failure rate
- retry recovery rate
- dead-letter backlog
Set alert thresholds for dead-letter growth and queue latency spikes.
Without alerts, problems stay invisible until users complain.
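A threshold check over these metrics can be quite small. In the sketch below, the snapshot fields, the choice of a p95 latency, and the default thresholds (300 seconds, 50 new dead-letter jobs per interval) are placeholder assumptions that would need tuning against real traffic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipelineSnapshot:
    queue_latency_p95_s: float      # enqueue-to-sent latency
    dead_letter_backlog: int        # current dead-letter count
    dead_letter_backlog_prev: int   # count at the previous check


def retry_recovery_rate(recovered: int, retried: int) -> float:
    """Share of retried jobs that were eventually sent."""
    return recovered / retried if retried else 0.0


def alerts(s: PipelineSnapshot,
           max_latency_s: float = 300.0,
           max_dead_letter_growth: int = 50) -> List[str]:
    fired: List[str] = []
    if s.queue_latency_p95_s > max_latency_s:
        fired.append("queue-latency")
    growth = s.dead_letter_backlog - s.dead_letter_backlog_prev
    if growth > max_dead_letter_growth:
        fired.append("dead-letter-growth")
    return fired
```

Alerting on backlog growth rather than absolute size avoids paging on a stable backlog that is already awaiting manual replay.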
Summary
Back in Stock delivery should be treated as a resilient asynchronous system.
A queue-first model with clear state transitions, throttling, retries, and idempotency is the practical baseline for production.