Queue Semantics: JMS, AMQP, SQS
What This Concept Is
A classical message broker is a server that holds messages in queues and hands them out to consumers one at a time. A message counts as consumed once it is acknowledged, and at that point the broker deletes it. The broker is the authoritative bookkeeper of who got what.
Three canonical families share this model:
| Family | Broker examples | Wire protocol / API |
|---|---|---|
| JMS (Java Message Service) | ActiveMQ, HornetQ, Artemis, IBM MQ | Java API; brokers speak OpenWire, STOMP, AMQP |
| AMQP (Advanced Message Queuing Protocol) | RabbitMQ, Azure Service Bus | Wire-level AMQP 0-9-1 or 1.0 |
| SQS-style hosted queues | AWS SQS (standard + FIFO), Azure Storage Queues, Google Pub/Sub (partial) | HTTP API, managed |
They differ in wire format and admin model, but the core semantics are largely the same, and those semantics are what you must understand.
Core Semantics
1. Delivery and acknowledgement
    producer --> broker [msg] --> consumer
                   ^                 |
                   |                 v
                   +------ ACK ------+    (msg removed from queue)

    If consumer fails before ACK:
        broker redelivers msg after visibility timeout / redelivery delay
- At-least-once by default. The broker re-delivers if an ack is missing. Consumers must be idempotent (Concept 12).
- The broker retains the message until acked. This is the big operational contract: the queue is the system of record for the pending task.
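Because a missing ack triggers redelivery, the same message can reach a consumer twice. The following is a minimal sketch of an idempotent handler, assuming each message carries a stable ID. The names `IdempotentConsumer` and `message_id` are illustrative, and the in-memory set stands in for whatever durable dedup store a real system would use.

```python
from typing import Callable


class IdempotentConsumer:
    """Wraps a handler so a redelivered message is applied at most once."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self._handler = handler
        # Assumption: a real system would use a durable store (for example a
        # table with a unique constraint); a set is used only for illustration.
        self._seen: set[str] = set()

    def handle(self, message_id: str, body: str) -> bool:
        """Return True if the handler ran, False if this was a duplicate."""
        if message_id in self._seen:
            return False  # duplicate delivery: safe to ack without redoing work
        self._handler(body)
        # Record the ID only after the handler succeeds, so a failure inside
        # the handler leaves the message eligible for a genuine retry.
        self._seen.add(message_id)
        return True


charges: list[str] = []
consumer = IdempotentConsumer(charges.append)
first = consumer.handle("msg-1", "charge card 42")
second = consumer.handle("msg-1", "charge card 42")  # simulated redelivery
```

The second call is treated as a duplicate, so the side effect (the charge) is recorded once even though the message was delivered twice.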
2. Visibility timeout / unacked retention
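When a consumer receives a message, it becomes invisible to other consumers. In SQS the window is a timer (the "visibility timeout"); in RabbitMQ the message sits in the "unacked" state until the consumer acks, rejects, or loses its channel. If the consumer crashes or the window expires, the message reappears for another consumer.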
- Set it longer than your worst-case processing time.
- Extend it explicitly for long tasks (`ChangeMessageVisibility` in SQS).
- Too short -> duplicate processing; too long -> slow recovery from dead consumers.
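The visibility window can be illustrated with a toy in-memory queue. This is a sketch under stated assumptions, not how any broker is implemented: `VisibilityQueue` and its methods are hypothetical names, and the clock is passed in explicitly so the behavior is deterministic.

```python
from typing import Optional


class VisibilityQueue:
    """Toy queue: a received message is hidden until its window expires."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        self._bodies: dict[int, str] = {}
        self._invisible_until: dict[int, float] = {}
        self._next_id = 0

    def send(self, body: str) -> int:
        self._next_id += 1
        self._bodies[self._next_id] = body
        self._invisible_until[self._next_id] = 0.0  # visible immediately
        return self._next_id

    def receive(self, now: float) -> Optional[int]:
        """Hand out the first visible message and hide it for the window."""
        for msg_id, hidden_until in self._invisible_until.items():
            if now >= hidden_until:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id
        return None

    def extend(self, msg_id: int, now: float, seconds: float) -> None:
        """Analogue of ChangeMessageVisibility: restart the window from now."""
        self._invisible_until[msg_id] = now + seconds

    def ack(self, msg_id: int) -> None:
        """Ack removes the message for good."""
        del self._bodies[msg_id]
        del self._invisible_until[msg_id]


q = VisibilityQueue(visibility_timeout=30)
mid = q.send("resize image 7")
got_first = q.receive(now=0)         # delivered, hidden until t=30
got_hidden = q.receive(now=10)       # still invisible
q.extend(mid, now=10, seconds=60)    # long task: now hidden until t=70
got_still_hidden = q.receive(now=40)
got_again = q.receive(now=70)        # never acked, so it reappears
q.ack(mid)
got_after_ack = q.receive(now=200)
```

Without the `extend` call the message would have reappeared at t=30 while the first consumer was still working on it, which is the duplicate-processing case described above.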
3. Dead-letter queue (DLQ)
Messages that repeatedly fail (exceed a max-receive-count) get shunted to a dead-letter queue. DLQ is the single most important operational tool for keeping a queue healthy:
- without a DLQ: one poison message can block a FIFO queue or churn CPU forever
- with a DLQ: poison messages are parked for investigation, the main queue drains normally
You have not finished designing a queue until you have designed its DLQ handling.
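In SQS, the DLQ is attached to the source queue through the `RedrivePolicy` queue attribute, a JSON string naming the DLQ's ARN and the max-receive-count. A minimal sketch of building that attribute follows; the helper name `redrive_attributes` and the ARN are hypothetical, and the right threshold depends on your workload.

```python
import json


def redrive_attributes(dlq_arn: str, max_receive_count: int) -> dict[str, str]:
    """Build the Attributes payload that attaches a DLQ to a source queue."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy as a JSON string, not a nested dict.
    return {"RedrivePolicy": json.dumps(policy)}


# Hypothetical ARN, for illustration only.
attrs = redrive_attributes("arn:aws:sqs:us-east-1:123456789012:resize-dlq", 5)
# With boto3 this would be applied roughly as:
#   sqs.set_queue_attributes(QueueUrl=SOURCE_QUEUE_URL, Attributes=attrs)
```

The policy only moves messages; deciding who watches the DLQ and what happens to parked messages is still a design decision of its own.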
4. Delivery order
Varies by broker:
| Broker | Default order | Strict-order option |
|---|---|---|
| RabbitMQ classic queue | Best-effort FIFO within queue | Single consumer, single queue |
| SQS standard | Unordered (expressly) | Use SQS FIFO (with message group IDs) |
| SQS FIFO | Per-group FIFO + deduplicated ingest (5-minute deduplication window) | - |
| JMS queue | FIFO per queue | - |
| Azure Service Bus | Best-effort FIFO | Enable sessions (FIFO per session) |
When order matters, either choose a FIFO variant or partition the work by key so that per-key order is preserved (the same trick Kafka uses; see Concept 09).
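Per-key partitioning can be sketched as a stable hash from key to partition, so every message for one key lands on the same queue and is consumed in order. The function name `partition_for` and the four-partition layout are assumptions for illustration; with SQS FIFO the closest equivalent is using the key as the `MessageGroupId`.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a stable partition index in [0, num_partitions)."""
    # hashlib gives a digest that is stable across processes; Python's
    # built-in hash() is salted per process and would break the mapping.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


events = [("order-1", "created"), ("order-2", "created"), ("order-1", "paid")]
queues: dict[int, list[tuple[str, str]]] = {i: [] for i in range(4)}
for key, action in events:
    queues[partition_for(key, 4)].append((key, action))

# All events for order-1 sit in one partition, in their original order.
order_1_events = [e for e in queues[partition_for("order-1", 4)] if e[0] == "order-1"]
```

Order is preserved per key only; there is no ordering guarantee between keys that map to different partitions.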
Why It Matters Here
Queue-based brokers are the right substrate when:
- the workload is tasks, not events: "charge this card," "resize this image," "send this email"
- one consumer should perform each task (point-to-point, Concept 04)
- failures should be handled per-message (redelivery, DLQ, retry policies)
- you want strong operational features (delay queues, priority queues, per-message TTL)
They are the wrong substrate when:
- you want many independent consumers to read the same stream at their own pace (log-based brokers, Concept 08)
- you want to replay history from hours or days ago (queues typically purge on ack)
- you want a compact audit record; queues are transient buffers
Concrete Example: An SQS Worker Loop
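    import boto3

    # URL, process, PoisonError, and TransientError are application-defined.
    sqs = boto3.client("sqs")
    RETRY_VISIBILITY = 300  # seconds; assumed value, tune to your workload

    while True:
        resp = sqs.receive_message(
            QueueUrl=URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long-polling
            VisibilityTimeout=120,
        )
        # "Messages" is absent when the long poll returns empty
        for m in resp.get("Messages", []):
            try:
                process(m["Body"])
                # delete_message is the ack: call it only after the work succeeds
                sqs.delete_message(QueueUrl=URL, ReceiptHandle=m["ReceiptHandle"])
            except PoisonError:
                # do nothing; visibility will expire, DLQ will catch it
                pass
            except TransientError:
                # extend so we finish the next attempt
                sqs.change_message_visibility(
                    QueueUrl=URL,
                    ReceiptHandle=m["ReceiptHandle"],
                    VisibilityTimeout=RETRY_VISIBILITY,
                )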
Things that are doing work silently:
- long-polling (`WaitTimeSeconds`) collapses idle cost
- visibility timeout gives you recovery against consumer crash
- the DLQ (configured at the queue level) catches messages whose receive-count exceeds a limit
- `delete_message` is the ack; missing it means redelivery
Common Confusion / Misconception
"SQS guarantees order." Only the FIFO variant, and only within a MessageGroupId. Standard SQS is explicitly unordered and duplicates are possible.
"Acknowledging is the same as 'done'." It is the broker's view of "done." If your consumer acks before the work completes, a crash mid-work loses the message with no redelivery.
"DLQs are for debugging; we will add one later." You will not. Poison messages will eat your queue before you add it.
"A queue is just a pipe." A queue is a durable buffer with state (unacked, visibility timeout, redelivery count, DLQ). Treating it as a stateless pipe leads to the common pathology of a "fine" queue that is secretly storing 40 GB of unacked messages.
"RabbitMQ is a queue; Kafka is a queue." RabbitMQ is a queue. Kafka is a log (Concept 08). The operational and semantic differences are large.
How To Use It
Per-queue design checklist:
- Purpose: tasks, events (with single consumer), or inbound to a DLQ-capable workflow?
- Delivery: at-least-once is the default. Plan consumer idempotency.
- Ordering: required? If yes, FIFO variant or per-key partitioning.
- Visibility timeout: longer than the 99th-percentile processing time, plus margin.
- DLQ: max-receives threshold, where DLQ messages go, who watches it.
- Throughput: consumer count, batch size, broker limits.
- Retention: message TTL if expiry matters; unacked retention if consumers are slow.
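The checklist above can also be captured as a small, reviewable record with a few mechanical checks. This is only a sketch: the `QueueDesign` fields, the 1.5x margin, and the wording of the violations are assumptions chosen for illustration, and several checklist items (throughput, retention) are left out to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueDesign:
    purpose: str
    ordered: bool                 # does the workload require ordering?
    fifo_or_partitioned: bool     # FIFO variant or per-key partitioning in place?
    p99_processing_seconds: float
    visibility_timeout_seconds: float
    max_receives: int
    dlq_owner: str                # who watches the DLQ

    def problems(self, margin: float = 1.5) -> list[str]:
        """Return checklist violations; an empty list means the design passes."""
        found = []
        if self.visibility_timeout_seconds < self.p99_processing_seconds * margin:
            found.append("visibility timeout is below p99 processing time plus margin")
        if self.ordered and not self.fifo_or_partitioned:
            found.append("ordering required but no FIFO variant or per-key partitioning")
        if self.max_receives < 1:
            found.append("max-receives threshold must be at least 1")
        if not self.dlq_owner:
            found.append("nobody is assigned to watch the DLQ")
        return found


good = QueueDesign("tasks", False, False, 20.0, 60.0, 5, "platform-oncall")
bad = QueueDesign("tasks", True, False, 20.0, 25.0, 5, "")
good_problems = good.problems()
bad_problems = bad.problems()
```

Here `good` passes every check, while `bad` fails on visibility timeout, ordering, and DLQ ownership.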
Check Yourself
- Why does at-least-once delivery force consumers to be idempotent? What happens if they are not?
- What is the correct relationship between visibility timeout and maximum processing time?
- Why is adding a DLQ before shipping production code almost always correct?
- Name one workload where SQS standard is appropriate and one where you should pick SQS FIFO.
Mini Drill or Application
Design the queue for an image-resize worker. In 15 minutes:
- Pick broker family and variant.
- Set visibility timeout with justification.
- Set max-receives and describe DLQ disposition.
- Describe the consumer loop including ack placement.
- Identify the idempotency key used to dedup a re-delivered resize request.
Transfer to Adjacent Domains
- Log-based brokers (Concept 08). Queues and logs solve different problems. Mapping one to the other produces subtly broken systems: using Kafka as a queue (no replay value, expensive per-message bookkeeping) or using RabbitMQ as a log (no replay, no multi-subscriber independence).
- Idempotency (Concept 12). At-least-once delivery is the reason consumer idempotency is non-negotiable. DLQs handle poison messages; idempotency handles duplicates. Both are required; neither is sufficient alone.
- Scalability engineering. Queue length and oldest-unacked-age are first-class SLIs. Autoscaling worker count on queue depth is the canonical use case for message-queue-driven HPA/KEDA.
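- Serverless architecture. SQS + Lambda is the default point-to-point setup for serverless workflows. The queue's visibility timeout has to be sized against the Lambda function timeout (AWS recommends at least six times the function timeout), and the DLQ is configured on the source queue's redrive policy rather than on the function. The semantics in this concept map directly.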
- Enterprise integration / JMS legacy. Many enterprises still run ActiveMQ/IBM MQ with decade-old message contracts. The semantics in Concept 07 are invariant across those brokers; the migration concerns are wire format and operational, not semantic.
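The queue-depth autoscaling mentioned under scalability engineering reduces to a small calculation: divide the backlog by the number of messages one worker is expected to absorb, then clamp. The sketch below is in the spirit of HPA/KEDA-style scalers, but the function name, defaults, and bounds are assumptions, and a real autoscaler's exact formula and smoothing may differ.

```python
import math


def desired_workers(
    queue_depth: int,
    target_per_worker: int,
    min_workers: int = 1,
    max_workers: int = 50,
) -> int:
    """Scale worker count to queue depth, clamped to [min_workers, max_workers]."""
    if target_per_worker < 1:
        raise ValueError("target_per_worker must be at least 1")
    raw = math.ceil(queue_depth / target_per_worker)
    return max(min_workers, min(max_workers, raw))


idle = desired_workers(queue_depth=0, target_per_worker=100)
busy = desired_workers(queue_depth=950, target_per_worker=100)
flooded = desired_workers(queue_depth=100_000, target_per_worker=100)
```

Scaling on depth alone can lag when messages are slow to process, which is why oldest-unacked-age is worth tracking alongside it.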
Read This Only If Stuck
- Richards & Ford: Asynchronous Capabilities -- the async posture all three families share
- Richards & Ford: Preventing Data Loss -- durability and DLQ design
- Richards & Ford: Event-Driven Architecture Style -- where queues sit in broker topology
- System Design Primer: Asynchronism -- queue vs task-queue vs back-pressure framing
- System Design Primer: Availability patterns -- how queues absorb downstream outages
- AWS SQS Developer Guide -- canonical hosted-queue reference, especially visibility timeout and DLQ sections
- AWS SQS: Visibility timeout -- the single most commonly mis-tuned knob
- AWS SQS: Dead-letter queues -- how the DLQ is actually configured and monitored
- RabbitMQ: AMQP 0-9-1 Model -- the exchange/queue/binding model that underpins AMQP brokers
- Enterprise Integration Patterns: Dead Letter Channel -- canonical definition of DLQ semantics
- Enterprise Integration Patterns: Competing Consumers -- the work-distribution model queues exist to serve