Queue Semantics: JMS, AMQP, SQS

What This Concept Is

A classical message broker is a server that holds messages in queues and hands each message to one consumer at a time. A message counts as consumed once the consumer acknowledges it, and the broker then removes it. The broker is the authoritative bookkeeper of who got what.

Three canonical families share this model:

Family | Broker examples | Wire protocol / API
JMS (Java Message Service) | ActiveMQ, HornetQ, Artemis, IBM MQ | Java API; brokers speak OpenWire, STOMP, AMQP
AMQP (Advanced Message Queuing Protocol) | RabbitMQ, Azure Service Bus | Wire-level AMQP 0-9-1 or 1.0
SQS-style hosted queues | AWS SQS (standard + FIFO), Azure Storage Queues, Google Pub/Sub (partial) | HTTP API, managed

They differ in wire format and admin model, but the semantic behavior is largely the same, and that behavior is the thing you must understand.

Core Semantics

1. Delivery and acknowledgement

producer --> broker [msg] --> consumer
               ^                 |
               |                 v
               +------ ACK ------+    (msg removed from queue)

If consumer fails before ACK:
    broker redelivers msg after visibility timeout / redelivery delay
  • At-least-once by default. The broker re-delivers if an ack is missing. Consumers must be idempotent (Concept 12).
  • The broker retains the message until acked. This is the big operational contract: the queue is the system of record for the pending task.
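
The ack contract above can be sketched with a toy in-memory model. This is an illustration under stated assumptions, not a real broker client: `ToyBroker`, `handle`, and the message ids are hypothetical names. It shows the message staying with the broker until acked, and an idempotent handler absorbing the redelivery.

```python
from collections import OrderedDict


class ToyBroker:
    """Hypothetical in-memory stand-in for a queue broker (illustration only)."""

    def __init__(self):
        self._pending = OrderedDict()  # msg_id -> body, retained until acked

    def send(self, msg_id, body):
        self._pending[msg_id] = body

    def receive(self):
        # Hand out the oldest pending message without removing it.
        for msg_id, body in self._pending.items():
            return msg_id, body
        return None

    def ack(self, msg_id):
        # Only an ack removes the message from the queue.
        self._pending.pop(msg_id, None)

    def depth(self):
        return len(self._pending)


processed_ids = set()  # idempotency record; a real system would persist this
side_effects = []      # stands in for the real work (charging a card, etc.)


def handle(msg_id, body):
    """Idempotent handler: a redelivered message is recognized and skipped."""
    if msg_id in processed_ids:
        return
    side_effects.append(body)
    processed_ids.add(msg_id)


broker = ToyBroker()
broker.send("m1", "charge card 42")

# First delivery: the work happens, but the consumer dies before acking.
msg_id, body = broker.receive()
handle(msg_id, body)

# Redelivery: the broker still holds the message, so it is handed out again.
msg_id, body = broker.receive()
handle(msg_id, body)  # the duplicate is absorbed
broker.ack(msg_id)
```

After the run, `side_effects` holds the charge once and the broker is empty. Without the `processed_ids` check, the same card would have been charged twice.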

2. Visibility timeout / unacked retention

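When a consumer receives a message, it becomes invisible to other consumers (SQS: for the "visibility timeout" window; RabbitMQ: while it is "unacked"). If the consumer crashes or the window expires before the ack, the message reappears for another consumer. In RabbitMQ, requeueing is triggered when the consumer's channel or connection closes.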

  • Set it longer than your worst-case processing time.
  • Extend it explicitly for long tasks (ChangeMessageVisibility in SQS).
  • Too short -> duplicate processing; too long -> slow recovery from dead consumers.
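
The visibility window can be modeled in a few lines. The sketch below is a hypothetical in-memory model (`VisibilityQueue` and its methods are invented for illustration) with the clock passed in explicitly, so the timeline is easy to follow. `extend` mirrors the behavior of SQS ChangeMessageVisibility, where the new window starts from the time of the call.

```python
class VisibilityQueue:
    """Hypothetical in-memory model of visibility-timeout semantics."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # msg_id -> time at which it becomes visible again

    def send(self, msg_id, now):
        self._messages[msg_id] = now  # visible immediately

    def receive(self, now):
        # Return the first visible message and hide it for the timeout window.
        for msg_id, visible_at in self._messages.items():
            if visible_at <= now:
                self._messages[msg_id] = now + self.visibility_timeout
                return msg_id
        return None

    def extend(self, msg_id, now, extra_seconds):
        # Restart the invisibility window from the current time.
        self._messages[msg_id] = now + extra_seconds

    def delete(self, msg_id):
        # The ack: the message is gone for good.
        self._messages.pop(msg_id, None)


q = VisibilityQueue(visibility_timeout=30)
q.send("job-1", now=0)

first = q.receive(now=0)                     # consumer A takes the message
hidden = q.receive(now=10)                   # consumer B sees nothing yet
q.extend("job-1", now=20, extra_seconds=60)  # A asks for more time
still_hidden = q.receive(now=40)             # past the original 30 s, still hidden
redelivered = q.receive(now=80)              # A never deleted it, so it reappears
```

The last line is the recovery path: consumer A never acked, so once the extended window ends the message is handed out again.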

3. Dead-letter queue (DLQ)

Messages that repeatedly fail (that is, exceed a max-receive-count) get shunted to a dead-letter queue. The DLQ is the single most important operational tool for keeping a queue healthy:

  • without a DLQ: one poison message can block a FIFO queue or churn CPU forever
  • with a DLQ: poison messages are parked for investigation, the main queue drains normally

You have not finished designing a queue until you have designed its DLQ handling.
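
For SQS specifically, the DLQ is attached to the source queue through a `RedrivePolicy` attribute. The sketch below only builds that attribute; `redrive_attributes` is a hypothetical helper, and the ARN and the threshold of 5 are placeholders.

```python
import json


def redrive_attributes(dlq_arn, max_receive_count):
    """Build the queue attribute that attaches a DLQ to a source queue."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    }
    # SQS expects the policy as a JSON string under the RedrivePolicy attribute.
    return {"RedrivePolicy": json.dumps(policy)}


# Placeholder ARN and threshold, for illustration only.
attrs = redrive_attributes("arn:aws:sqs:us-east-1:123456789012:resize-dlq", 5)

# With boto3 this would be applied roughly as:
#   boto3.client("sqs").set_queue_attributes(QueueUrl=source_queue_url, Attributes=attrs)
```

Choosing the threshold is a trade-off: a low value parks transiently failing messages too eagerly, while a high value lets a poison message burn many processing attempts first.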

4. Delivery order

Varies by broker:

Broker | Default order | Strict-order option
RabbitMQ classic queue | Best-effort FIFO within queue | Single consumer, single queue
SQS standard | Unordered (expressly) | Use SQS FIFO (with message group IDs)
SQS FIFO | Per-group FIFO + exactly-once ingest | -
JMS queue | FIFO per queue | -
Azure Service Bus | FIFO with sessions | -

When order matters, either choose a FIFO variant or partition the work by key so that per-key order is preserved (the same trick Kafka uses; see Concept 09).
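
Per-key partitioning can be sketched as a stable hash from key to queue. The queue names and the `partition_for` helper below are hypothetical, and the sketch assumes one consumer per partition queue, since that is what preserves order within a partition.

```python
import hashlib


def partition_for(key, partition_count):
    """Map a key to a stable partition index in [0, partition_count)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical queue names: one queue (with one consumer) per partition.
queues = [f"orders-{i}" for i in range(4)]

events = [
    ("customer-7", "created"),
    ("customer-9", "created"),
    ("customer-7", "paid"),
]

# Every event for the same key lands on the same queue, in send order.
routed = [
    (queues[partition_for(key, len(queues))], key, action)
    for key, action in events
]
```

A cryptographic digest is used rather than Python's built-in `hash()` because string hashing is randomized per process by default, which would route the same key differently across producers.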

Why It Matters Here

Queue-based brokers are the right substrate when:

  • the workload is tasks, not events: "charge this card," "resize this image," "send this email"
  • one consumer should perform each task (point-to-point, Concept 04)
  • failures should be handled per-message (redelivery, DLQ, retry policies)
  • you want strong operational features (delay queues, priority queues, per-message TTL)

They are the wrong substrate when:

  • you want many independent consumers to read the same stream at their own pace (log-based brokers, Concept 08)
  • you want to replay history from hours or days ago (queues typically purge on ack)
  • you want a compact audit record; queues are transient buffers

Concrete Example: An SQS Worker Loop

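import boto3

sqs = boto3.client("sqs")
# URL, process, PoisonError and TransientError are assumed to be defined elsewhere.

while True:
    resp = sqs.receive_message(
        QueueUrl=URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long-polling
        VisibilityTimeout=120,
    )
    # The "Messages" key is absent when the poll returns nothing.
    for m in resp.get("Messages", []):
        try:
            process(m["Body"])
            # delete_message is the ack; only reached if process() succeeded
            sqs.delete_message(QueueUrl=URL, ReceiptHandle=m["ReceiptHandle"])
        except PoisonError:
            # do nothing; visibility will expire, DLQ will catch it
            pass
        except TransientError:
            # extend so we finish the next attempt
            # (the call takes QueueUrl, ReceiptHandle and VisibilityTimeout)
            sqs.change_message_visibility(...)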

Things that are doing work silently:

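  • long-polling (WaitTimeSeconds) cuts the cost of polling an empty queue: each call waits up to 20 seconds for a message instead of returning immediately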
  • visibility timeout gives you recovery against consumer crash
  • the DLQ (configured at the queue level) catches messages whose receive-count exceeds a limit
  • delete_message is the ack; missing it means redelivery
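
Because the delete is the ack, its position relative to the work decides what a crash costs. The toy simulation below (all names hypothetical, no real broker involved) compares acking before the work with acking after it when the consumer dies mid-task.

```python
def run_consumer(ack_before_work):
    """Return what the broker still holds after a consumer crashes mid-work."""
    pending = ["resize image 17"]  # messages the broker still holds

    def ack(message):
        pending.remove(message)

    def do_work(message):
        raise RuntimeError("consumer crashed mid-work")

    message = pending[0]  # delivery: the message stays pending until acked
    try:
        if ack_before_work:
            ack(message)
            do_work(message)
        else:
            do_work(message)
            ack(message)
    except RuntimeError:
        pass  # the consumer process is gone; only the broker's state survives

    return pending


lost = run_consumer(ack_before_work=True)    # nothing left to redeliver
kept = run_consumer(ack_before_work=False)   # still pending, will be redelivered
```

Acking first gives at-most-once behavior and loses the task on a crash. Acking last gives at-least-once behavior, which is why the worker loop above deletes only after `process` returns.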

Common Confusion / Misconception

"SQS guarantees order." Only the FIFO variant, and only within a MessageGroupId. Standard SQS is explicitly unordered and duplicates are possible.

"Acknowledging is the same as 'done'." It is the broker's view of "done." If your consumer acks before the work completes, a crash mid-work loses the message with no redelivery.

"DLQs are for debugging; we will add one later." You will not. Poison messages will eat your queue before you add it.

"A queue is just a pipe." A queue is a durable buffer with state (unacked, visibility timeout, redelivery count, DLQ). Treating it as a stateless pipe leads to the common pathology of a "fine" queue that is secretly storing 40 GB of unacked messages.

"RabbitMQ is a queue; Kafka is a queue." RabbitMQ is a queue. Kafka is a log (Concept 08). The operational and semantic differences are large.

How To Use It

Per-queue design checklist:

  1. Purpose: tasks, events (with single consumer), or inbound to a DLQ-capable workflow?
  2. Delivery: at-least-once is the default. Plan consumer idempotency.
  3. Ordering: required? If yes, FIFO variant or per-key partitioning.
  4. Visibility timeout: longer than the 99th-percentile processing time, plus margin.
  5. DLQ: max-receives threshold, where DLQ messages go, who watches it.
  6. Throughput: consumer count, batch size, broker limits.
  7. Retention: message TTL if expiry matters; unacked retention if consumers are slow.
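
One way to keep the checklist honest is to record the decisions as data and validate them. The sketch below is a hypothetical structure, not a standard tool: the field names and the specific checks are assumptions drawn from items 2 to 5 above.

```python
from dataclasses import dataclass


@dataclass
class QueueDesign:
    """Hypothetical record of the checklist decisions for one queue."""

    purpose: str
    ordering_required: bool
    fifo_or_partitioned: bool
    p99_processing_seconds: float
    visibility_timeout_seconds: float
    max_receives: int
    dlq_owner: str
    consumers_idempotent: bool

    def problems(self):
        """Return the checklist items this design fails."""
        found = []
        if not self.consumers_idempotent:
            found.append("delivery: at-least-once requires idempotent consumers")
        if self.ordering_required and not self.fifo_or_partitioned:
            found.append("ordering: needs a FIFO variant or per-key partitioning")
        if self.visibility_timeout_seconds <= self.p99_processing_seconds:
            found.append("visibility timeout: must exceed p99 processing time")
        if self.max_receives < 1 or not self.dlq_owner:
            found.append("DLQ: needs a max-receives threshold and an owner")
        return found


# Example values are invented for illustration.
design = QueueDesign(
    purpose="image resize tasks",
    ordering_required=False,
    fifo_or_partitioned=False,
    p99_processing_seconds=45,
    visibility_timeout_seconds=30,
    max_receives=5,
    dlq_owner="",
    consumers_idempotent=True,
)
issues = design.problems()
```

Here `issues` flags two items: the visibility timeout is shorter than the p99 processing time, and nobody owns the DLQ.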

Check Yourself

  1. Why does at-least-once delivery force consumers to be idempotent? What happens if they are not?
  2. What is the correct relationship between visibility timeout and maximum processing time?
  3. Why is adding a DLQ before shipping production code almost always correct?
  4. Name one workload where SQS standard is appropriate and one where you should pick SQS FIFO.

Mini Drill or Application

Design the queue for an image-resize worker. In 15 minutes:

  1. Pick broker family and variant.
  2. Set visibility timeout with justification.
  3. Set max-receives and describe DLQ disposition.
  4. Describe the consumer loop including ack placement.
  5. Identify the idempotency key used to dedup a re-delivered resize request.

Transfer to Adjacent Domains

  • Log-based brokers (Concept 08). Queues and logs solve different problems. Mapping one to the other produces subtly broken systems: using Kafka as a queue (no replay value, expensive per-message bookkeeping) or using RabbitMQ as a log (no replay, no multi-subscriber independence).
  • Idempotency (Concept 12). At-least-once delivery is the reason consumer idempotency is non-negotiable. DLQs handle poison messages; idempotency handles duplicates. Both are required; neither is sufficient alone.
  • Scalability engineering. Queue length and oldest-unacked-age are first-class SLIs. Autoscaling worker count on queue depth is the canonical use case for message-queue-driven HPA/KEDA.
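  • Serverless architecture. SQS + Lambda is the default point-to-point setup for serverless workflows. The queue's visibility timeout has to be sized against the function timeout (AWS recommends at least six times the function timeout), and the DLQ is configured on the source queue through its redrive policy rather than on the function. The semantics in this concept map directly.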
  • Enterprise integration / JMS legacy. Many enterprises still run ActiveMQ/IBM MQ with decade-old message contracts. The semantics in Concept 07 are invariant across those brokers; the migration concerns are wire format and operational, not semantic.
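
The queue-depth autoscaling mentioned under scalability engineering can be sketched as a backlog-per-worker rule. This is a simplified illustration, not the exact algorithm used by HPA or KEDA; `desired_workers` and its default bounds are assumptions.

```python
import math


def desired_workers(queue_depth, target_backlog_per_worker,
                    min_workers=1, max_workers=50):
    """Size the pool so each worker owns about target_backlog_per_worker messages."""
    if target_backlog_per_worker <= 0:
        raise ValueError("target_backlog_per_worker must be positive")
    wanted = math.ceil(queue_depth / target_backlog_per_worker)
    # Clamp so an empty queue keeps a floor and a spike cannot scale without limit.
    return max(min_workers, min(max_workers, wanted))


idle = desired_workers(0, 100)         # empty queue: falls back to the floor
busy = desired_workers(950, 100)       # 9.5 rounds up
spike = desired_workers(100_000, 100)  # capped at the ceiling
```

Queue depth alone can mislead when message cost varies, which is why oldest-unacked-age is worth tracking alongside it.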

Read This Only If Stuck