RabbitMQ 4.2 vs Kafka 4.2 (KRaft) vs NATS 2.12 (JetStream)
I started using NATS in one of my projects and was generally happy with it, but I wanted to verify the performance claims for myself. Is it really as fast as people say, or is that just marketing and cherry-picked benchmarks? The best way to find out was to write my own tests and compare NATS against the two most common alternatives: RabbitMQ and Kafka.
This post covers throughput testing of all three brokers on two messaging patterns: an async producer-consumer queue and request-reply. Request-reply is not the typical use case for message brokers, but NATS supports it natively, so it was worth measuring how the others perform when forced into that pattern.
All three brokers ran in Docker containers on the same host. No custom tuning was applied to any broker: default configurations only.
| Broker | Docker Image | Configuration |
|---|---|---|
| RabbitMQ | rabbitmq:4.2-management | Default settings, AMQP 0.9.1 |
| Kafka | apache/kafka:4.2.0 | KRaft mode, single node, 1 partition |
| NATS | nats:2.12-alpine | JetStream enabled (-js) |
Measured via docker stats on freshly started containers with no accumulated data or active connections.
Kafka's JVM-based architecture is immediately visible: 54x the memory of NATS and 2.7x that of RabbitMQ on cold start. NATS is the lightest at 6 MiB.
The benchmarks run with InvocationCount = 1 and UnrollFactor = 1 (each iteration is a single benchmark call) and RunStrategy = Monitoring. The message counts and payload sizes were chosen to cover two dimensions: the number of concurrent messages the broker must route, and the size of individual payloads. Counts are inversely proportional to payload size to keep total benchmark runtime within a few minutes per scenario while still loading the broker enough to reveal its throughput characteristics.
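As a sketch, that setup maps onto BenchmarkDotNet attributes roughly like this, assuming a recent BenchmarkDotNet version where SimpleJob exposes invocationCount (the class and parameter names are illustrative, not the actual benchmark source):

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Engines;

// Monitoring strategy with a single invocation per iteration: each
// measurement is one full publish/consume run, with no loop unrolling.
[SimpleJob(RunStrategy.Monitoring, invocationCount: 1)]
[MemoryDiagnoser] // reports the managed-memory allocations shown later
public class QueueThroughputBenchmark
{
    [Params(50_000, 25_000, 10_000, 5_000, 2_500)]
    public int MessageCount;

    [Benchmark]
    public Task PublishAndConsume()
    {
        // Broker-specific publish/consume cycle goes here.
        return Task.CompletedTask;
    }
}
```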
Async queue scenarios:
| Messages | Payload | Total Volume |
|---|---|---|
| 50,000 | 256 B | 12.8 MB |
| 25,000 | 1 KB | 25 MB |
| 10,000 | 4 KB | 40 MB |
| 5,000 | 64 KB | 327 MB |
| 2,500 | 128 KB | 335 MB |
Request-reply scenarios:
| Messages | Payload | Total Volume |
|---|---|---|
| 25,000 | 256 B | 6.4 MB |
| 10,000 | 1 KB | 10 MB |
| 5,000 | 4 KB | 20 MB |
The async pattern uses more publishers (250 vs 150) and reaches larger payloads because bulk throughput is the primary concern. Request-reply uses fewer messages and smaller payloads, reflecting the typical RPC use case where latency matters more than volume.
All three implementations follow the same structure: N publishers concurrently push messages into a queue/topic/stream, one consumer reads everything. The benchmark measures wall-clock time from the first publish to the last received message.
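The shared shape looks roughly like this (PublishBatch and ConsumeUntil are hypothetical stand-ins for the broker-specific code):

```csharp
using System.Diagnostics;

// N concurrent publishers, one consumer draining everything; the wall
// clock runs from the first publish to the last received message.
var sw = Stopwatch.StartNew();

var publishers = Enumerable.Range(0, publisherCount)
    .Select(_ => Task.Run(() => PublishBatch(messageCount / publisherCount)))
    .ToList();
var consumer = Task.Run(() => ConsumeUntil(messageCount));

await Task.WhenAll(publishers.Append(consumer));
sw.Stop(); // sw.Elapsed is the measured completion time
```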
RabbitMQ (RabbitMQ.Client v7.2.1): persistent messages, QoS prefetch = 100, manual ACK, separate connections for publisher and consumer.
Kafka (Confluent.Kafka v2.13.2): idempotent producer, 1 GB write buffer, manual offset commit, single partition, consumer group ID randomized per iteration.
NATS JetStream (NATS.Net v2.7.3): file-backed stream, workqueue retention, async persistence, explicit ACK, 1 GB writer buffer.
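For illustration, the NATS JetStream side of that configuration looks roughly like this with the NATS.Net v2 API (stream and subject names are made up for the example):

```csharp
using NATS.Client.Core;
using NATS.Client.JetStream;
using NATS.Client.JetStream.Models;

await using var nats = new NatsConnection();
var js = new NatsJSContext(nats);

// File-backed work queue: messages are persisted to disk and removed
// from the stream once a consumer acknowledges them.
await js.CreateStreamAsync(new StreamConfig(name: "bench", subjects: new[] { "bench.>" })
{
    Storage = StreamConfigStorage.File,
    Retention = StreamConfigRetention.Workqueue,
});
```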
NATS has native request-reply: RequestAsync sends a message and returns a response in a single call. The broker handles response routing internally.
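In NATS.Net that is a single call (the subject name here is illustrative):

```csharp
using NATS.Client.Core;

await using var nats = new NatsConnection();

// One round-trip: publish to "svc.echo" and await the reply, which the
// broker routes back over an internal inbox subject.
var reply = await nats.RequestAsync<byte[], byte[]>("svc.echo", payload);
```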
RabbitMQ and Kafka lack this primitive. For both, request-reply was implemented via correlation IDs: each pending request is parked as a TaskCompletionSource in a ConcurrentDictionary, keyed by correlation ID, and the reply consumer completes it. Each "request" in RabbitMQ/Kafka therefore involves 4 broker operations (publish request, consume request, publish reply, consume reply) vs 1 round-trip in NATS.
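A minimal sketch of that correlation-ID emulation, with the broker-specific publish and subscribe calls elided (Publish and the topic name are hypothetical):

```csharp
using System.Collections.Concurrent;

// Pending requests parked by correlation ID until the reply arrives.
var pending = new ConcurrentDictionary<string, TaskCompletionSource<byte[]>>();

Task<byte[]> RequestAsync(byte[] payload)
{
    var correlationId = Guid.NewGuid().ToString("N");
    var tcs = new TaskCompletionSource<byte[]>(
        TaskCreationOptions.RunContinuationsAsynchronously);
    pending[correlationId] = tcs;

    Publish("requests", correlationId, payload); // broker-specific publish
    return tcs.Task;                             // completes when the reply lands
}

// Called by the reply consumer for every message on the reply queue/topic.
void OnReply(string correlationId, byte[] body)
{
    if (pending.TryRemove(correlationId, out var tcs))
        tcs.TrySetResult(body);
}
```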
All values are P95 (95th percentile) completion time in milliseconds. Lower is better. Ratio columns show time relative to NATS JetStream (baseline).
| Scenario | RabbitMQ | Kafka | NATS JS | Rabbit / NATS | Kafka / NATS |
|---|---|---|---|---|---|
| 50K x 256 B | 1,521 | 35,856 | 944 | 1.61 | 38.98 |
| 25K x 1 KB | 905 | 18,629 | 511 | 1.77 | 36.46 |
| 10K x 4 KB | 442 | 8,329 | 256 | 1.73 | 32.54 |
| 5K x 64 KB | 534 | 7,496 | 878 | 0.61 | 8.54 |
| 2.5K x 128 KB | 690 | 7,162 | 735 | 0.94 | 9.74 |
Throughput, messages per second (higher is better):
| Scenario | RabbitMQ | Kafka | NATS JS |
|---|---|---|---|
| 50K x 256 B | 32,873 | 1,394 | 52,966 |
| 25K x 1 KB | 27,624 | 1,342 | 48,924 |
| 10K x 4 KB | 22,624 | 1,201 | 39,063 |
| 5K x 64 KB | 9,363 | 667 | 5,695 |
| 2.5K x 128 KB | 3,623 | 349 | 3,401 |
On small to medium payloads (up to 4 KB), NATS JetStream processes messages 1.6-1.8x faster than RabbitMQ at P95. On large payloads (64 KB+), RabbitMQ takes the lead at 61-94% of NATS's time. RabbitMQ allocates 7-12 MB managed memory for large payloads, while NATS allocates 368-401 MB. AMQP framing is more efficient for large contiguous payloads.
Kafka is 9-38x slower than NATS at P95. This is expected: Kafka's commit log architecture adds overhead that only pays off with horizontal scaling across multiple partitions and nodes.
Managed memory allocated per scenario:
| Scenario | RabbitMQ | Kafka | NATS JS |
|---|---|---|---|
| 50K x 256 B | 106 MB | 115 MB | 678 MB |
| 25K x 1 KB | 54 MB | 76 MB | 342 MB |
| 10K x 4 KB | 22 MB | 60 MB | 205 MB |
| 5K x 64 KB | 12 MB | 323 MB | 401 MB |
| 2.5K x 128 KB | 7 MB | 318 MB | 368 MB |
RabbitMQ consistently uses the least managed memory. NATS allocates significantly more due to the 1 GB writer buffer configuration. Kafka's allocations spike with large payloads (318-323 MB) due to its own producer buffer (QueueBufferingMaxKbytes = 1 GB).
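The Kafka producer settings mentioned above map to Confluent.Kafka configuration roughly like this (the broker address is illustrative):

```csharp
using Confluent.Kafka;

var config = new ProducerConfig
{
    BootstrapServers = "localhost:9092",
    EnableIdempotence = true,            // idempotent producer
    QueueBufferingMaxKbytes = 1_048_576, // 1 GB in-memory producer buffer
};

using var producer = new ProducerBuilder<Null, byte[]>(config).Build();
```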
All values are P95 completion time. Ratio columns show time relative to NATS (baseline).
| Scenario | RabbitMQ | Kafka | NATS | Rabbit / NATS | Kafka / NATS |
|---|---|---|---|---|---|
| 25K x 256 B | 41,450 | 36,572 | 397 | 104.41 | 92.12 |
| 10K x 1 KB | 21,434 | 15,113 | 226 | 94.84 | 66.87 |
| 5K x 4 KB | 12,231 | 7,339 | 159 | 76.92 | 46.16 |
Throughput, messages per second (higher is better):
| Scenario | RabbitMQ | Kafka | NATS |
|---|---|---|---|
| 25K x 256 B | 603 | 684 | 62,972 |
| 10K x 1 KB | 467 | 662 | 44,248 |
| 5K x 4 KB | 409 | 681 | 31,447 |
NATS is 46-92x faster than Kafka and 77-104x faster than RabbitMQ at P95. This is the difference between a native protocol primitive (one network round-trip) and an application-level emulation (four broker operations per request).
RabbitMQ is the slowest in all request-reply scenarios, with P95 growing with message count: 12.2s for 5K messages, 21.4s for 10K, 41.4s for 25K. That works out to roughly 1.7-2.4 ms per message, dominated by the ACK cycle on both request and reply queues.
Kafka also shows high tail latency: P95 reaches 36.6s on the 25K scenario (Mean is 23.3s), indicating consumer group coordination and offset management overhead amplified in what is effectively a synchronous request pattern.
For new projects that need a general-purpose message broker, NATS is the most practical starting point.
It provides a feature set comparable to Kafka: persistence with replay, exactly-once delivery, stream processing primitives, key-value and object stores. At the same time, its throughput on small-to-medium payloads matches or exceeds RabbitMQ, and it handles request-reply 46-104x faster than either alternative at P95 thanks to native protocol support.
The operational cost is also lower. A single binary with one flag gives you a persistent, JetStream-enabled broker consuming 6 MiB of RAM on cold start. Compare that to Kafka's 327 MiB.
RabbitMQ remains a strong choice when the workload is primarily large payloads (64 KB+) or when the team has deep AMQP expertise. Kafka is still the right tool for large-scale event streaming, CDC pipelines, and scenarios where partition-based parallelism and the Connect/Streams ecosystem matter.
But as a default choice for a new distributed system? NATS delivers Kafka-class features at RabbitMQ-class speed, with less operational overhead than either.