
BlazingMQ Performance

Introduction

Benchmarking a complex distributed system like BlazingMQ is tricky because it has many moving parts: there is no single latency number for BlazingMQ, and any performance figures are affected by factors including, but not limited to:

  • Size of the BlazingMQ cluster and replication factor

  • Storage (no persistence, in-memory, HDD, SSD, etc.)

  • Hardware and network capabilities

  • Producer message rate, message size and batch size

  • Producer publishing pattern (smooth vs bursty)

  • Number of consumers (fan-out ratio)

  • Network topology (location of producers, consumers, cluster nodes, etc.), which has a direct impact on ping latency among nodes

  • BlazingMQ message broker configuration (number of threads, initial sizes of various memory pools, etc.)

Benchmarking Setup

We provide the results of a recent BlazingMQ benchmark with the following setup:

  • Six nodes in the BlazingMQ cluster

  • Each node with local SSDs attached to it

  • Nodes geographically distributed such that the ping latency between them was around 1.5 milliseconds

  • Producer and consumer applications ran on other machines and, instead of connecting directly to the BlazingMQ cluster, connected to BlazingMQ proxies (see Alternative Deployment for details about BlazingMQ proxies).

  • Queue storage replication was configured for strong consistency, i.e., the primary node waited for acknowledgements from enough replicas that a majority of the nodes had recorded the message before returning success to the producer. In this six-node cluster, the primary waited for acknowledgements from three replica nodes before replying to the producer, ensuring that a total of four nodes (the primary and three replicas) recorded each message.
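The quorum arithmetic above can be sketched as follows. This is an illustrative calculation, not BlazingMQ code; the function names are hypothetical:

```python
# Hypothetical illustration of the strong-consistency quorum described above.
# For a cluster of N nodes, a write is acknowledged once a majority
# (N // 2 + 1) of nodes have recorded it: the primary plus enough replicas.

def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1

def replicas_to_wait_for(cluster_size: int) -> int:
    """Replica acks the primary must collect before replying to the producer
    (the primary itself counts toward the quorum)."""
    return quorum_size(cluster_size) - 1

# Six-node cluster from this benchmark: the majority is 4 nodes,
# so the primary waits for 3 replica acks.
print(quorum_size(6), replicas_to_wait_for(6))   # 4 3
```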

  • Queues with various routing strategies were tested.

It is worth mentioning that the cluster setup described above is an extreme one, and the geographical distance between cluster nodes becomes a non-negligible factor in the final results. We provide benchmarking results from a friendlier setup later in the article.

How to interpret the tables below:

  • Each latency number is the difference between the time a message was sent to the queue by the producer and the time it was received by the consumer. All latency numbers are in milliseconds.

  • Message size was 1KB and compression was disabled.

  • In the Scenario column, the letters Q, P, and C represent a queue, a producer, and a consumer respectively. For example:
    • 1Q, 1P, 1C means one queue, one producer, and one consumer.
    • 10Q, 10P, 10C means 10 queues, 10 producers, and 10 consumers, with one producer and one consumer for each queue.
    • 1Q, 1P, 5C means one queue with one producer and five consumers, where each consumer receives every message posted to the queue.
  • The second column, “Total Produce Rate”, indicates the total produce rate accumulated across all producers running in that scenario.

  • Similarly, “Total Consume Rate” indicates the total consume rate accumulated across all consumers running in that scenario.
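As a rough sketch of how end-to-end latency figures like those below can be collected: the producer stamps each message with a send time, the consumer records the receive time, and percentiles are computed over the deltas. This is an illustrative harness, not BlazingMQ's benchmark tooling, and all names in it are hypothetical:

```python
# Illustrative latency-measurement harness (not BlazingMQ's benchmark tool).
# In a real benchmark, producer and consumer run on different hosts, so their
# clocks must be synchronized; here both callbacks run in one process.
import statistics
import time

send_times: dict[int, float] = {}   # message id -> producer send timestamp
latencies_ms: list[float] = []      # one-way latencies in milliseconds

def on_produce(msg_id: int) -> None:
    send_times[msg_id] = time.monotonic()

def on_consume(msg_id: int) -> None:
    latencies_ms.append((time.monotonic() - send_times.pop(msg_id)) * 1000.0)

def summarize(samples: list[float]) -> dict[str, float]:
    """p50/average/p90/p99, matching the table columns below."""
    qs = statistics.quantiles(samples, n=100)   # qs[k-1] is the k-th percentile
    return {
        "p50": statistics.median(samples),
        "avg": statistics.fmean(samples),
        "p90": qs[89],
        "p99": qs[98],
    }
```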

Priority Routing Strategy

| Scenario | Total Produce Rate (msgs/sec) | Total Consume Rate (msgs/sec) | Median (p50) | Average | p90 | p99 |
|---|---|---|---|---|---|---|
| 1Q, 1P, 1C | 60,000 | 60,000 | 4.5 | 5.4 | 7.3 | 19.8 |
| 10Q, 10P, 10C | 120,000 | 120,000 | 3.9 | 12.0 | 6.8 | 201.9 |
| 50Q, 50P, 50C | 100,000 | 100,000 | 3.9 | 7.3 | 8.7 | 87.9 |
| 100Q, 100P, 100C | 100,000 | 100,000 | 4.7 | 12.4 | 20.3 | 165.6 |

Fan-out Routing Strategy

| Scenario | Total Produce Rate (msgs/sec) | Total Consume Rate (msgs/sec) | Median (p50) | Average | p90 | p99 |
|---|---|---|---|---|---|---|
| 1Q, 1P, 5C | 20,000 | 100,000 | 4.4 | 4.7 | 4.8 | 8.2 |
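The fan-out arithmetic behind these numbers is easy to verify: each of the C consumers receives every message, so the total consume rate is the produce rate multiplied by C. A minimal check (illustrative only; the function name is hypothetical):

```python
# With one producer and C consumers each receiving every message,
# total consume rate = produce rate * C.
def total_consume_rate(produce_rate: int, consumers: int) -> int:
    return produce_rate * consumers

# Fan-out row above: 20,000 msgs/sec produced, 5 consumers.
print(total_consume_rate(20_000, 5))   # 100000
```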

Broadcast Routing Strategy

| Scenario | Total Produce Rate (msgs/sec) | Total Consume Rate (msgs/sec) | Median (p50) | Average | p90 | p99 |
|---|---|---|---|---|---|---|
| 1Q, 1P, 1C | 160,000 | 160,000 | 4.5 | 5.1 | 5.5 | 8.8 |
| 1Q, 1P, 5C | 110,000 | 550,000 | 3.7 | 3.8 | 4.1 | 7.2 |
| 1Q, 1P, 10C | 20,000 | 200,000 | 2.8 | 3.0 | 3.6 | 5.4 |

Benchmarking in Friendly Setup

We also ran our BlazingMQ benchmarks in a friendlier setup, which differed from the setup above in the following ways:

  • Three nodes in the BlazingMQ cluster

  • Nodes located close enough together that the ping latency between them was less than 30 microseconds.

  • The primary node waited for an acknowledgement from one replica before replying to the producer, ensuring that at least two nodes (the primary and one replica) had recorded the message.

Priority Routing Strategy

| Scenario | Total Produce Rate (msgs/sec) | Total Consume Rate (msgs/sec) | Median (p50) | Average | p90 | p99 |
|---|---|---|---|---|---|---|
| 1Q, 1P, 1C | 60,000 | 60,000 | 1.5 | 1.7 | 2.1 | 4.9 |
| 10Q, 10P, 10C | 120,000 | 120,000 | 1.1 | 2.1 | 2.7 | 35.7 |
| 50Q, 50P, 50C | 100,000 | 100,000 | 0.7 | 1.5 | 1.1 | 23.1 |
| 100Q, 100P, 100C | 100,000 | 100,000 | 0.9 | 2.5 | 2.7 | 43.8 |

Fan-out Routing Strategy

| Scenario | Total Produce Rate (msgs/sec) | Total Consume Rate (msgs/sec) | Median (p50) | Average | p90 | p99 |
|---|---|---|---|---|---|---|
| 1Q, 1P, 5C | 20,000 | 100,000 | 0.6 | 0.7 | 0.8 | 3.0 |
| 1Q, 1P, 5C | 30,000 | 150,000 | 1.4 | 2.4 | 5.2 | 15.1 |

Broadcast Routing Strategy

| Scenario | Total Produce Rate (msgs/sec) | Total Consume Rate (msgs/sec) | Median (p50) | Average | p90 | p99 |
|---|---|---|---|---|---|---|
| 1Q, 1P, 1C | 160,000 | 160,000 | 2.0 | 2.0 | 2.3 | 2.9 |
| 1Q, 1P, 5C | 110,000 | 550,000 | 1.7 | 1.8 | 1.9 | 4.0 |
| 1Q, 1P, 10C | 20,000 | 200,000 | 0.6 | 0.6 | 0.9 | 1.2 |