Understand, Set Up and Test a Redis Cluster in less than 10 minutes

Redis Cluster

You already know how to set up a Redis Server in a few seconds by following my advice in this post; now it’s time to think big and learn how to scale a Redis Server. Let’s start with a quote from the documentation.

Redis is, mostly, a single-threaded server from the POV of commands execution (actually modern versions of Redis use threads for different things). It is not designed to benefit from multiple CPU cores. People are supposed to launch several Redis instances to scale out on several cores if needed.

So in a Redis Server, all the data is stored in the memory space of the process itself and commands are executed by a single core. To take advantage of multiple cores (on the same machine or on other machines), the dataset must be divided into multiple pieces, each one managed by the processing capacity of a single core. This is possible with Redis Cluster. This page explains what a Redis Cluster is, and how to administer it. In a nutshell, here are a few concepts that summarize it.

Redis Cluster in a Nutshell

  1. Each Redis instance that owns data in a Redis Cluster is a master shard (in red, in the picture), and a Redis Cluster is composed of several master shards (the Redis documentation calls for at least three in a cluster that works as expected). An odd number of master shards is recommended so that a majority of shards is still available after a single failure. Decisions are taken by the majority of master shards, and there is no such thing as an arbitrator.
  2. It is possible to configure one or more replica shards (in grey, in the picture), each replicating from its corresponding master shard. A replica shard can be promoted to master shard if its master shard becomes unavailable.
  3. In a Redis Cluster, the data space is partitioned into 16384 hash slots (16384 is hardcoded), where a hash slot is a logical container storing key/value pairs.
  4. With a single shard, that shard stores all 16384 hash slots and therefore all the key/value pairs. Adding a new shard to a Redis Cluster means starting a new Redis Server, adding it to the cluster, and redistributing the hash slots across the shards. This operation is called resharding.
  5. Clients connect to any instance of the cluster and learn the mapping between slots, shards and IP addresses. Using a hash function, a client that needs to operate on a key computes which slot stores the key and, using the mapping, connects to the right shard.
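
The hash function mentioned in point 5 can be sketched in a few lines of Python. According to the Redis Cluster specification, the slot is the CRC16 (XMODEM variant) of the key modulo 16384, and when the key contains a non-empty {...} hash tag only the part between the braces is hashed. The name key_slot is my own (it is not a Redis API), and the sketch relies on Python's binascii.crc_hqx, called with an initial value of 0, to compute that CRC16 variant.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots, hardcoded in Redis Cluster


def key_slot(key: str) -> int:
    """Return the hash slot of a key (illustrative helper, not a Redis API)."""
    data = key.encode("utf-8")
    # Hash tag rule: if the key contains "{...}" with at least one character
    # between the first "{" and the next "}", hash only that part.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            data = data[start + 1:end]
    # CRC16 (XMODEM: polynomial 0x1021, initial value 0) modulo 16384.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


print(key_slot("hello"))
```

For the key hello this should print 866, the same slot that CLUSTER KEYSLOT reports for it later in this post.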

Start a Redis Cluster

To start a Redis Cluster, download and compile Redis as usual, then run the helper script create-cluster to start the instances and initialize a Redis Cluster with three master shards and three replica shards.

There are several ways to install Redis, but here I chose the source code package because it comes with the create-cluster script. If you already have Redis installed, you can just download the tar.gz and extract the script.

wget https://download.redis.io/releases/redis-6.2.6.tar.gz
tar xzf redis-6.2.6.tar.gz
cd redis-6.2.6
make -j4
cd utils/create-cluster
./create-cluster start
./create-cluster create -f

In particular, this is the log produced by the last two commands:

bash-3.2$ ./create-cluster start
Starting 30001
Starting 30002
Starting 30003
Starting 30004
Starting 30005
Starting 30006
bash-3.2$ ./create-cluster create -f
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 127.0.0.1:30005 to 127.0.0.1:30001
Adding replica 127.0.0.1:30006 to 127.0.0.1:30002
Adding replica 127.0.0.1:30004 to 127.0.0.1:30003
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: 685a75d2e26062231d4d8500747be44c64e8de9c 127.0.0.1:30001
   slots:[0-5460] (5461 slots) master
M: 42bfb59288e424124ee6942762dbb9047f93729c 127.0.0.1:30002
   slots:[5461-10922] (5462 slots) master
M: d7784261adb1df483605ccda5a07a876d4c18ebe 127.0.0.1:30003
   slots:[10923-16383] (5461 slots) master
S: b09ead29ce667b3be129af893b11617c65f31cda 127.0.0.1:30004
   replicates 685a75d2e26062231d4d8500747be44c64e8de9c
S: 8f9a0d90cf21ba35420ea7c0b7cc4317ad66c9f5 127.0.0.1:30005
   replicates 42bfb59288e424124ee6942762dbb9047f93729c
S: 3f6a47da15503bac7abb31f70a02c324df8d2248 127.0.0.1:30006
   replicates d7784261adb1df483605ccda5a07a876d4c18ebe
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 127.0.0.1:30001)
M: 685a75d2e26062231d4d8500747be44c64e8de9c 127.0.0.1:30001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 42bfb59288e424124ee6942762dbb9047f93729c 127.0.0.1:30002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 3f6a47da15503bac7abb31f70a02c324df8d2248 127.0.0.1:30006
   slots: (0 slots) slave
   replicates d7784261adb1df483605ccda5a07a876d4c18ebe
M: d7784261adb1df483605ccda5a07a876d4c18ebe 127.0.0.1:30003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: b09ead29ce667b3be129af893b11617c65f31cda 127.0.0.1:30004
   slots: (0 slots) slave
   replicates 685a75d2e26062231d4d8500747be44c64e8de9c
S: 8f9a0d90cf21ba35420ea7c0b7cc4317ad66c9f5 127.0.0.1:30005
   slots: (0 slots) slave
   replicates 42bfb59288e424124ee6942762dbb9047f93729c
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

The following things have happened:

  1. The script creates 6 shards: 3 master shards and their corresponding 3 replica shards. All these shards run locally but on different ports.
  2. The 16384 hash slots are distributed across the 3 master shards, and each replica shard stores a full copy of the slots of its master. Remember that a hash slot is a logical thing: it is a concept rather than a physical entity.
  3. The message [WARNING] Some slaves are in the same host as their master points out that running a master shard and its replica shard on the same host is a bad idea: if you lose the host, you lose the whole shard, so a subset of your slots, and the data stored in them, becomes unavailable.
  4. The script reports the mapping between slots, shards and IP addresses.
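
The ranges in the log (0-5460, 5461-10922, 10923-16383) come from splitting the 16384 slots as evenly as possible across the masters. Here is a minimal sketch of such a split. The function split_slots is hypothetical: it is only meant to reproduce the ranges shown above, not to mirror the exact code that performs the allocation.

```python
from typing import List, Tuple

HASH_SLOTS = 16384


def split_slots(masters: int) -> List[Tuple[int, int]]:
    """Split the slot space into contiguous (first, last) ranges, one per master."""
    per_master = HASH_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        last = round(cursor + per_master - 1)
        # The last master always takes whatever is left.
        if i == masters - 1 or last > HASH_SLOTS - 1:
            last = HASH_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


for index, (first, last) in enumerate(split_slots(3)):
    print(f"Master[{index}] -> Slots {first} - {last} ({last - first + 1} slots)")
```

Since 16384 is not divisible by 3, one master ends up with one slot more than the others (5462 instead of 5461), which matches the log.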

Test a Redis Cluster

Connecting to an instance and learning about the topology of the cluster is easier to do than to explain: CLUSTER SLOTS reports the IP and port of each master shard and of its replica shards, together with the slots they cover. Let’s connect to an arbitrary instance of the cluster.

bash-3.2$ redis-cli -p 30001
127.0.0.1:30001> CLUSTER SLOTS
1) 1) (integer) 0
   2) (integer) 5460
   3) 1) "127.0.0.1"
      2) (integer) 30001
      3) "685a75d2e26062231d4d8500747be44c64e8de9c"
   4) 1) "127.0.0.1"
      2) (integer) 30004
      3) "b09ead29ce667b3be129af893b11617c65f31cda"
2) 1) (integer) 5461
   2) (integer) 10922
   3) 1) "127.0.0.1"
      2) (integer) 30002
      3) "42bfb59288e424124ee6942762dbb9047f93729c"
   4) 1) "127.0.0.1"
      2) (integer) 30005
      3) "8f9a0d90cf21ba35420ea7c0b7cc4317ad66c9f5"
3) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 30003
      3) "d7784261adb1df483605ccda5a07a876d4c18ebe"
   4) 1) "127.0.0.1"
      2) (integer) 30006
      3) "3f6a47da15503bac7abb31f70a02c324df8d2248"
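
A cluster-aware client keeps this reply as a routing table. The sketch below hardcodes the ranges and ports from the output above (a real client would obtain them by sending CLUSTER SLOTS) and uses a binary search to find the shard that serves a slot. SLOT_TABLE and node_for_slot are illustrative names, not part of any client library.

```python
import bisect

# (first slot, last slot, master port, replica port), copied from the output above.
SLOT_TABLE = [
    (0, 5460, 30001, 30004),
    (5461, 10922, 30002, 30005),
    (10923, 16383, 30003, 30006),
]
RANGE_STARTS = [entry[0] for entry in SLOT_TABLE]


def node_for_slot(slot: int, replica: bool = False) -> int:
    """Return the port of the master (or replica) shard serving the given slot."""
    if not 0 <= slot <= 16383:
        raise ValueError(f"slot out of range: {slot}")
    # Find the last range whose first slot is <= slot.
    index = bisect.bisect_right(RANGE_STARTS, slot) - 1
    first, last, master_port, replica_port = SLOT_TABLE[index]
    return replica_port if replica else master_port


print(node_for_slot(866))                # master port for slot 866
print(node_for_slot(866, replica=True))  # its replica port
```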

Now let’s store a key, and discover which hash slot it is associated with:

127.0.0.1:30001> SET hello world
OK
127.0.0.1:30001> GET hello
"world"
127.0.0.1:30001> CLUSTER KEYSLOT hello
(integer) 866

So far so good: hash slot 866 is in the first shard (which stores the keys in slots 0 to 5460), and I connected to the right shard. So I should also be able to connect to the corresponding replica shard and read that value. Let’s try it: from the CLUSTER SLOTS output, the replica is on port 30004.

bash-3.2$ redis-cli -p 30004
127.0.0.1:30004> GET hello
(error) MOVED 866 127.0.0.1:30001

Oh! MOVED informs me that I need to operate on the master shard at 127.0.0.1:30001, and that is fair: I really do not want to make changes to data in a replica. But in this case I just want to read from a replica, for example to scale my reads, so I will send the READONLY command first. Read more here.

127.0.0.1:30004> READONLY
OK
127.0.0.1:30004> GET hello
"world"

Perfect. What if I connect to the wrong master shard, instead?

bash-3.2$ redis-cli -p 30002
127.0.0.1:30002> GET hello
(error) MOVED 866 127.0.0.1:30001

Good, I am informed of the location of the master shard for this key. A smart client would have known which shard to connect to (remember that a Redis client that supports Redis Cluster applies the hash function to the key, much like CLUSTER KEYSLOT does, to learn which slot, and therefore which shard, stores the key before operating on it).
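
How a client reacts to a redirection can be sketched as follows. The error text has the form MOVED <slot> <host>:<port>, as in the sessions above. parse_moved and Redirect are hypothetical names, and a real client would also update its slot table and retry the command on the indicated shard.

```python
from typing import NamedTuple


class Redirect(NamedTuple):
    slot: int
    host: str
    port: int


def parse_moved(error: str) -> Redirect:
    """Parse a 'MOVED <slot> <host>:<port>' error message."""
    kind, slot, address = error.split()
    if kind != "MOVED":
        raise ValueError(f"not a MOVED redirection: {error!r}")
    # Split on the last colon so the host part may itself contain colons.
    host, _, port = address.rpartition(":")
    return Redirect(int(slot), host, int(port))


redirect = parse_moved("MOVED 866 127.0.0.1:30001")
print(redirect.slot, redirect.host, redirect.port)
```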

Find a few Redis Cluster-enabled clients here.

Redis Enterprise Cluster

At this point you may have heard about Redis Enterprise Cluster and you may wonder if it is an extended version of the Redis Cluster I have just explained… and the answer is: yes! The Redis Cluster I have explained here is the open source version of the cluster, fully based on the Redis open source code, AKA Redis OSS Cluster. Redis Enterprise Cluster has its own concept and architecture: it inherits features from the Redis OSS Cluster (with data structures and commands that are fully compatible) but encapsulates those features and concepts in a fully fledged architecture. An introduction is here.

Summary

In this post you have learned a few concepts:

  • How running Redis as a standalone instance does not scale to multiple cores
  • How to scale to multiple cores with Redis Cluster
  • Data distribution in the cluster
  • The steps to set up a simple local cluster
  • And how to connect to the cluster and understand the topology

There are many other things that you may want to test in this cluster, such as failover, replica migration, resharding, limitations when using MULTI/EXEC, and so on. I will probably cover these topics in the future, but for the time being, I invite you to read this tutorial and the specifications.

Watch this video tutorial to have these concepts summarized!

Redis Cluster tutorial on YouTube

Redis is a registered trademark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by mortensi is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and mortensi.
