Benchmarking Redis with SystemsLab

In this tutorial we'll write a somewhat straightforward benchmark of redis using SystemsLab. This tutorial is not designed to show you how to benchmark redis well, and at the end we will go over the flaws in the benchmark setup. Rather, it is designed to introduce you to the concepts you will need to write your own experiments using SystemsLab.

Prerequisites

In order to actually do this example you will need:

  • The SystemsLab CLI installed locally
  • A SystemsLab server that you can use
  • Two agents registered with the server and available to use for running tests
    • One should have redis installed, but not running as we'll do that as part of the test.
    • The other should have rpc-perf installed so we can use it as the loadgen.

Throughout this example, I will assume that the redis agent has a redis tag on it and that the rpc-perf host has a loadgen tag. Make sure you get this right (or substitute the appropriate tags for your setup); otherwise it'll look like your experiment is queued forever.

Getting Started

For this example, we're going to start off by writing a fairly straightforward benchmark of a redis instance. We are going to start a redis instance on one machine and then start a loadgen on another machine that sends traffic to the redis instance. The loadgen will record some metrics and SystemsLab will record a set of common metrics on all the involved machines, which we can look at once the experiment has completed.

Before we start running anything, there's a small amount of setup you will want to do to make things a little easier. The systemslab CLI allows you to create a config file that tells it where to find the systemslab server to talk to, so you'll want to create a .config/systemslab.toml file in your current directory (or any parent directory) with the following contents, replacing <url> with the appropriate URL:

toml
systemslab_url = "<url>"

INFO

If you would rather not create a config file, you can pass the URL directly to any systemslab command by adding --systemslab-url <url> to the command arguments.

Now that we have that in place, let's quickly note down some terminology. In SystemsLab, the thing you will be creating is an experiment. An experiment is roughly meant to correspond to a single test case or benchmark. Within the experiment, there are one or more jobs, each of which is a sequence of commands that get executed on a single host.

Here, we are going to write a single experiment, but it will have two jobs: the redis cache server, and the loadgen machine.

Writing the Experiment Specification

In order to run an experiment in SystemsLab you need to write an experiment specification. This is usually a jsonnet file (though you can use JSON or YAML if you want) that tells the server what needs to be done to run your experiment.

For completeness, the whole experiment specification we are going to write is here. I will walk through the individual parts in the rest of this section. Afterwards, I will be assuming that you've saved this as experiment.jsonnet.

jsonnet
local systemslab = import 'systemslab.libsonnet';

local rpc_perf_config = importstr 'rpc-perf.toml';
local redis_config = importstr 'redis.conf';

{
  name: 'rpc-perf redis example benchmark',
  jobs: {
    // The rpc-perf loadgen job
    rpc_perf: {
      host: {
        tags: ['loadgen']
      },
      steps: [
        systemslab.write_file('loadgen.toml', rpc_perf_config),
        
        // Wait for the cache to start
        systemslab.barrier('redis-start'),

        // Run the loadgen
        systemslab.bash('rpc-perf loadgen.toml'),

        // Indicate to the redis job that we're done and it can exit
        systemslab.barrier('redis-finish'),

        // Upload the rpc-perf output json file
        systemslab.upload_artifact('output.json')
      ]
    },

    // The redis job
    redis: {
      host: {
        tags: ['redis']
      },
      steps: [
        systemslab.write_file('redis.conf', redis_config),

        systemslab.bash(
          'redis-server redis.conf',
          background = true
        ),

        // Give the redis instance a second to start up
        systemslab.bash('sleep 1'),

        // Let the rpc-perf instance know that we have started
        systemslab.barrier('redis-start'),

        // Wait for the rpc-perf instance to finish
        systemslab.barrier('redis-finish'),
      ]
    }
  }
}

Alright, so let's break this config down and see what each of the individual parts does.

INFO

If you're not familiar with jsonnet you will want to read through the jsonnet tutorial, which will walk you through the jsonnet syntax and show you the JSON it gets evaluated to.

You can also mostly think of it as JSON with nicer syntax and functions, but the tutorial is likely a better place to start.
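As a quick illustration of the "JSON with nicer syntax" idea, this snippet uses a local variable and string concatenation, both of which are plain jsonnet features rather than anything SystemsLab-specific:

jsonnet
local name = 'world';
{
  message: 'hello ' + name,
}

It evaluates to the JSON {"message": "hello world"}.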

Setup

There are some bits of the experiment specification that are common to any experiment you'll write with SystemsLab; let's write those out first.

The first thing we do is stick this import at the top:

jsonnet
local systemslab = import 'systemslab.libsonnet';

The systemslab CLI includes some helper functions that make it easier to declare some common step definitions. If you want to see what's available, you can find the file under /usr/share/systemslab if installed via APT/RPM, or under <homebrew root>/share/systemslab if installed via Homebrew/Linuxbrew.

And now let's declare the skeleton of a systemslab job:

jsonnet
{
  name: 'rpc-perf redis example benchmark',
  jobs: {}
}

This gives the experiment a name. Now we can start writing out our jobs!

Redis

To start off we are going to use the following redis config. It will not be explained as part of this guide, but if you want to actually benchmark redis then you'll want to understand what the config file does (and also check out the notes at the end of this guide on how to properly split warmup and benchmarking).

So, without further ado, we'll save the following as redis.conf in the same directory as the experiment.jsonnet we are writing.

txt
# listen on all available interfaces
bind * -::*
# listen on port 6397
port 6397

# allow anyone to connect with no password (note that THIS IS VERY INSECURE!)
protected-mode no
# allow up to 511 connections to queue up before dropping them
tcp-backlog 511
# do not close connections when idle
timeout 0
# send a TCP keepalive after 300s of the connection being idle
tcp-keepalive 300
daemonize no
pidfile /var/run/redis_6379.pid

loglevel notice
# disable writing logs out to a file
logfile ""

databases 16
# prevent log clutter by disabling the redis logo
always-show-logo no
# make redis modify the process title, mostly for debuggability
set-proc-title yes

# disable snapshotting completely
save ""
# set the working directory to the temporary test directory
dir ./

# increase max clients so the loadgen can establish enough connections
maxclients 32768
# 4GiB
maxmemory 4294967296
# evict keys randomly
maxmemory-policy allkeys-random

Next, we'll need to import the config file we just wrote. We can do this by using importstr:

jsonnet
local redis_config = importstr 'redis.conf';
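As an aside, import and importstr behave differently: import evaluates the target file as jsonnet code, while importstr reads the file verbatim into a string. A config file like redis.conf is not valid jsonnet, so it has to be pulled in with importstr:

jsonnet
local as_text = importstr 'redis.conf';         // the file contents, as a string
local as_code = import 'systemslab.libsonnet';  // the file, evaluated as jsonnet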

Now, let's write out a basic job spec, which we'll put in the jobs field of the experiment spec:

jsonnet
redis: {
  host: {
    tags: ['redis'] // only run on the host that has redis installed
  },
  steps: [
    // Write out the redis config
    systemslab.write_file('redis.conf', redis_config),

    // Run redis for 5 minutes
    systemslab.bash('timeout 300 redis-server redis.conf')
  ]
}

The fragment above looks like it should work. However, if you compare it with the spec at the start, you will notice that it's different. This version will kinda-sorta work sometimes, but it is not what you want. Why is that? Let's dig in a little bit.

INFO

Synchronization in SystemsLab

A SystemsLab experiment is made up of multiple jobs. When SystemsLab starts running an experiment, it tries to start all the jobs at roughly the same time. Inevitably, however, some are going to start earlier than others and, even then, trying to synchronize different jobs using sleep calls is just not going to work.

SystemsLab provides barriers to allow you to synchronize different jobs. You can use one by putting the following in your experiment spec:

jsonnet
systemslab.barrier('<barrier name here>')

Barriers work just like they do in multithreaded programming: jobs wait at the barrier until all jobs have reached the barrier, and then they all continue on. This means that you can know that all steps before the barrier have already occurred.
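As a minimal sketch (the job names, tags, and barrier name here are made up), two jobs rendezvous whenever each declares a barrier step with the same name:

jsonnet
local systemslab = import 'systemslab.libsonnet';

{
  name: 'barrier example',
  jobs: {
    a: {
      host: { tags: ['a'] },
      steps: [systemslab.barrier('rendezvous')],
    },
    b: {
      host: { tags: ['b'] },
      steps: [systemslab.barrier('rendezvous')],
    },
  },
}

Neither job proceeds past its barrier step until the other job has also reached it.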

The thing we are missing here for our job description is synchronization. The info box above has most likely clued you into this. We haven't written the rpc-perf job yet, but we know we want things to happen in the following order:

  1. (redis) Start the redis server.
  2. (loadgen) Run rpc-perf against the redis server.
  3. (redis) Shut down the redis server.

However, we need to ensure that things actually happen in this order, which means using barriers to synchronize between the two jobs. The two barriers we are going to use here are:

  • redis-start - The redis server has been started and is ready to accept connections.
  • redis-finish - The loadgen has finished and the redis server can now be shut down.

These match up exactly with the transitions 1 -> 2 and 2 -> 3.

With that, let's rewrite our redis job spec:

jsonnet
redis: {
  host: {
    tags: ['redis'] // only run on the host that has redis installed
  },
  steps: [
    // Write out the redis config
    systemslab.write_file('redis.conf', redis_config),

    // Start the server
    systemslab.bash(
      'redis-server redis.conf',
      // This means that we don't wait for this step to complete before
      // continuing on to the next one.
      //
      // Note that it comes with some caveats, if you reach the end of
      // the job with background steps still running then the job will
      // complete and those background steps will be killed. In this
      // case, that's exactly what we want.
      background = true
    ),

    // Sleep for 1s so that the redis server has a chance to start.
    systemslab.bash('sleep 1'),

    // Tell the rpc-perf job that the server is ready to go.
    systemslab.barrier('redis-start'),

    // Now wait for the rpc-perf job to finish.
    systemslab.barrier('redis-finish'),
  ]
}

This is basically what I said we would do; however, there are a few changes to accommodate the fact that we need to run the redis server and the barriers at the same time. We mark the step starting the redis server with background = true so that it runs in the background for the rest of the job. Once we reach the end of the list of steps, all remaining background tasks will be killed. In our case, that's exactly what we want.
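The fixed sleep 1 is simple but fragile: if redis takes longer than a second to start, the loadgen will begin sending traffic to a server that isn't listening yet. A more robust option, sketched below under the assumption that redis-cli is installed on the redis host, is to poll the server until it answers a PING before hitting the barrier:

jsonnet
// Poll until the server responds, instead of sleeping for a fixed time.
systemslab.bash('until redis-cli -p 6397 ping; do sleep 0.1; done'),

This step would replace the sleep 1 step in the redis job.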

Loadgen

We've written the redis job. Now we need to write the loadgen job. For this, we will be using rpc-perf since it is fairly easy to set it up to send traffic to redis.

The first thing we will need is, once again, a config file. We'll save the following as rpc-perf.toml in the current directory.

toml
[general]
protocol = 'resp'   # use the redis protocol
interval = 1        # print stats every second
duration = 300      # run for 5 minutes
ratelimit = 50000   # send 50k requests per second
json_output = 'output.json' # also write stats to output.json
initial_seed = '0'  # make the random numbers used here deterministic

[debug]             # logging setup
log_level = 'info'
log_backup = 'rpc-perf.log.old'
log_max_size = 536870912 # 512 MiB

[target]
endpoints = ["redis.systemslab.internal:6397"]

[clients]
threads = 13       # use 13 threads, adjust this to the size of your loadgen host
poolsize = 1000    # establish 1000 connections with redis
connect_timeout = 1000 # consider a connection to have failed if it does not complete within 1s
request_timeout = 1000 # consider a request to have failed if it does not complete within 1s
read_buffer_size = 8192
write_buffer_size = 8192

[workload]
threads = 1
ratelimit = 1000
strict_ratelimit = true

# This configures rpc-perf to generate requests with 32B keys and 1KiB
# values, and a 9:1 read:write ratio.
[workload.keyspace]
weight = 1
klen = 32
nkeys = 50000
vkind = 'bytes'
vlen = 1024

[[workload.keyspace.commands]]
verb = 'get'
weight = 90

[[workload.keyspace.commands]]
verb = 'set'
weight = 10

One important thing to note is that the endpoint is set to redis.systemslab.internal.

INFO

Address Lookup in SystemsLab

SystemsLab provides two ways to look up the address of other hosts running jobs in the same experiment:

  1. An environment variable <job name>_ADDR is defined for each job. For this experiment, that means you would have RPC_PERF_ADDR and REDIS_ADDR. Note that these are only defined while the job is running and not as part of jsonnet evaluation, so you will need to use sed or similar to edit files at runtime.
  2. If the systemslab-nss package is installed then you can access other jobs via DNS names at <job name>.systemslab.internal.

Note that systemslab-nss is a Recommends dependency in the SystemsLab debian packages, so it may not be present if your system is not configured to install Recommends dependencies. In that case you will need to use the environment variables to figure out the addresses to jobs.
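For example, if systemslab-nss is not available, a step like the following sketch could rewrite the endpoint in the rpc-perf config at runtime using the environment variable (this assumes REDIS_ADDR contains just the host address, so the :6397 port in the config is preserved; check how your SystemsLab version formats the variable):

jsonnet
// Substitute the DNS name with the address from the REDIS_ADDR
// environment variable before starting the loadgen.
systemslab.bash('sed -i "s/redis.systemslab.internal/$REDIS_ADDR/" loadgen.toml'),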

Like the redis config, we will need to import this so we can use it in the job spec. We can do this again with one more importstr:

jsonnet
local rpc_perf_config = importstr 'rpc-perf.toml';

Now that we have our config file we can write our job description. We've covered most of this already, so I'll just list it here:

jsonnet
rpc_perf: {
  host: {
    tags: ['loadgen']
  },
  steps: [
    systemslab.write_file('loadgen.toml', rpc_perf_config),
    
    // Wait for the cache to start
    systemslab.barrier('redis-start'),

    // Run the loadgen
    systemslab.bash('rpc-perf loadgen.toml'),

    // Indicate to the redis job that we're done and it can exit
    systemslab.barrier('redis-finish'),

    // Upload the rpc-perf output json file
    systemslab.upload_artifact('output.json')
  ]
}

There's only one thing we haven't seen here and that's this line at the end:

jsonnet
systemslab.upload_artifact('output.json')

All this does is upload that file so that it can be viewed later. It will show up in the artifacts list on the experiment page.

Submitting the Experiment

We've spent a bunch of time writing an experiment. Now let's run it! You can submit it by running

bash
systemslab submit experiment.jsonnet

This will print out a link that you can go to to see the experiment page. It will show the current state of the experiment, the job logs, and, once the experiment completes, a report that lets you look at the metrics recorded while the experiment was running.

Properly Benchmarking Redis

At the start, I mentioned that we were not going to write a good benchmark of redis. This section is going to look a bit at what we would need to do better to actually get useful numbers out of this benchmark.

  • Warmup: The benchmark we've written basically just starts benchmarking immediately. This means that we start off measuring a completely empty cache. Unless your production setup has some serious problems, or you are benchmarking the performance of a cold cache start, this is not likely to be representative of normal cache workloads. To do this properly you will likely want to load some data into the cache first.

    This is ultimately the main issue with this benchmark, and what makes it not a good redis benchmark.

  • Workload: A pure get/set workload may not be representative of a typical redis workload. The rpc-perf config here produces a really heavy read skew which may also not represent your use case, though heavy read skew is not uncommon for cache workloads.
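To sketch the warmup fix: the loadgen job could run a preparatory rpc-perf pass before the measured one. The warmup.toml file and its warmup_config import are hypothetical here; it would contain a set-only rpc-perf workload that populates the keyspace:

jsonnet
// Hypothetical set-only rpc-perf config used purely to populate the cache.
local warmup_config = importstr 'warmup.toml';

// ... then, in the rpc_perf job:
steps: [
  systemslab.write_file('loadgen.toml', rpc_perf_config),
  systemslab.write_file('warmup.toml', warmup_config),
  systemslab.barrier('redis-start'),

  // Populate the cache before taking any measurements.
  systemslab.bash('rpc-perf warmup.toml'),

  // The measured run now starts against a warm cache.
  systemslab.bash('rpc-perf loadgen.toml'),

  systemslab.barrier('redis-finish'),
  systemslab.upload_artifact('output.json'),
]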

Beyond that, there are some other things that you might want to investigate:

  • Varying Parameters: The benchmark here fixes a whole bunch of parameters. You may want to vary request rate, number of connections, the sizes of keys and values, or other parameters to see how that changes server performance.
  • Thread Pinning: Pinning redis to specific threads can result in different performance characteristics.
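One way to vary parameters without maintaining several copies of the config is to template it with jsonnet's std.format. This sketch assumes a hypothetical rpc-perf.template.toml that contains a %(ratelimit)d placeholder in place of the hard-coded rate:

jsonnet
local ratelimit = 50000;  // change this between runs
local rpc_perf_config = std.format(
  importstr 'rpc-perf.template.toml',
  { ratelimit: ratelimit },
);

The resulting string can then be passed to systemslab.write_file exactly as before.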

A more "production grade" redis benchmark can be seen in the [systemslab-examples] repository.