ReactPHP

The project

One of our clients is a performance driven media company utilizing exclusive properties and platforms to link brands and consumers through targeted marketing solutions.

Two years ago they commissioned us to develop their new project. They needed a complex solution for automatically serving text, image, video and interactive media advertisements to targeted clients. We chose to build the platform with PHP. Everything functioned fine for some time, but recently the platform gained momentum and began generating several million events daily (impressions, clicks, conversions). These needed to be tracked properly.

The traditional way of doing this proved ineffective. Since Apache + PHP-FPM couldn’t handle all the requests, the response time increased and some of the requests were on the verge of being rejected. For a service of this kind that would have been devastating. We tried several caching mechanisms, different databases, aggregation techniques, etc. However, that was insufficient. We needed something more powerful and swift.

So we tried to find what needed to be improved and what software could tackle the issue. The team has extensive experience with Python and async frameworks, but we wanted something in PHP. We had heard about ReactPHP before but hadn’t had the chance to see it working, so this time we decided to give it a go.

Server configuration

What we needed was a server that could spawn several ReactPHP workers to take advantage of the server’s multi-core CPU. For the best performance we chose PHP 7, with Nginx as a load balancer.

Nginx as a Load-Balancer

It is possible to use Nginx as a very efficient HTTP load balancer which distributes traffic to several application servers and improves the performance, scalability and reliability of our ReactPHP workers.

Our goal is to proxy only those requests that don’t point to a local file. The number of ReactPHP workers should be at least equal to the number of CPU cores. Also, using UDS (Unix Domain Sockets) is preferable to using TCP/IP, because UDS gives roughly 50% lower latency and almost 5x more throughput (source: https://github.com/rigtorp/ipc-bench).

By default, Nginx redefines two header fields in the proxied requests, “Host” and “Connection”, and eliminates the header fields whose values are empty strings. Since we want ReactPHP to receive the host and the remote address properly, we have to set them explicitly in the configuration.

The following configuration can be used for our ReactPHP + Nginx setup:

upstream reactor  {
  server unix:/tmp/reactphp/reactphp.worker1.sock fail_timeout=1;
  server unix:/tmp/reactphp/reactphp.worker2.sock fail_timeout=1;
  server unix:/tmp/reactphp/reactphp.worker3.sock fail_timeout=1;
  server unix:/tmp/reactphp/reactphp.worker4.sock fail_timeout=1;
}
server {
  ...
  real_ip_header X-Forwarded-For;
  real_ip_recursive on;
  location / {
    proxy_set_header  Host $host;
    proxy_set_header  X-Real-IP $remote_addr;
    proxy_set_header  X-Forwarded-Proto https;
    proxy_set_header  X-Forwarded $remote_addr;
    proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header  X-Forwarded-Host $host;
    if (!-f $request_filename) {
      proxy_pass http://reactor;
      break;
    }
    try_files $uri $uri/ /index.php?$query_string;
  }
  ...
}
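
Since the upstream block has to list one socket per worker, it can be generated instead of edited by hand. The following is only a sketch: the function name buildUpstream is hypothetical, and the socket path pattern and fail_timeout value are simply copied from the configuration above.

```php
<?php
declare(strict_types=1);

/**
 * Build an Nginx "upstream" block with one Unix-socket server per worker.
 * Hypothetical helper; path pattern and fail_timeout mirror the config above.
 */
function buildUpstream(int $workers, string $name = 'reactor'): string
{
    if ($workers < 1) {
        throw new InvalidArgumentException('At least one worker is required');
    }

    $lines = ["upstream {$name} {"];
    for ($i = 1; $i <= $workers; $i++) {
        $lines[] = sprintf(
            '  server unix:/tmp/reactphp/reactphp.worker%d.sock fail_timeout=1;',
            $i
        );
    }
    $lines[] = '}';

    return implode("\n", $lines) . "\n";
}

// Print a block for four workers, e.g. one per CPU core.
echo buildUpstream(4);
```

The output can be written to an include file that the main Nginx configuration pulls in, so changing the number of workers means regenerating one file.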

ReactPHP as HTTP server

To start using ReactPHP as an HTTP server, or in our case as a worker, we can use its HTTP component and require it with Composer:

composer require react/http:^0.8.1

Once the component is installed, we can create a simple server.php file which will start our workers.

/** Start the project via bootstrap (the router that creates $server lives there) **/
require_once 'bootstrap.php';

$options = getopt('', array('worker:'));

if (isset($options['worker'])) {
  $number = $options['worker'];
} else {
  die('use with --worker [number]');
}

# Remove the previous socket file if it exists
$workerUDS = sprintf('/tmp/reactphp/reactphp.worker%s.sock', $number);
if (file_exists($workerUDS)) {
  unlink($workerUDS);
}

# Create the event loop and listen on the worker's Unix domain socket
$loop = React\EventLoop\Factory::create();
$socket = new React\Socket\UnixServer($workerUDS, $loop);
$server->listen($socket);

$loop->run();

Reading from the last line up, we have:

$loop->run() – this makes our worker run inside an infinite loop (that's how long-running processes work)
$server->listen($socket) – this attaches the HTTP server to the socket, which in our case is a Unix domain socket rather than a TCP port (that's how servers accept connections)

In order to run several workers, the script expects a --worker parameter which indicates the number of the worker. We can have as many workers as we want (1, 2, 3, … 10, and so on); we just need to add them to the Nginx upstream setting.
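
Because the worker number ends up inside a filesystem path that the script later unlinks, it is worth validating it first. Here is a minimal sketch of such a check; workerSocketPath is a hypothetical helper of ours and not part of ReactPHP.

```php
<?php
declare(strict_types=1);

/**
 * Turn the raw --worker value into a socket path, rejecting anything
 * that is not a positive integer (hypothetical helper, not part of ReactPHP).
 */
function workerSocketPath(string $rawNumber): string
{
    $number = filter_var(
        $rawNumber,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 1]]
    );
    if ($number === false) {
        throw new InvalidArgumentException("Invalid worker number: {$rawNumber}");
    }

    return sprintf('/tmp/reactphp/reactphp.worker%d.sock', $number);
}

echo workerSocketPath('3'), PHP_EOL; // /tmp/reactphp/reactphp.worker3.sock
```

With this in place, a value such as "../etc" throws an exception before any file is touched.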

The script uses our custom bootstrap. Inside there is a piece of code for the router that will handle each HTTP request. For example we can create a simple “Ping? Pong!” request like this:

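use Psr\Http\Message\ServerRequestInterface;
use React\Http\Response;
use React\Http\Server;

$server = new Server(function (ServerRequestInterface $request) {
  # Prepare needed parameters
  $method = $request->getMethod();
  $path = $request->getUri()->getPath();
  # Default response for routes we don't handle
  $response = [
    'code' => 404,
    'headers' => ['Content-Type' => 'text/plain'],
    'body' => 'Not found'
  ];
  switch ($method) {
    # Handle GET requests
    case 'GET':
      switch ($path) {
        ### Ping? Pong! ###
        case '/ping/':
          $response = [
            'code' => 200,
            'headers' => ['Content-Type' => 'text/html'],
            'body' => 'Pong!'
          ];
          break;
      }
      break;
  }
  return new Response(
    $response['code'],
    $response['headers'],
    $response['body']
  );
});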

We can run one of the workers like this:

php server.php --worker 1

Finally, we can visit the page at http://localhost/ping/ and see the message Pong!
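
As the number of routes grows, nested switch statements become hard to follow. One possible alternative, shown here only as a sketch, is a route table that maps method and path to a handler. The route function and the $routes array are illustrative names, and the code deliberately uses plain arrays so that it runs without ReactPHP installed.

```php
<?php
declare(strict_types=1);

/**
 * Resolve a method and path to a [code, headers, body] triple.
 * Unknown routes fall back to 404 instead of leaving the response undefined.
 */
function route(array $routes, string $method, string $path): array
{
    $handler = $routes[$method][$path] ?? null;
    if ($handler === null) {
        return [404, ['Content-Type' => 'text/plain'], 'Not found'];
    }

    return $handler();
}

// Route table: HTTP method => path => handler (all names are illustrative).
$routes = [
    'GET' => [
        '/ping/' => function (): array {
            return [200, ['Content-Type' => 'text/html'], 'Pong!'];
        },
    ],
];

[$code, , $body] = route($routes, 'GET', '/ping/');
echo $code, ' ', $body, PHP_EOL; // 200 Pong!
```

Inside the server callback, the three values of the triple would be passed to the Response constructor in the same order as in the example above.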

Benchmark

The code was rewritten and optimized to work with ReactPHP, and after that we ran a few benchmark tests on one of our local machines (4 cores, 4 GB RAM).

All requests simulate real requests with the functionality we needed – we have DB reading and writing, writing to a log file, Memcached requests and so on. So here are the results:

Nginx + PHP-FPM - 1000 requests with 30 connections (concurrency level)

Nginx + ReactPHP - 1000 requests with 30 connections (concurrency level)

Nginx + PHP-FPM - 2000 requests with 100 connections (concurrency level)

Nginx + ReactPHP - 2000 requests with 100 connections (concurrency level)

Nginx + ReactPHP - 2000 requests with 200 connections (concurrency level)

Nginx + ReactPHP - 5000 requests with 500 connections (concurrency level)

It’s evident from the graphs how much faster ReactPHP is. The regular PHP-FPM server started to drop connections when we set the concurrency level to 200: 1698 requests were rejected! So the local test server couldn't handle 200 parallel requests at the same time.

On the other hand on the same server ReactPHP handled 200 and even 500 parallel connections and processed all 2000/5000 requests without any problem!

As we can see, the ReactPHP server with Nginx as a load balancer is 10-15 times faster than old-school PHP-FPM and can handle up to 10 times as many requests without any problem. This means we have a dramatic performance increase with our Nginx + ReactPHP application, so it is well worth a try.

Why does the performance increase?

In order to explain the reason behind this increased performance with ReactPHP, we have to take a look at how things usually work. In a normal stack like Nginx / PHP-FPM, for each HTTP request:

an HTTP server receives the request;
it hands the request to a PHP process that starts from a clean state (PHP-FPM keeps a pool of worker processes, but each request still begins with nothing left over from the previous one); superglobals like $_GET, $_POST, $_SERVER, etc. are created using the data from the request;
the PHP process executes our code and returns the output;
the HTTP server uses the output to create a response, and everything the PHP process built up for the request is thrown away.

In this scenario, we don't have to worry too much about the following:

each request starts with fresh, empty memory, which is freed once the request is done (so there are no memory leaks)
a crash while handling one request won't affect other requests
static and global variables are not shared between requests
each request runs the latest version of the code

The traditional architecture is "shared-nothing". That means tearing down the PHP state once the response is sent and sharing nothing between two requests.

The biggest disadvantage of such a setup is low performance, because handling each HTTP request from scratch means executing the following steps (the bootstrap footprint):

starting a process;
starting PHP while loading configuration, starting extensions, etc;
starting the application while loading configuration, initializing services, autoloading, etc.

By contrast, with ReactPHP we keep our application running between requests, so we execute this bootstrap only once, upon starting the server; the footprint is absent from the requests.

However, this comes at a price: now we're exposed to growing memory consumption, fatal errors, statefulness, code update worries and the need to keep connections of any kind alive (like MySQL, PostgreSQL, Memcached), so we have to be very careful and handle all these issues on our own.
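
The statefulness problem is easy to reproduce in plain PHP. The sketch below uses a made-up VisitCounter class and a loop that stands in for three requests served by the same long-running worker. In a shared-nothing setup every call would return 1; here the static property survives between "requests" and keeps growing.

```php
<?php
declare(strict_types=1);

/** A handler with hidden state: harmless per request, a bug in a long-running worker. */
final class VisitCounter
{
    /** @var string[] Grows for as long as the process lives. */
    private static $seen = [];

    public static function handle(string $userId): int
    {
        self::$seen[] = $userId; // never cleared between requests

        return count(self::$seen);
    }
}

// Simulate three requests served by the same long-running process.
foreach (['alice', 'bob', 'carol'] as $user) {
    echo $user, ' => ', VisitCounter::handle($user), PHP_EOL;
}
// alice => 1
// bob => 2
// carol => 3
```

The same mechanism explains both the data leaking from one request into the next and the slowly growing memory usage, which is why request data should live in local variables or be reset explicitly after each response.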

Going to production with ReactPHP

Now that we are ready to go with our application as an HTTP server with multiple workers we need to execute a couple of additional tasks: we have to make the code stateless and we need to restart the workers on each code update.

We created a bash script that takes care of our workers, so they can be managed easily and run in the background. We call it server.sh:

#!/bin/bash

if [ "$#" -ne 2 ];then
  echo "Usage: $(basename $0) start|stop|kill|restart|status [worker-number]"
  exit 1
fi

APP_NAME=server.php
LOG_FILE=private/logs/reactphp/server.worker$2.log

pgrep -u $USER -f "$APP_NAME --worker $2$" > /dev/null #(-l will list the full command with args)
RUNNING="$?"
MY_PID=$(pgrep -u $USER -f "$APP_NAME --worker $2$")

case "$1" in
  start)
    if [ "$RUNNING" -eq 1 ]; then
      echo -e "Worker is not running! Starting it..."
      nohup php $APP_NAME --worker $2 > $LOG_FILE 2>&1 &
    else
      echo "Worker is running. Processes:"
      pgrep -u $USER -f "$APP_NAME --worker $2$" -l
    fi
    ;;
  stop)
    if [ "$RUNNING" -eq 1 ]; then
      echo -e "Worker is not running; nothing to stop!"
    else
      echo -e "Worker is running; killing it!"
      kill -15 $MY_PID
    fi
  ;;
  kill)
    if [ "$RUNNING" -eq 1 ]; then
      echo -e "Worker is not running; nothing to kill!"
    else
      echo -e "Worker is running; killing it immediately!"
      # SIGKILL cannot be caught, so the worker stops without a graceful shutdown
      kill -9 $MY_PID
    fi
  ;;
  status)
    if [ "$RUNNING" -eq 1 ]; then
      echo -e "Worker $2 is not running!"
    else
      echo "Worker is running. Processes:"
      pgrep -u $USER -f "$APP_NAME --worker $2$" -l
    fi
  ;;
  restart)
    if [ "$RUNNING" -eq 1 ]; then
      echo -e "Worker is not running; starting it!"
      nohup php $APP_NAME --worker $2 > $LOG_FILE 2>&1 &
    else
      echo -e "Worker is running; restarting it!"
      kill -15 $MY_PID
      nohup php $APP_NAME --worker $2 > $LOG_FILE 2>&1 &
    fi
  ;;
  *)
    echo "Usage: $(basename $0) start|stop|kill|restart|status [worker-number]"
    exit 1
  ;;
esac

With this script we can effortlessly start, stop, restart, kill and get the process status of a worker. And it’s pretty easy to use:

./server.sh start 1

And our first worker is LIVE.

We use Git to manage our development process, so the easiest way to restart the workers (when we have code updates to deploy) is to execute a restart command for all the workers in the post-receive hook.

We can add the following lines to the git/hooks/post-receive file:

# Restart all workers
  cd /path/to/the/project

  chmod 755 server.sh
  chmod 755 antineutrino.sh

  ./server.sh restart 1
  ./server.sh restart 2
  sleep 2
  ./server.sh restart 3
  ./server.sh restart 4

You may wonder why there is a sleep command. The answer is simple: we don’t want to reject any request that Nginx receives while the workers are unavailable. As you may have noticed, in our Nginx configuration we added fail_timeout=1 to each upstream server, which means that a worker marked as down is tried again after one second (the default is 10 seconds, which is too long in our case). So we give the first half of the workers 2 seconds to restart, and then we restart the rest.
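
With more than four workers, writing the restart lines by hand gets tedious. The idea of restarting in halves can be sketched as a small script that prints the commands as a dry run. The restartBatches function is a hypothetical helper; the 2-second pause and the server.sh command are taken from the hook above.

```php
<?php
declare(strict_types=1);

/**
 * Split worker numbers into batches so that some workers stay up
 * while the others restart. Returns a list of batches.
 */
function restartBatches(array $workers, int $batchCount = 2): array
{
    if ($workers === [] || $batchCount < 1) {
        return [];
    }
    $size = (int) ceil(count($workers) / $batchCount);

    return array_chunk($workers, $size);
}

// Dry run: print the commands instead of executing them.
$batches = restartBatches([1, 2, 3, 4]);
foreach ($batches as $i => $batch) {
    foreach ($batch as $worker) {
        echo "./server.sh restart {$worker}", PHP_EOL;
    }
    if ($i < count($batches) - 1) {
        echo 'sleep 2', PHP_EOL; // pause longer than fail_timeout=1
    }
}
```

For four workers this prints the same sequence as the hook above: two restarts, a pause, then the remaining two restarts.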

Another cool thing that we’ve implemented in our code is support for termination signals, thanks to the ReactPCNTL library. In our server.sh script the restart and stop commands send SIGTERM, while the kill command is meant to terminate the process outright. We added the following lines to bootstrap.php:

# Graceful stop
$exit = function() use ($loop) {
  $loop->stop();
};

# Wait for exit signal...
$pcntl = new MKraemer\ReactPCNTL\PCNTL($loop);
$pcntl->on(SIGTERM, $exit);
$pcntl->on(SIGINT, $exit);

Thus none of the workers will be restarted before its current task (Request) is finished.

Regarding fatal errors and memory consumption, we can mitigate their impact using a simple strategy: we restart the server once it stops. There is plenty of software that handles this task: PHP-PM, Aerys, Supervisord and others.

Instead, we decided to build our own monitoring system, again using ReactPHP!

ReactPHP Monitoring system

The monitoring system that we built does a few things:

monitors whether the workers are alive;
tracks each worker’s current status (current memory usage, peak memory usage, finished tasks);
sends alerts in case something goes wrong;
performs some other actions, for example restarting the workers.

So what we did was to create a simple and stable ReactPHP worker to monitor the main tracker workers. We decided to communicate with the workers via socket connections and to fetch the stats mentioned above (uptime, memory, finished tasks) every 30 seconds. The monitoring system also watches the server load, memory usage and the number of requests.

In our workers’ bootstrap we’ve added the following lines to allow socket connections from the monitoring system:

/** Terminal **/
$terminal = new React\Socket\UnixServer('/tmp/reactphp/reactphp.worker' . $number . '.statistics.sock', $loop);
$terminal->on('connection', function(React\Socket\ConnectionInterface $connection) use ($loop, $startTime, &$tasks) {
  $connection->on('data', function($data) use ($connection, $startTime, &$tasks) {
    switch(trim($data)) {
      case 'levels':
        $memoryUsage = \Tools\Utils::getServerMemoryUsage();
        $connection->write(sprintf("server_memory:%s|%s|%s\n",
          $memoryUsage["free"],                         # Free
          $memoryUsage["total"] - $memoryUsage["free"], # Used
          $memoryUsage["total"]                         # Total
        ));
        $connection->write(sprintf("process_memory:%s|%s|%s|%s|%s\n",
          memory_get_usage(),          # Used
          memory_get_usage(true),      # Used real
          memory_get_peak_usage(),     # Peak
          memory_get_peak_usage(true), # Peak real
          ini_get('memory_limit')      # Memory limit
        ));
        $connection->write(sprintf("start_time:%s\n",
          $startTime
        ));
        $connection->write(sprintf("tasks:%s\n",
          $tasks
        ));
        break;
    }
  });
});

And now our workers are ready to accept connections from the monitoring system. Next let’s see how the monitoring system will connect to the workers.
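
On the receiving side, the reply to the levels command has to be split back into values. A minimal parser for the line format shown above might look like this. parseLevels is a hypothetical name and the numbers in the sample payload are made up for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Parse the newline-separated "key:value|value" lines a worker sends
 * in reply to the "levels" command into an associative array.
 */
function parseLevels(string $payload): array
{
    $stats = [];
    foreach (preg_split('/\R/', trim($payload)) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip malformed or empty lines
        }
        [$key, $values] = explode(':', $line, 2);
        $stats[$key] = explode('|', $values);
    }

    return $stats;
}

// Sample payload with made-up numbers, in the format shown above.
$payload = "server_memory:1024|3072|4096\n"
    . "process_memory:10|20|30|40|128M\n"
    . "start_time:1526000000\n"
    . "tasks:42\n";

$stats = parseLevels($payload);
echo $stats['tasks'][0], PHP_EOL; // 42
```

Keep in mind that a stream may deliver the reply in several chunks, so in a real monitor the incoming data would need to be buffered until all expected lines have arrived.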

We created a file called monitoring.php as well as a monitoring.sh bash file which runs the application in the background. In the config file we added all the worker instances that we want to monitor.

$connector = new React\Socket\UnixConnector($loop, array(
  'timeout' => 10.0,
  'dns' => false
));

for ($idx = WORKERS_FIRST_IDX; $idx < (WORKERS_FIRST_IDX + WORKERS_TOTAL); $idx++) {
  $statistics[$idx] = $defaultValues;
  $blockUDS = sprintf('/tmp/reactphp/reactphp.worker%s.statistics.sock', $idx);

  $timer = $loop->addPeriodicTimer(MONITORING_UPDATE_TIME, function () use ($connector, $idx, $blockUDS, &$statistics) {
    $connector->connect($blockUDS)->then(
      function (React\Socket\ConnectionInterface $connection) use ($blockUDS, $idx, &$statistics) {
        $connection->on('data', function ($data) use ($connection, $blockUDS, $idx, &$statistics) {
          // Collect information from a worker and store it in DB or in Cache
          // Check its memory usage and send an alert if something's wrong
          $connection->end();
        });
        # Send a "levels" command on connect
        $connection->write('levels' . PHP_EOL);

        # Close the connection on error
        $connection->on('error', function ($error) use ($connection) {
          $connection->end();
        });

        # Drop our reference once the connection is closed
        $connection->on('close', function () use ($connection) {
          unset($connection);
        });
      },
      # Failed to connect due to $error
      function (Exception $error) use ($connector) {
        // Send a Slack alert that the worker is down.
      }
    );
  });
}
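
The check that decides whether to send a memory alert can be kept as a small pure function. The sketch below is an assumption about how such a check could look, not our production code. iniSizeToBytes and memoryAlert are hypothetical names, and the 0.8 threshold is an arbitrary default.

```php
<?php
declare(strict_types=1);

/** Convert a php.ini size such as "128M" or "1G" to bytes; -1 means unlimited. */
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '-1') {
        return -1;
    }
    $number = (int) $value; // the numeric prefix, e.g. 128 for "128M"
    switch (strtoupper(substr($value, -1))) {
        case 'G':
            return $number * 1024 ** 3;
        case 'M':
            return $number * 1024 ** 2;
        case 'K':
            return $number * 1024;
        default:
            return $number;
    }
}

/** True when usage has reached the given share of the limit (0.8 is an arbitrary default). */
function memoryAlert(int $usedBytes, string $iniLimit, float $threshold = 0.8): bool
{
    $limit = iniSizeToBytes($iniLimit);

    return $limit > 0 && $usedBytes >= $limit * $threshold;
}

var_dump(memoryAlert(110 * 1024 * 1024, '128M')); // bool(true)

// The same check against the current process and its configured limit.
var_dump(memoryAlert(memory_get_usage(true), (string) ini_get('memory_limit')));
```

The used bytes and the memory limit are exactly the first and last fields of the process_memory line that each worker reports, so the monitor can feed them straight into such a function.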

We use the addPeriodicTimer method from the ReactPHP EventLoop component, which invokes a callback repeatedly at a set interval.

Once we collect the metrics, we can display them at a password-protected URL. As mentioned before, we have two kinds of stats: Server overview and Workers overview. Here is how our tracking system displays them:

Server overview

Workers overview

Final words

ReactPHP is a powerful library indeed. You only need about 40-50 lines of code to run a simple application and you will have a superfast HTTP server. Using ReactPHP we managed to kill the expensive bootstrap of our application (one of the most time consuming parts of the process) and we drastically increased the performance of the platform.

Valentin Borisov | 11 May 2018

Bulgaria

United Kingdom

office@mtr-design.com

MTR Design Limited
