Intelligent Backend Routes with Rails and nginx

Introduction

A fairly common deployment involves running nginx as the first hop on an application server, which in turn routes to your backend. This post uses Rails as the backend, but the principle should apply to most frameworks.

Common nginx configurations

The standard method of deploying the above strategy is well documented in the nginx Pitfalls and Common Mistakes guide. Naturally, it's listed under a GOOD heading, specifically as the fix for the "proxy everything" anti-pattern. The code they list is:

server {
    server_name _;
    root /var/www/site;
    location / {
        try_files $uri $uri/ @proxy;
    }
    location @proxy {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/tmp/phpcgi.socket;
    }
}

What this does is check for a static asset first (in the form of a file on disk) and, failing that, proxy the request to the backend.

The immediate annoyance

What you will very quickly notice, or at least you should if you watch your logs, is the incredible annoyance of an entire stack trace being dumped whenever a route isn't matched, such as when an Apple device automatically goes looking for its touch icon and you don't have one set up.

ActionController::RoutingError (No route matches [GET] "/apple-touch-icon.png"):
  actionpack (4.2.5) lib/action_dispatch/middleware/debug_exceptions.rb:21:in `call'
  actionpack (4.2.5) lib/action_dispatch/middleware/show_exceptions.rb:30:in `call'
  railties (4.2.5) lib/rails/rack/logger.rb:38:in `call_app'
  railties (4.2.5) lib/rails/rack/logger.rb:20:in `block in call'
  activesupport (4.2.5) lib/active_support/tagged_logging.rb:68:in `block in tagged'
  activesupport (4.2.5) lib/active_support/tagged_logging.rb:26:in `tagged'
  activesupport (4.2.5) lib/active_support/tagged_logging.rb:68:in `tagged'
  railties (4.2.5) lib/rails/rack/logger.rb:20:in `call'
  actionpack (4.2.5) lib/action_dispatch/middleware/request_id.rb:21:in `call'
  rack (1.6.4) lib/rack/methodoverride.rb:22:in `call'
  rack (1.6.4) lib/rack/runtime.rb:18:in `call'
  activesupport (4.2.5) lib/active_support/cache/strategy/local_cache_middleware.rb:28:in `call'
  rack (1.6.4) lib/rack/sendfile.rb:113:in `call'
  actionpack (4.2.5) lib/action_dispatch/middleware/ssl.rb:24:in `call'
  railties (4.2.5) lib/rails/engine.rb:518:in `call'
  railties (4.2.5) lib/rails/application.rb:165:in `call'
  puma (2.15.3) lib/puma/configuration.rb:79:in `call'
  puma (2.15.3) lib/puma/server.rb:541:in `handle_request'
  puma (2.15.3) lib/puma/server.rb:388:in `process_client'
  puma (2.15.3) lib/puma/server.rb:270:in `block in run'
  puma (2.15.3) lib/puma/thread_pool.rb:106:in `block in spawn_thread'

There's a direct solution to this default configuration, well documented in a number of easily Googled posts.

One such document echoes the initial feeling I had: FATAL errors should be reserved for application crashes, not the endless bots that hit my sites daily looking for phpmyadmin.

There is also a lot of misinformation around this situation, with a number of Stack Overflow posts addressing single symptoms ("just go and create that file") rather than the root cause.
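
For what it's worth, the single-symptom fix doesn't even need a file; nginx can short-circuit a known-noisy path before it ever reaches Rails. A minimal example (this exact location block is my illustration, not from those posts):

    location = /apple-touch-icon.png {
        # Answered at the proxy layer; Rails never sees the request.
        return 404;
    }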

A more comprehensive solution

The existing solutions just didn't quite satisfy me. To be clear, there's nothing terrible about creating a 404 page as described, but the idea that a backend designed to service specific endpoints ends up with all unknown traffic routed to it runs strongly against the way I like to run systems.

In some cases it's easy. For my Erlvulnscan, there is a single endpoint, and I can code up my nginx.conf by hand:

    location /netscan {
        proxy_pass http://localhost:8081;
    }

Research can dig up enterprise solutions involving embedded Lua and Redis. That's overkill for my needs, however.

Problem 1: What does a good route look like?

For my ctadvisor interface, I created a quick rake task. You can implement it yourself by adding the task file into the lib/tasks/ directory.

The general goal here is to print out a mapping of valid endpoints for later use. The output looks like this:

$ bundle exec rake nginxmap
map $uri $rails_route_list {
    default "false";
    ~^/assets "true";
    ~^/registrations/verify/ "true";
    ~^/registrations/verify "true";
    ~^/registrations/unsubscribe "true";
    ~^/registrations/destroy/ "true";
    ~^/registrations "true";
    ~^/registrations/new "true";
    ~^/rails/info/properties "true";
    ~^/rails/info/routes "true";
    ~^/rails/info "true";
    ~^/rails/mailers "true";
    ~^/rails/mailers/ "true";
    ~^/$ "true";
}

The output is somewhat like running "rake routes", except that there you see routes like this:

/registrations/destroy/:id/:nonce(.:format)

Although it's possible to build complex regexes in nginx to be very specific, that's not the goal here. It's "good enough" to verify it's a valid endpoint by stopping at the first symbol (:id) and ensuring the path matches everything before it.

The code also has a special handler for /, because this should only match in its entirety (otherwise, everything matches).
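
To make that concrete, here is a minimal sketch of what such a task can look like, leaning on Rails 4's public routing internals (Rails.application.routes.routes and route.path.spec). Treat it as an illustration rather than my exact task:

# lib/tasks/nginxmap.rake - illustrative sketch only
desc "Emit an nginx map of valid Rails endpoints"
task nginxmap: :environment do
  puts 'map $uri $rails_route_list {'
  puts '    default "false";'
  Rails.application.routes.routes.each do |route|
    spec = route.path.spec.to_s              # e.g. /registrations/destroy/:id/:nonce(.:format)
    prefix = spec.split(/[:(]/).first.to_s   # stop at the first symbol or optional group
    next if prefix.empty?
    if prefix == '/'
      puts '    ~^/$ "true";'                # root only matches in its entirety
    else
      puts "    ~^#{prefix} \"true\";"
    end
  end
  puts '}'
end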

There's a big TODO here in that this approach emits a few additional routes (such as /assets) which aren't present in "rake routes". I could just regex these out, but I'd rather find the root cause.

Problem 2: How to actually set these routes up in nginx

The obvious solution involves either a whole series of location { } blocks, one per route, or one massive regex. Neither of these is particularly pretty, or scalable.

It turns out nginx has a reasonably good alternative in the map directive.

The task we created formats our routes appropriately for the map directive, allowing us to configure nginx like this:


    include 'railsmap.conf';

    server {
        ...
        try_files $uri @rails;
        location @rails {
            if ($rails_route_list = "false") {
                return 404;
            }
            proxy_pass http://localhost:8082;
        }
    }

Where the railsmap.conf can be created by running:

bundle exec rake nginxmap > railsmap.conf

I re-run this every time I add a route in Rails. In practice, on an established application, that isn't very often.
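
As a quick sanity check, a request for a path Rails doesn't serve should now be cut off at nginx itself (hypothetical host and path):

$ curl -sI http://example.com/wp-login.php | head -1
HTTP/1.1 404 Not Found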

In practice

The described system has now been running on the ctadvisor page for a couple of days and I'm quite happy with the results. Obviously, your environment may be different. Or you may just care less about how specific your routing is.

A non-trivial amount of the traffic hitting Rails for me comes in the form of ridiculous bots. To be clear, you're not gaining a significant security benefit by "firewalling" off hundreds of scans for vulnerable WordPress plugins against a Rails server, but you are blocking unwanted traffic, which is never a bad thing.

Use protobufs - now

Introduction

If you've ever touched any form of web development, ever, you've probably used JSON to get data from a server to a client. Ajax queries nearly always pull data in this format.

Google's Protobuf (Protocol Buffers) standard promises a number of advantages. It seems to have been largely ignored by the web community for a while, with most discussions degrading into complaints about one Python library's performance.

I took an interest primarily when noting that Riak KV recommends its protocol buffer interface for performance. I'll also note that I'm not a Python user.

Typed data

Aside from a potential performance increase, Protocol Buffers are typed. As someone who couldn't handle JavaScript until things were rewritten in TypeScript, this feature is worth a lot to me.

Smaller

If you're performing a 32-byte Ajax query, you probably don't care about JSON's overhead. If you're doing a much larger query, you might.

Test bed

In order to obtain a fair test, I'm comparing against two JSON libraries: JSX, which is pure Erlang, and Jiffy, which is C.

The protobuf implementation we are using is from Basho.

I'd very much like to go on the record and state that, in most cases, microbenchmarks should be taken with a grain of salt. Including this one. Anyone who rewrites anything based solely on this blog is in for a bad time. Do your own tests.

In order to use Protocol Buffers, we start by defining the types.

I've used some Ruby as a quick demonstration of what our data structure may look like:

irb(main):002:0> something = {:counter => 1, :num => 50}
    => {:counter=>1, :num=>50}
irb(main):003:0> something.to_json
    => "{\"counter\":1,\"num\":50}"

Using this, I can create a protobuf definition: the thing.proto file below. Straight away, you can see that I've defined not only that the variables are of the int32 type, but that there are exactly two of them, and that they are required. There's an obvious advantage at this point in knowing exactly what you're receiving over the wire.

message Counternumber {
    required int32 counter = 1;
    required int32 num = 2;
}
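
Assuming Basho's erlang_protobuffs (the library the testbed below uses), compiling the file generates a thing_pb module named after it. A quick, illustrative shell session:

1> protobuffs_compile:scan_file("thing.proto").
ok
2> Bin = iolist_to_binary(thing_pb:encode_counternumber({counternumber, 1, 50})).
<<8,1,16,50>>
3> thing_pb:decode_counternumber(Bin).
{counternumber,1,50}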

And now here's our test bed application. It was run up in a few minutes, so it's not meant to be a shining example of Erlang. If you're not familiar with Erlang, or just want the tl;dr: it builds a list (an "array", if you will) of 100 of these structures, and serialises it 100,000 times with each encoder to create a benchmark.

-module(data).
-compile(export_all).
-define(TIMES, 100000).

-type ourthing() :: {'counter',pos_integer()} | {'num',1..1000}.

%% Time each encoder over the same data set and report the encoded sizes.
-spec fullrun() -> 'ok'.
fullrun() ->
    X = makedata(),
    {Jiffy, _} = timer:tc(data, withjiffy, [X]),
    {JSX, _} = timer:tc(data, withjsx, [X]),
    {Props, _} = timer:tc(data, withprop, [X]),
    io:fwrite("Jiffy time: ~p, JSX time: ~p props time: ~p~n", [Jiffy, JSX, Props]),
    Proplen = byte_size(iolist_to_binary(withprop_node(X, []))),
    JSONlen = byte_size(jsx:encode(X)),
    io:fwrite("JSON is ~p long and Protobuf is ~p long~n", [JSONlen, Proplen]).

%% Build a flat proplist of 100 {counter, num} pairs with random nums.
-spec makedata() -> [ourthing()].
makedata() ->
    Y = [ [{counter, X}, {num, rand:uniform(1000) }] || X <- lists:seq(1,100)],
    lists:flatten(Y).

%% Encode each pair with the generated thing_pb module, accumulating an iolist.
-spec withprop_node([ourthing()], any()) -> [any()].
withprop_node([], Acc) ->
    Acc;

withprop_node(X, Acc) ->
    [{counter, A} , {num, B} | Tail] = X,
    Encode = thing_pb:encode_counternumber({counternumber, A, B}),
    withprop_node(Tail, [Acc | Encode]).

%% The three loops below each encode the same data ?TIMES times.
-spec withprop([ourthing()]) -> [any()].
withprop(X) ->
    withprop(X, ?TIMES).

-spec withprop([ourthing()], non_neg_integer()) -> [any()].
withprop(X, 0) ->
    iolist_to_binary(withprop_node(X, []));

withprop(X, T) ->
    iolist_to_binary(withprop_node(X, [])),
    withprop(X, T-1).

-spec withjsx([ourthing()]) -> any().
withjsx(X) ->
    withjsx(X, ?TIMES).

-spec withjsx([ourthing()], non_neg_integer()) -> any().
withjsx(X, 0) ->
    jsx:encode(X);

withjsx(X, T) ->
    jsx:encode(X),
    withjsx(X, T-1).

-spec withjiffy([ourthing()]) -> any().
withjiffy(X) ->
    withjiffy(X, ?TIMES).

-spec withjiffy([ourthing()], non_neg_integer()) -> any().
withjiffy(X, 0) ->
    jiffy:encode({X});

withjiffy(X, T) ->
    jiffy:encode({X}),
    withjiffy(X, T-1).
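
For completeness, running the testbed from an Erlang shell (assuming jsx, jiffy, and the generated thing_pb beams are on the code path) is just:

$ erl
1> c(data).
{ok,data}
2> data:fullrun().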

Results

With that testbed run, here is the output I'm seeing:

Jiffy time: 6936403, JSX time: 25947210 props time: 5145719
JSON is 2283 long and Protobuf is 486 long

There's an obvious benefit that's immediately visible here: the Protobuf output is less than a quarter of the size of the JSON.

To make the timings easier to review, I've reformatted them below. Elapsed time is presented in microseconds.

Implementation      Time (µs)
Jiffy               6,936,403
JSX                25,947,210
Protobuf            5,145,719

In a world where performance counts, these differences are non-trivial. It's hard to argue against the benefits here.

Downsides

There are, of course, downsides. Working with protobufs is obviously more work, and the data will have to be decoded on the client side. I'd suggest a "development mode" that still uses JSON, so you can make useful sense of the browser's network monitor when you need it.
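
A minimal sketch of that idea, keying the encoder off an application env flag; wire_format and encode_counters/1 here are hypothetical names, not part of any library:

%% Sketch only: wire_format and encode_counters/1 are hypothetical.
%% In development, keep emitting JSON so the browser's network monitor
%% stays readable; in production, emit protobuf.
encode_payload(Data) ->
    case application:get_env(myapp, wire_format, protobuf) of
        json     -> jsx:encode(Data);
        protobuf -> iolist_to_binary(encode_counters(Data))
    end.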

In an upcoming blog, I'll be converting the erlvulnscan frontend to read protobuf AJAX queries.

Argon2 code audits - part one - Infer

Introduction

This article is the first part in a series in which we use popular tools to audit the Argon2 library.

Let's start with a quick background on what Argon2 is with a quote from their README:

This is the reference C implementation of Argon2, the password-hashing function that won the Password Hashing Competition (PHC).

Argon2 is a password-hashing function that summarizes the state of the art in the design of memory-hard functions and can be used to hash passwords for credential storage, key derivation, or other applications.

More information is available at the official Argon2 GitHub.

In today's article, we review the code with a static analysis tool. Such tools are often seen in a negative light; hopefully the findings of this article can encourage wider use of them.

Infer

Infer is a static analysis tool for C and Java that was open-sourced by Facebook. See the official Infer website here.

I had used Infer early in its release, but it was quite frustrating to keep it running. Every time I upgraded clang, or glibc, or just about anything, it seemed to break. As an Arch Linux user, that was regularly.

There's a great solution to this problem in modern times: Docker. I checked, and it seems Facebook had the same idea, as they now publish a Dockerfile. It actually didn't work when I first tried it, but my issue was attended to pretty quickly.

With a working file in hand, and no great interest in Android development on my part, I created a slimmed-down Dockerfile without the Android SDK. You can see it here:

# Base image
FROM debian:stable

MAINTAINER Infer

# Debian config
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
      build-essential \
      curl \
      git \
      groff \
      libgmp-dev \
      libmpc-dev \
      libmpfr-dev \
      m4 \
      ocaml \
      default-jdk \
      python-software-properties \
      rsync \
      software-properties-common \
      unzip \
      zlib1g-dev

# Install OPAM
RUN curl -sL \
      https://github.com/ocaml/opam/releases/download/1.2.2/opam-1.2.2-x86_64-Linux \
      -o /usr/local/bin/opam && \
    chmod 755 /usr/local/bin/opam
RUN opam init -y --comp=4.02.3 && \
    opam install -y extlib.1.5.4 atdgen.1.6.0 javalib.2.3.1 sawja.1.5.1

# Download the latest Infer release
RUN INFER_VERSION=$(curl -s https://api.github.com/repos/facebook/infer/releases \
      | grep -e '^[ ]*"tag_name"' \
      | head -1 \
      | cut -d '"' -f 4); \
    cd /opt && \
    curl -sL \
      https://github.com/facebook/infer/releases/download/${INFER_VERSION}/infer-linux64-${INFER_VERSION}.tar.xz | \
    tar xJ && \
    rm -f /infer && \
    ln -s ${PWD}/infer-linux64-${INFER_VERSION} /infer

# Compile Infer
RUN cd /infer && \
    eval $(opam config env) && \
    ./configure && \
    make -C infer clang

# Install Infer
ENV INFER_HOME /infer/infer
ENV PATH ${INFER_HOME}/bin:${PATH}

Building using this file basically consists of:

  • Place Dockerfile in an empty directory
  • Run: docker build -t infer:0.1 .

With the container built, you can bring up an Infer container and destroy it safely any time you need to test some code.

Running it

A Docker container with a copy of Infer isn't that useful without a copy of your codebase. Fortunately, I happen to have a cloned git repo in my home directory. We can start the container and mount this code inside it as follows:

$ docker run -t -v /path/to/phc-winner-argon2/:/code --rm -i infer:0.1

This will bring up a Docker container in a way that's quite different from how you usually hear about Docker being used in devops scenarios. Specifically, it will drop you into an interactive shell, and when you run "exit" it will destroy the container.

The first thing we'll want to do is cd to the /code directory, from which we can start running the infer analyzer (conveniently in our PATH) against the codebase.

$ infer -- clang -c  -Wall -g -Iinclude -Isrc  -pthread src/run.c
Starting analysis (Infer version v0.6.0)
Computing dependencies... 100%
Creating clusters... 100%
Analyzing 1 clusters.Analysis finished in 0.257342s
Analyzed 4 procedures in 1 file
No issues found

What you'll see there is run.c analyzed, with no issues to speak of. We work through each file in this fashion; it turns out core.c is the interesting one.

$ infer -- clang -c  -Wall -g -Iinclude -Isrc  -pthread src/core.c
Starting analysis (Infer version v0.6.0)
Computing dependencies... 100%
Creating clusters... 100%
Analyzing 1 clusters.Analysis finished in 0.777034s
Analyzed 17 procedures in 1 file
Found 4 issues
src/core.c:286: error: MEMORY_LEAK
   memory dynamically allocated to thr_data by call to calloc() at line 267, column 16 is not reachable after line 286, column 25
  284.                       rc = argon2_thread_join(thread[l - instance->threads]);
  285.                       if (rc) {
  286. >                         return ARGON2_THREAD_FAIL;
  287.                       }
  288.                   }

src/core.c:286: error: MEMORY_LEAK
   memory dynamically allocated to thread by call to calloc() at line 262, column 14 is not reachable after line 286, column 25
  284.                       rc = argon2_thread_join(thread[l - instance->threads]);
  285.                       if (rc) {
  286. >                         return ARGON2_THREAD_FAIL;
  287.                       }
  288.                   }

src/core.c:302: error: MEMORY_LEAK
   memory dynamically allocated to thr_data by call to calloc() at line 267, column 16 is not reachable after line 302, column 21
  300.                                             (void *)&thr_data[l]);
  301.                   if (rc) {
  302. >                     return ARGON2_THREAD_FAIL;
  303.                   }
  304.

src/core.c:302: error: MEMORY_LEAK
   memory dynamically allocated to thread by call to calloc() at line 262, column 14 is not reachable after line 302, column 21
  300.                                             (void *)&thr_data[l]);
  301.                   if (rc) {
  302. >                     return ARGON2_THREAD_FAIL;
  303.                   }
  304.

A quick review of the codebase, combined with the highly descriptive output above, should let you quickly ascertain that, yes, these are genuine issues, and fairly easy to fix.
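
The shape of the fix is clear from the reports: free both calloc'd buffers before the early returns. A sketch of the first case (not necessarily the exact patch that was merged):

/* Sketch: release both buffers before bailing out, instead of
 * leaking them on the error path. */
rc = argon2_thread_join(thread[l - instance->threads]);
if (rc) {
    free(thr_data);
    free(thread);
    return ARGON2_THREAD_FAIL;
}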

This became a PR:

Pull request fixing this issue

Conclusion

Hopefully what this demonstrates is that, once the appropriate container is handy, running Infer is something that can be done in minutes. On a larger project, it wouldn't be hard to script the execution rather than running it manually for each file.
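
Something as simple as a shell loop covers it, assuming the same compile flags apply to every file:

for f in src/*.c; do
    infer -- clang -c -Wall -g -Iinclude -Isrc -pthread "$f"
done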

The practical output here is precisely zero false positives, and four genuine memory leaks. I encourage more developers to look into such solutions. Obviously, a huge amount of credit goes to Facebook for releasing this tool.

The interesting thing here is that I had previously run this codebase through Valgrind - but Valgrind only detects leaks that are actually triggered during execution.

In our next part, we implement an afl-fuzz harness!