Property based tests, contracts with Ruby

Base App

For this demonstration, we are going to be using the venerable Fizzbuzz application. For those who haven't seen it, it's a common programming koan; see the Wikipedia page.

Despite being a very simple, single function, it can be surprising what issues you pick up.

With thanks to @Kerrick on Github, I've taken the first example code found on Google. Here's our sample file fb.rb:

#!/usr/bin/env ruby

def fizz_buzz(max)
  arr = []
  (1..max).each do |n|
    if ((n % 3 == 0) && (n % 5 == 0))
      arr << "FizzBuzz"
    elsif (n % 3 == 0)
      arr << "Fizz"
    elsif (n % 5 == 0)
      arr << "Buzz"
    else
      arr << n
    end
  end
  return arr
end

For a quick demonstration, let's see how it looks:

2.2.2 :001 > require_relative 'fb'
 => true
2.2.2 :004 > fizz_buzz(5)
 => [1, 2, "Fizz", 4, "Buzz"]

Contracts

So far so good. The first thing I'm going to do is set up contracts. Let's create this Gemfile:

source 'http://rubygems.org'
gem 'contracts'

And install the gem (locally for this app, keeping the global space clean):

bundle install --path=vendor/bundle

Using contracts involves placing this at the start of your script:

require 'contracts'
include Contracts

And then we need to think about our function. In this case, the input parameter is a positive integer, and it returns an array of strings. So I placed this directly before the function definition:

Contract Pos => ArrayOf[String]

Now let's try and run it. It sounds simple and should "just work", but let's see:

2.2.0 :003 > fizz_buzz 5
ReturnContractError: Contract violation for return value:
    Expected: (a collection Array of String),
    Actual: [1, 2, "Fizz", 4, "Buzz"]
    Value guarded in: Object::fizz_buzz
    With Contract: Pos => CollectionOf
    At: /home/technion/fizzbuzz_tests/fb.rb:7

Turns out, the current code doesn't return an array of strings, it mixes integers with strings. I can hear it already. "But my code works fine". Really? Let's go back to the pre-contract code, store the result of fizz_buzz 5 in fb, and try something:

2.2.0 :007 > puts "The third Fizzbuzz output is " + fb[2]
The third Fizzbuzz output is Fizz

Sounds legit..

2.2.0 :008 > puts "The fourth Fizzbuzz output is " + fb[3]
TypeError: no implicit conversion of Fixnum into String

Purists will point out that string interpolation would have resolved this, but that's not the point. The point is seeing unexpected behaviour because the return type differs from what was expected. With that in mind, let's put our contract in place, and alter the final branch of our statement on line 17 accordingly:

arr << n.to_s

Running it in irb:

2.2.0 :002 > fizz_buzz 5
=> ["1", "2", "Fizz", "4", "Buzz"]

Much better.
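
To make the failure mode concrete without the contracts gem, here is a minimal plain-Ruby sketch. The helper all_strings? is a hypothetical stand-in for the kind of check ArrayOf[String] performs on the return value; it is not part of the contracts API, and describe is likewise made up for illustration.

```ruby
# Hypothetical stand-in for the ArrayOf[String] return check.
def all_strings?(values)
  values.is_a?(Array) && values.all? { |v| v.is_a?(String) }
end

# Hypothetical caller that concatenates, as in the irb session above.
def describe(list, index)
  "Output #{index + 1} is " + list[index]
end

mixed = [1, 2, "Fizz", 4, "Buzz"] # what the original code returned
fixed = mixed.map(&:to_s)         # what the n.to_s change produces

mixed_ok = all_strings?(mixed)
fixed_ok = all_strings?(fixed)

# Concatenating a String with an Integer raises TypeError.
concat_error =
  begin
    describe(mixed, 3)
    nil
  rescue TypeError => e
    e.class
  end

fourth = describe(fixed, 3)
```

The mixed array fails the check and blows up on the fourth element, while the all-string array passes both.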

Some basic tests

Before we do any new, exciting tests, let's get some basic ones in place. This is a boilerplate Rakefile for minitest:

require 'rake'
require 'rake/testtask'

Rake::TestTask.new do |t|
  t.test_files = Dir.glob('spec/*.rb')
end
task(default: :test)

The two test applications were added to our Gemfile. We'll be using minitest, and we'll come back to explaining rubycheck.

gem 'rubycheck'
gem 'minitest'

Re-run bundler as above to install these gems. We also created spec/fbtests.rb. Rather than walk you through each individual test, we've annotated them in comments.

#!/usr/bin/env ruby

require 'minitest/autorun'
require 'rubycheck'
require_relative '../fb'

#Boilerplate
class FBTest < Minitest::Test
  #The most basic test is a matter of identifying a simple input and
  #confirming that a simple output matches exactly.
  #A small number like 5 can be fully typed out
  def test_5
    fb = fizz_buzz 5
    assert_equal ["1", "2", "Fizz", "4", "Buzz"], fb
  end
  #A larger fizzbuzz test needs to be considered more methodically. No one
  #will sit there typing out the expected results for fizz_buzz 100.
  def test_100
    fb = fizz_buzz 100
    #One thing we can say about fizzbuzz 100 is the length. Check it
    assert_equal 100, fb.length
    #This test verifies every element in the array matches one of the valid
    #results. The \A and \z anchors make the whole string match, not just part.
    assert fb.all? { |e| /\A(\d+|FizzBuzz|Fizz|Buzz)\z/.match(e) }
  end
  def test_negative
    #We said earlier our contract shouldn't allow this. Check for an exception.
    assert_raises { fizz_buzz(-1) }
  end
end

And that's a simple guide to writing tests. We recommend running them:

bundle exec rake test

But that's where a lot of guides would stop.
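
Pattern checks of this kind depend on anchoring: a Ruby regex without \A and \z matches anywhere inside the string, so a value that merely contains a digit would pass. A minimal sketch of the difference, using made-up sample values and the hypothetical constant names LOOSE and STRICT:

```ruby
# Unanchored: succeeds if any substring matches.
LOOSE  = /(\d+)|(FizzBuzz)|(Fizz)|(Buzz)/
# Anchored: the whole string must be exactly one valid output.
STRICT = /\A(?:\d+|FizzBuzz|Fizz|Buzz)\z/

samples = ["7", "Fizz", "FizzBuzz", "Fizz7", "oops 1", ""]

loose_hits  = samples.select { |s| LOOSE.match?(s) }
strict_hits = samples.select { |s| STRICT.match?(s) }
```

Here the loose pattern accepts "Fizz7" and "oops 1", while the strict one accepts only the three genuinely valid values.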

Property based testing

One of the great things about the fizzbuzz 100 test we wrote is that it's fairly generic. It should work for fizzbuzz 10, or fizzbuzz 1000 in the same way. So why not write a test that tests this property?

As a first example, we'll write a simple test that checks against one random number. Add in this test:

  def test_random
    r = RubyCheck.gen_uint
    fb = fizz_buzz r
    assert fb.all? { |e| /\A(\d+|FizzBuzz|Fizz|Buzz)\z/.match(e) }
  end

All we've done here is made '100' into a random variable 'r'. The output however is interesting:

]$ bundle exec rake test
Run options: --seed 11306

# Running:

...rake aborted!
SignalException: SIGKILL

It'll take you a while to track down what killed the process (it's a SIGKILL, not a segfault), and when you do, you'll see a huge dump sitting in the server logs, ending in this:

kernel: Out of memory: Kill process 2994 (ruby) score 896 or sacrifice child
kernel: Killed process 2994 (ruby) total-vm:2610792kB, anon-rss:1895804kB, file-rss:2024kB

What you are looking at is the fact that a huge, random number is able to crash our fizz_buzz application. We're just lucky the OOM killer killed the right app. Win one, for property based testing.

To pick a somewhat arbitrary upper bound, I've placed this in the first line of our updated fizzbuzz function:

fail if max > 65536

And then we baked in a test for it:

  def test_too_high
    assert_raises { fizz_buzz 65538 }
  end

If you comment out test_random for a moment, you should be able to run a successful:

bundle exec rake test

So what to do about getting the random test running again? Well this sort of thing should work:

r = RubyCheck.gen_uint % 65537

However, I really feel property based testing should have a "property" for a 16 bit integer, so I've submitted a PR to rubycheck. If it gets through, this will be equivalent:

 r = RubyCheck.gen_uint16

Whichever you use, you should now find yourself able to check a random number with your fizzbuzz application.
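
If you would rather not depend on a generator that may not be merged yet, Ruby's standard library can produce the same bounded values. This is a sketch, not rubycheck's implementation: random_max and UPPER_BOUND are hypothetical names, and the fixed seed is only there so a failing run can be replayed.

```ruby
UPPER_BOUND = 65_536 # matches the guard added to fizz_buzz

# Hypothetical helper: an integer in 0..UPPER_BOUND, inclusive.
def random_max(rng)
  rng.rand(0..UPPER_BOUND)
end

seed = 11_306 # any integer; print it when a test fails
rng_a = Random.new(seed)
rng_b = Random.new(seed)

run_a = Array.new(1000) { random_max(rng_a) }
run_b = Array.new(1000) { random_max(rng_b) }

in_bounds  = run_a.all? { |r| r.between?(0, UPPER_BOUND) }
repeatable = (run_a == run_b)
```

Two generators built from the same seed yield the same sequence, which is what makes a randomised failure reproducible.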

You can probably see where I'm going with this - if you can test one random number, why not test many? rubycheck does have a "for_all" function, however, for various reasons, I prefer to implement this myself. Let's run a series of numbers through the checker.

Obviously, the more the better, but any more than a few hundred makes this a very boring test to sit through. So, I will be implementing some general tests, then more tests for the upper and lower bounds.

  def test_random
    200.times do
      r = RubyCheck.gen_uint16
      fb = fizz_buzz r
      assert_equal r, fb.length
      assert fb.all? { |e| /\A(\d+|FizzBuzz|Fizz|Buzz)\z/.match(e) }
    end
  end
  def test_low_random
    100.times do
      r = RubyCheck.gen_uint16 % 256
      fb = fizz_buzz r
      assert_equal r, fb.length
      assert fb.all? { |e| /\A(\d+|FizzBuzz|Fizz|Buzz)\z/.match(e) }
    end
  end
  def test_high_random
    100.times do
      r = RubyCheck.gen_uint16 % 256 + 65280 # 2**16 - 256
      fb = fizz_buzz r
      assert_equal r, fb.length
      assert fb.all? { |e| /\A(\d+|FizzBuzz|Fizz|Buzz)\z/.match(e) }
    end
  end

$ bundle exec rake test
Run options: --seed 30688

# Running:

......E

Finished in 76.904164s, 0.0910 runs/s, 7.7109 assertions/s.

  1) Error:
FBTest#test_low_random:
ParamContractError: Contract violation for argument 1 of 1:
        Expected: Pos,
        Actual: 0
        Value guarded in: Object::fizz_buzz
        With Contract: Pos => CollectionOf
        At: /home/technion/fizzbuzz_tests/fb.rb:7

Yes, we've found another issue. Our contract states "positive integer", which means it does not accept 0. Now you've entered a philosophical discussion: is there a fizzbuzz(0)? If you believe not, then the contract served its purpose, and we should update the tests accordingly.

In the interests of shirking this convention, I have declared that on this project, fizzbuzz(0) is in fact an empty array. To this end, here is my final fizzbuzz code:

Contract Or[Pos, 0] => ArrayOf[String]
def fizz_buzz(max)
  fail if max > 65536
  arr = []
  return arr if max == 0
  (1..max).each do |n|
    if ((n % 3 == 0) && (n % 5 == 0))
      arr << "FizzBuzz"
    elsif (n % 3 == 0)
      arr << "Fizz"
    elsif (n % 5 == 0)
      arr << "Buzz"
    else
      arr << n.to_s
    end
  end
  return arr
end

Of course, that deserves one more test:

  def test_0
    assert_equal [], fizz_buzz(0)
  end

Regardless of the position you take on this, the point is that randomised testing forced a developer to at least consider an edge case, and plan accordingly. That in turn, is what we call "less bugs".
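
The pattern check only says each element looks valid, not that it is the right one for its position. A stronger property ties each element to its index. The sketch below repeats the final function without the contract line so it runs on its own, and uses the standard library's Random in place of rubycheck; element_correct? is a hypothetical helper, and the range is kept small purely so the sketch runs quickly.

```ruby
# The final function from above, minus the contract, so this runs standalone.
def fizz_buzz(max)
  fail if max > 65536
  arr = []
  (1..max).each do |n|
    if (n % 3 == 0) && (n % 5 == 0)
      arr << "FizzBuzz"
    elsif n % 3 == 0
      arr << "Fizz"
    elsif n % 5 == 0
      arr << "Buzz"
    else
      arr << n.to_s
    end
  end
  arr
end

# Hypothetical property: "Fizz" appears exactly when n is divisible by 3,
# "Buzz" exactly when divisible by 5, otherwise the element is n itself.
def element_correct?(n, e)
  return false unless e.include?("Fizz") == (n % 3 == 0)
  return false unless e.include?("Buzz") == (n % 5 == 0)
  (n % 3 == 0 || n % 5 == 0) || e == n.to_s
end

rng = Random.new(30_688)
failures = []
50.times do
  max = rng.rand(0..2_000)
  out = fizz_buzz(max)
  failures << [max, :length, out.length] unless out.length == max
  out.each_with_index do |e, i|
    failures << [max, i + 1, e] unless element_correct?(i + 1, e)
  end
end
```

Recording the failing input alongside the element, rather than only asserting, makes it easier to see which value of max broke the property.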

Concurrent Vulnerability scanning with Erlang

Background

Following the recent series of major vulnerabilities, a trend that's become popular has been the online scanner. It was far easier to test a service using an online Shellshock scanner, or an online Heartbleed scanner, or in this case, the MS http.sys vulnerability. This scanner was inspired by this particular scanner.

Scanning larger numbers of machines has, however, historically been quite slow. If any of the listed scanners simply iterated across a list of machines, it could take quite some time to run across reasonably large networks.

Concurrent scanning

That introduction is a perfect place to introduce the concurrency capabilities of Erlang. Concurrency has been a huge trend lately, mostly in relation to the (stupid) argument that Node.JS has concurrency and therefore it's the only platform that can scale to handle a personal blog.

A much more powerful use of concurrency exists in this snippet of code, which I see myself using regularly.

Pid = spawn(fun() ->
        receive
        {From, execute} ->
            From ! {N, function(N)}
        end
    end),
Pid ! {self(), execute},

Which, in short, tells Erlang to run a particular function concurrently, and send the results back to the parent. In this example, I've written a scanner for CVE-2015-1635. By utilising this loop, I've found you can perform such a scan incredibly fast.

The time I've quoted is 200ms for scanning an entire /24 of network hosts, which accounts for a certain amount of network latency. Without that latency, let's see the below:

$ time wget 'http://erlvulnscan.lolware.net:8080/?network=127.0.0.0' -O -

2015-06-12 10:15:48 (152 MB/s) - written to stdout [12593/12593]


real    0m0.062s
user    0m0.003s
sys     0m0.007s

Yes, that's less than one second to:

  • Make an HTTP connection to each server from 127.0.0.1 to 127.0.0.254
  • Run the vulnerability check
  • Format the results nicely in JSON and render to the user

That's an impressive time. I look forward to being told it should be done in JS.
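
For readers more at home in Ruby than Erlang, the spawn-and-reply pattern above can be approximated with threads and a queue. This is only a rough analogue and not how erlvulnscan is written: Ruby threads are far heavier than Erlang processes, and check_host is a hypothetical stub that does no network I/O.

```ruby
# Hypothetical stub standing in for the real vulnerability check.
def check_host(address)
  sleep 0.01 # pretend to wait on the network
  address.end_with?(".1") ? :vulnerable : :not_vulnerable
end

network = "127.0.0"
results = Queue.new # plays the role of the parent's mailbox

# One worker per host, like one spawned process per address.
workers = (1..254).map do |host|
  Thread.new do
    address = "#{network}.#{host}"
    results << [address, check_host(address)]
  end
end
workers.each(&:join)

# Drain the mailbox into a hash keyed by address.
report = {}
report.store(*results.pop) until results.empty?
```

Because every worker waits at the same time, the total run time stays close to that of a single check instead of 254 of them back to back.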

Implementation

Source code for a complete implementation can be found here:

erlvulnscan on Github

A key design goal is to allow the project to be easily forked and run new types of vulnerability scans, which I am likely to do at some point. A functional implementation can be seen here:

erlvulnscan demonstration

Although the project is scheduled for aesthetic improvements, the backend is now stable.

Design

erlvulnscan is an Erlang OTP application built using Cowboy and Jiffy, and managed by rebar.

Rather than play with Cowboy routes for static data, assets such as the front page are intended to be served using nginx, with a routing rule to forward the API.

Code is intended to be dialyzer and edoc friendly.

The frontend uses ReactJS, although at this stage it is largely a copy of the React tutorial. Starting from this base will, however, allow it to be rapidly improved.

TODO

In the coming weeks, the project should see the following, hopefully in time for a new major issue to scan for. This roadmap exists here both to serve as my own roadmap, and to avoid the inevitable situation where the only contact someone makes about this page is to point me in the direction of something like Grunt.

  • Properly modularise the scanner, so replacement of a single file can facilitate management of a new vulnerability
  • Fully dialyze and tidy up all warnings
  • Frontend aesthetic overhaul and implementation
  • Route changes, entire project should be able to be served on one port!!
  • Frontend code overhaul. Many function names are straight out of ReactJS tutorial. JSX should be converted and minified on the backend.
  • Learn and implement EUnit
  • Implement Elvis
  • Implement Grunt for the frontend
  • Implement and test hot code loading

Fuzzing nginx - Hunting vulnerabilities with afl-fuzz

No 0day here

If you were looking for it, sorry. As of 48 hours of fuzzing, I've got 0 crashes.

AFL - successful fuzzing

American Fuzzy Lop has a very impressive history of finding vulnerabilities. The trophy case is gigantic. An ELI5 of the design of the product is: Give it a program and a valid input file, and it will mess with that input file until using it crashes the example program. My first attempt at using it almost immediately found a crash situation in lci - Lolcode interpreter.

Unfortunately, successful use against something which is not a command line application that runs and quits is more difficult.

Compile and build

Our first step here will be to compile afl. I'm going to assume you can already do this. When building nginx, I used the following commands:

export CC=/path/afl-clang
./configure --prefix=/path/nginxinstall --with-select_module

The use of the prefix is simple - we don't want to install this as root, as a proper service, or run it as such. The select module, I'll get back to. With nginx built and installed, there are some very helpful config options:

master_process off;
daemon off;
events {
    worker_connections  1024;
    use select;
    multi_accept off;
}

By starting your config file like this, nginx will helpfully avoid forking to background, and start itself at a console where it belongs.

Your first server section should look like this:

server {
    listen       <ip>:8020;
    ...
}

We do this because:

  • We want the parser to decide it's happy to run as non-root
  • Without specifying the IP, something doesn't bind properly in our later process.

Operate with stdin/stdout

Following the suggested build gets you halfway there, but the remaining problem is that nginx wants to take input from a network port, not from stdin. Fortunately, this project exists:

Preeny on Github

Preeny almost solves our issues. I say almost because of two things:

  • Preeny intercepts accept(), but, where it exists (my system), nginx uses accept4()
  • nginx's default polling mechanism simply doesn't recognise connections that have been redirected and never triggers the event loop

For the first of these, I wrote this patch. Given accept() and accept4() are equivalent enough for our purposes, this patch just pushes accept4() to the intercepted accept().

Update: @floyd_ch points out this patch is more correct than my original one

diff --git a/src/desock.c b/src/desock.c
index 36b3db7..4b267ef 100644
--- a/src/desock.c
+++ b/src/desock.c
@@ -209,6 +209,11 @@ int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen)
        else return original_accept(sockfd, addr, addrlen);
 }

+int accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags)
+{
+      return accept(sockfd, addr, addrlen);
+}
+
 int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
 {
        if (preeny_socket_threads_to_front[sockfd])

Again, compile as per the Preeny instructions; I won't walk you through this.

Running it

With this in place, you can run nginx from the command line, and have it take HTTP syntax from stdin.

$ LD_PRELOAD="/home/technion/attack/preeny/Linux_x86_64/desock.so "  ./nginx
--- Emulating bind on port 8020
GET / HTTP/1.0

HTTP/1.1 200 OK
Server: nginx/1.8.0
Date: Tue, 28 Apr 2015 09:18:51 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Mon, 27 Apr 2015 08:45:32 GMT
Connection: close
ETag: "553df72c-264"
Accept-Ranges: bytes

<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

This is successful.. almost. The problem you now see is that nginx never actually exits. To get around this, we had to patch nginx itself. Specifically, at line 262, I added this:

    static int first_fd = 0;
    if (first_fd == 0)
            first_fd = max_fd;

    if (max_fd > first_fd) {
            printf("Exiting cleanly\n");
            exit(0);
    }

I'm sure there's a better place to patch, but this seemed to be the easiest for me to find. Specifically, when it knows it's been through the event loop once before and actually accepted a connection already, it'll log as such and exit.

Now, let's get a proper test case up and running. I created testcases/in.txt, based on a standard HTTP connection:

GET / HTTP/1.1
Acceptx: text/html, application/xhtml+xml, */*
Accept-Language:en-AU
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: lolware.net
DNT: 1
Connection: Keep-Alive
Cookie: A=regregergeg

Now let's execute it and see how that looks:

$ LD_PRELOAD="/patch/preeny/Linux_x86_64/desock.so "  ./nginx < testcases/in.txt
--- Emulating bind on port 8020
HTTP/1.1 200 OK
Server: nginx/1.8.0
Date: Tue, 28 Apr 2015 09:43:26 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Mon, 27 Apr 2015 08:45:32 GMT
Connection: keep-alive
ETag: "553df72c-264"
Accept-Ranges: bytes

<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Exiting cleanly
$

That right there is perfect. It takes the input file from stdin, and passes it to nginx, outputs the HTML web content, then quits.
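
Before handing a target to afl-fuzz, it can be worth confirming mechanically that it reads stdin and then exits on its own. The sketch below does that using Open3 and Timeout from Ruby's standard library. The command is a stand-in (a Ruby one-liner that echoes stdin) so the example runs anywhere; run_once is a hypothetical helper, and swapping in the patched nginx is an assumption about your own setup.

```ruby
require "open3"
require "rbconfig"
require "timeout"

# Feed `input` to `command` on stdin and return [output, exited_ok].
# Raises Timeout::Error if the process has not exited within `seconds`,
# which is the symptom a server that never quits would show.
def run_once(command, input, seconds: 5)
  Timeout.timeout(seconds) do
    output, status = Open3.capture2e(*command, stdin_data: input)
    [output, status.success?]
  end
end

request  = "GET / HTTP/1.1\nHost: lolware.net\n\n"
stand_in = [RbConfig.ruby, "-e", "print STDIN.read"] # placeholder target

output, exited_cleanly = run_once(stand_in, request)
```

Open3 also accepts an environment hash as its first argument, which is where a variable such as LD_PRELOAD would go when pointing this at a real target.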

Now all that's necessary is to run it under afl-fuzz:

$ LD_PRELOAD="/home/technion/attack/preeny/Linux_x86_64/desock.so " /home/technion/afl-1.61b/afl-fuzz -i testcases -o findings ./nginx

Now hang on, this'll run for a while.