27 February 2009

FsChecking dnAnalytics Part 2

In part 1, I started out with defining some FsCheck properties on dnAnalytics' Complex class. The properties I showed were fairly straightforward: break the complex number in its real and imaginary part, apply the definition of the operation and compare that to the result of the actual operation under test.

A possible critique on this kind of property is that it partly duplicates the implementation of the operation under test. In my experience, it almost never does completely: there is of course some overlap, but the property is invariably simpler, takes into account less detail, and is easier to understand. The property is mostly unusable for an actual implementation, for example because it is inefficient, is not tail recursive, causes more rounding errors etc.  The only important thing about such a property is that is simple, so that you are sure it is correct (in fact, FsCheck will verify that it is correct...). So this critique is not justified.

But anyway, such definition properties are not the only kind that can be written using FsCheck. Today we'll look at properties that relate different operations (functions, methods) of the same class or module together.

As an example, we'll test that the complex numbers form a field over addition and multiplication, i.e. the field C+,*.

Now, the first thing to realize here is that instances of type Complex do not form an actual field, due to the presence of NaN, Infinity, rounding errors and overflow. So, for the purpose of this property we're about to write, we need to be able to generate Complex instances where these values are not generated.

There is a nice trick to define a custom generator that filters or otherwise manipulates values from another generator but keeps the other generator's type. First define a wrapper union type and define a generator for that:

type NoSpecials = NoSpecials of Complex

type Generators =
    static member NoSpecials() =
        { new Arbitrary<NoSpecials>() with
            override x.Arbitrary = 
                arbitrary 
                |> suchThat (fun (C (r,i) as a) -> 
                    not (Complex.IsInfinity(a)) && 
                    not (Complex.IsNaN(a))      && 
                    r > 1e-16  && r < 1e16      &&
                    i > 1e-16  && i < 1e16  )  
                |> fmapGen NoSpecials
            override x.Shrink (NoSpecials c) = shrink c |> Seq.map NoSpecials
        }

Notice how special values like NaN, and very low and high values are filtered out in the generator using the suchThat generator combinator.

Now, we can write a property with a signature like:

let prop_ComplexField (NoSpecials a,NoSpecials b,NoSpecials c) =

which will generate "non-special" complex values in a, b and c. F#'s pattern matching and FsCheck's type directed value generation come together nicely here. (The alternative for this style is to use an explicit generator using forAll or forAllShrink).

Our property should check that both addition and multiplication are commutative, associative, distributive and have an an identity. So, the following functions will be useful:

let (=~=) (expected:Complex) actual = 
    try TestHelper.TestRelativeError(expected, actual, 2e-10); true with _ -> false

let commutative f (a,b) = f a b =~= f b a 
let associative f (a,b,c) = f (f a b) c =~=  f a (f b c) //e.g. (a+b)+c = a+(b+c)
let rightdistributive f g (a,b,c) =  g (f a b) c =~= f (g a c) (g b c) //e.g. (a+b)c = ac+bc
let leftdistributive f g (a,b,c) = g c (f a b) =~= f (g c a) (g c b) // e.g. c(a+b) = ca+cb
let identity f id a = f a id = a

Armed with these, our property can be written as:

let prop_ComplexField (NoSpecials a,NoSpecials b,NoSpecials c) =
    commutative (+) (a,b)               |@ "+ commutative"      .&.
    commutative (*) (a,b)               |@ "* commutative"      .&.
    associative (+) (a,b,c)             |@ "+ associative"      .&.
    associative (*) (a,b,c)             |@ "* associative"      .&.
    rightdistributive (+) (*) (a,b,c)   |@ "right distributive" .&.
    leftdistributive (+) (*) (a,b,c)    |@ "left distributive"  .&.
    identity (+) Complex.Zero a         |@ "0 is identity of +" .&.
    identity (*) Complex.One a          |@ "1 is identity of *" 
quickCheckN "Complex numbers are a field"  prop_ComplexField  

Which produces:

Complex numbers are a field-Ok, passed 100 tests.

It should be clear that the possibilities for coming up with such properties are almost limitless, and that you'll never have to repeat any part of your implementation to express them.

Conclusion

FsCheck makes it easy to test whether relations between a set of related methods or functions hold. This technique is in my experience sometimes neglected when unit testing, because of the focus on the unit. Nevertheless, such properties can have great value in documentation and testing.

In this post I took a few shortcuts to avoid having to deal with the "specials", and so in that sense this specification could be called misleading because indeed, instances of Complex do not form a field. I did this for two reasons. First, I wanted to show the wrapper type/pattern match technique to define derived generators that generate a subset of the values of an original generator. Second, I wanted to show that FsCheck makes you work  if you want to ignore these special values - in other words, you have to make a very conscious decision to ignore them. In contrast, using unit tests the default is that you forget about these values until it's too late.

A third point I wanted to show is that with a little refactoring and common sense, you can write elegant and very readable properties that could conceivably be part of your documentation - in other words, true executable specifications.

Download fs file

23 February 2009

FsChecking dnAnalytics

I've been thinking lately what I can do to make FsCheck more widely used. Whenever I write "regular" unit tests, I feel like I'm back in the stone ages. It just feels so clumsy and tedious. Why don't more people see this? Is it because there's a learning curve? Surely that's part of it, but the benefit is so huge that this can't be the whole story. I came across this question on stackoverflow, and read a lot of misunderstandings about random testing (luckily the chosen answer was well-informed, and even mentions FsCheck). I could respond to each of those, but I'll keep that to a later post; instead, I'm resolved to convert the world to FsCheck, even if I have to do it one project at a time ;)

Today's candidate: dnAnalytics. dnAnalytics makes a good candidate for FsChecking because  it is in the mathematical field - finding properties for the functionality to satisfy should be straightforward: there's millennia of mathematical knowledge to choose from. Secondly, dnAnalytics already has some tests that I can trash later. Also, dnAnalytics  has an F# interface which I won't be using in this post, but at least the maintainers are familiar with F#, so I have some hope of "converting" them. Finally, dnAnalytics has quite a few downloads (over a 1000), so hopefully this can raise visibility of FsCheck outside the F# community.

Without further ado, let's go find bugs!  (I'll give it away now to keep you interested, as this has become a long post: I found a bug...read on.)

For my first experiment, I choose to test the Complex class, which represents a complex number a+bi, along with some operations. Fairly straightforward stuff that even a mathematically challenged person like myself can follow.

For education and amusement, I'll give an overview of how the tests I wrote revolved over time - errors and imperfections included.

The complex number generator

To test a type, typically the first step to take when using FsCheck is writing a generator for a type. In this case, we'll just be using a generator for a tuple of two floats, and map that to a complex number:

type Generators =
    static member Complex() =
        { new Arbitrary<Complex>() with
            override x.Arbitrary = two arbitrary |> fmapGen ( fun (a,b) -> new Complex(a,b))
            override x.Shrink (C (r,i)) = 
                shrink (r,i)  
                |> Seq.map (fun (r,i) -> new Complex(r,i))
        }
registerGenerators<Generators>()

Pretty easy. The shrink function also exploits the relation between a complex number and a pair of floats. In case you're wondering, I added an active pattern C to deal with the Complex class, it's just:

let (|C|) (c:Complex) = (c.Real, c.Imaginary)

The Absolute of a complex number

I started out with testing the Complex type's Absolute method. It's supposed to return the Absolute value of the Complex instance it's applied to. Here's the property I wrote:

let prop_Absolute (C (r,i) as c) = 
    let lhs = c.Absolute
    let rhs = Math.Sqrt( r*r + i*i)
    sprintf "lhs=%O, rhs=%O" lhs rhs @| (lhs = rhs)

Basically this just checks that the outcome of the Absolute method is equal to the mathematical definition of the absolute value of a complex number. The sprintf and the label operator @| are there to display the intermediate values should the property fail. And failing it does:

Absolute-Falsifiable, after 6 tests (1 shrink):
Label of failing property: lhs=NaN (Niet-een-getal), rhs=NaN (Niet-een-getal)
NaN

Classic mistake: NaN is a special case; NaN is never equal to NaN. That's easily solved:

let prop_Absolute (C (r,i) as c) = 
    let lhs = c.Absolute
    let rhs = Math.Sqrt( r*r + i*i)
    sprintf "lhs=%O, rhs=%O" lhs rhs @|
    (if Complex.IsNaN(c) then Double.IsNaN(lhs) else lhs = rhs)

produces

Absolute-Falsifiable, after 10 tests (2 shrinks):
Label of failing property: lhs=7,00446286306095, rhs=7,00446286306095
7 + 0,25i

Hmm. Instead of looking up how I could see all  of a float's significant digits, I just assumed a rounding error. I explored dnAnalytics existing tests and found just the thing to deal with that: a method to test equality taking into account a relative error. Using that method in the property results in:

let prop_Absolute (C (r,i) as c) = 
    let lhs = c.Absolute
    let rhs = Math.Sqrt( r*r + i*i)
    sprintf "lhs=%O, rhs=%O" lhs rhs @|
    (   if Complex.IsNaN(c) then Double.IsNaN(lhs) 
        else TestHelper.TestRelativeError(lhs, rhs, 2e-16);true)

And yes:

Absolute-Ok, passed 100 tests.

Notice that FsCheck works nicely with NUnit here; suppose we introduce a "bug" by adding 1 to the right hand side:

let prop_Absolute (C (r,i) as c) = 
    let lhs = c.Absolute
    let rhs = Math.Sqrt( r*r + i*i)
    sprintf "lhs=%O, rhs=%O" lhs rhs @|
    (   if Complex.IsNaN(c) then Double.IsNaN(lhs) 
        else TestHelper.TestRelativeError(lhs, rhs+1.0, 2e-16);true)

produces:

Absolute-Falsifiable, after 1 test (0 shrinks):
0 + 0i
with exception:
NUnit.Framework.AssertionException:   Expected: less than 2E-16.0d
  But was:  1.0d

   at NUnit.Framework.Assert.That(Object actual, Constraint constraint, String message, Object[] args)
   at NUnit.Framework.Assert.Less(Double arg1, Double arg2, String message, Object[] args)
   at NUnit.Framework.Assert.Less(Double arg1, Double arg2)
   at dnAnalytics.Tests.TestHelper.TestRelativeError(Double expected, Double approx, Double acceptableError) in c:\Documents and Settings\Kurt\My Documents\dnAnalytics\0.3\src\dnAnalytics.Tests\TestHelper.cs:line 53
   at Complex.prop_Absolute(Complex _arg1) in C:\Documents and Settings\Kurt\MyDocuments\dnAnalytics\0.3\src\dnAnalytics.FsCheck\Complex.fs:line 32
   at FsCheck.Property.evaluate[T,U](FastFunc`2 body, T a) in C:\Documents and Settings\Kurt\My Documents\FsCheck\FsCheck\Property.fs:line 162

But wait! Why aren't our labels displayed? We've found a bug...in FsCheck :) Hold on, I didn't con you earlier, I really did find a bug in dnAnalytics as well.

People can get a bit nervous now because they're not actually seeing what values FsCheck is generating. Let's find out:

let prop_Absolute (C (r,i) as c) = 
    let lhs = c.Absolute
    let rhs = Math.Sqrt( r*r + i*i)
    sprintf "lhs=%O, rhs=%O" lhs rhs @|
    if Complex.IsNaN(c) then Double.IsNaN(lhs) 
    else TestHelper.TestRelativeError(lhs, rhs, 2e-16);true
    |> classify (Complex.IsNaN(c)) "NaN"
    |> classify (Complex.IsInfinity(c)) "Infinity"
    |> classify (c = Complex.Zero) "Zero"
    |> classify (c = Complex.One) "One"

Absolute-Ok, passed 100 tests.
17% Infinity.
8% NaN.
2% Zero.

As you can see, using the classify combinator you can make FsCheck print out the ratio of test cases that fulfill a certain criterion. Here we learn that One is never generated; and infinity quite a bit. This is due to the fact that the built in generator for floats generates these special values with preference. We can change this behavior by changing the generator. Suppose we'd like to generate the value One also:

override x.Arbitrary = 
  frequency   [ (98,two arbitrary |> fmapGen ( fun (a,b) -> new Complex(a,b)))
              ; (2, constant Complex.One) ]

Absolute-Ok, passed 100 tests.
10% Infinity.
9% NaN.
4% Zero.
2% One.
1% Infinity, NaN.

Easy enough. Our generator now indeed generates One as well.

But hold on: we see also that a Complex number can be both Infinity and NaN. That doens't make sense. Let's write a property to check this:

let prop_NaNInfinity (c:Complex) =
    not ( Complex.IsInfinity(c) &&  Complex.IsNaN(c))
checkName  "Both NaN and Infinity" { quick with MaxTest = 1000} prop_NaNInfinity 

Note that I didn't use the usual quickCheckN function to run the tests, because the Absolute test indicated that only one test in a hundred exhibited the behavior. So I made FsCheck run this test a bit more, 1000 times to be exact. Running this sure enough produces:

Both NaN and Infinity-Falsifiable, after 418 tests (0 shrinks):
NaN

and this find was confirmed as a bug by the dnAnalytics team. A small victory for FsCheck.

The conjugate of a complex number

Let's do one more: finding the conjugate.

let prop_Conjugate (C (r,i) as c) =
    let lhs = c.Conjugate
    let rhs = new Complex(r,-i)
    sprintf "lhs=%O, rhs=%O" lhs rhs @|
    if Complex.IsNaN(c) then Complex.IsNaN(lhs) 
    else lhs = rhs

Since the Absolute property, I've become a bit wiser and factored in the possibility of NaN from the start. Running this gives:

Conjugate-Ok, passed 100 tests.

And all is well. Except one thing: our tests like a bit ugly, because we had to duplicate some code.

Red, green, refactor!

We're going to refactor two things.

First, all the classify's we've added to the Absolute property are actually common to every property where we use our Complex generator. These kinds of "tests" are not uncommon when writing a new FsCheck generator - for example, it ensures that our generator does not throw an exception when generating values (which can happen when certain objects are constructed). I've taken the habit of separating these kinds of tests into a single separate property:

let prop_ComplexGen c = 
    ()
    |> classify (Complex.IsNaN(c)) "NaN"
    |> classify (Complex.IsInfinity(c)) "Infinity"
    |> classify (c = Complex.Zero) "Zero"
    |> classify (c = Complex.One) "One"

And we leave these classify's out of the other properties. (Note that a property that returns unit or true is interpreted as succeeded by FsCheck. An exception or false indicates failure.)

Then, we add the following helper method to abstract out the labeling of left and right hand side; cleaning it up in the process:

let compare expected actual prop = 
      sprintf "expected=%O, actual=%O" expected actual @| (prop expected actual)

Now our two properties can be written:

let prop_Absolute (C (r,i) as c) = 
    compare (Math.Sqrt(r*r + i*i)) c.Absolute (fun expected actual ->
        if Complex.IsNaN(c) then Double.IsNaN(actual) 
        else TestHelper.TestRelativeError(expected, actual, 2e-16);true)
let prop_Conjugate (C (r,i) as c) =
    compare (Complex(r,-i)) c.Conjugate (fun expected actual ->
        if Complex.IsNaN(c) then Complex.IsNaN(actual) 
        else expected = actual)

A successful experiment

In my eyes, the FsCheck based tests are hugely superior to the original tests, for the following reasons.

First, we've replaced 2 x 100 hand-written tests with presumably manually calculated values in dnAnalytics with just a few lines of code.  An excerpt from the original tests:

[Test]
public void Absolute()
{
  TestHelper.TestRelativeError(ComplexMath.Absolute(new Complex(0.0, 1.19209289550780998537e-7)), 1.19209289550780998537e-7, 2e-016);
  TestHelper.TestRelativeError(ComplexMath.Absolute(new Complex(0.0, -1.19209289550780998537e-7)), 1.19209289550780998537e-7, 2e-016);
  TestHelper.TestRelativeError(ComplexMath.Absolute(new Complex(0.0, 5.0e-1)), 5.0e-1, 2e-016);
  TestHelper.TestRelativeError(ComplexMath.Absolute(new Complex(0.0, -5.0e-1)), 5.0e-1, 2e-016);

(Note that these are actually tests for a static method on ComplexMath, but Complex.Absolute calls this method directly without further ado. In any case we could easily rewrite our properties to call this method directly as well.)

These must've been a pain to write. Probably someone generated a little script to apply the definition of Absolute in each of these cases. That should be the work of a computer! Using FsCheck, it is.

Second, the original tests do not reveal the intent of the Absolute or Conjugate methods. Basically you just see a bunch of values going in, and the expected values coming out. In a normal program, you would call these "magic numbers" and call the developer that wrote them names. In unit tests, this is commonly tolerated.

FsCheck's specification on the other hand reveals the intent of the tested methods directly - in fact, I just looked up the mathematical definition of these operators to come up with the properties, and this definition is still readily apparent.

Third, FsCheck forced us to make the specification complete, and factor in NaN values. I could not find any test using NaN in the original dnAnalytics tests. This led directly to the discovery of a previously unknown bug.

In conclusion; FsCheck's tests are shorter, clearer and more complete than the original tests.

To boot, I dare say they are faster to write: I downloaded dnAnalytics, explored the code, choose a type to test, wrote the above properties, reported the bug, and typed in the bulk of this blog post in the course of about 4 hours yesterday. I spent another hour or two today cleaning up the post itself.

19 February 2009

Announcing FsCheck 0.5

Another month has gone by, another release of FsCheck is a fact. (No, I won't keep this up). Check out the goodies.

Major:

  • Shrinking, which was introduced in FsCheck 0.4, is now fully customizable per type. This concludes the integration of Neil Mitchell's shrinking code (modulo the extra bugs I introduced, obviously)
  • Combining properties using and, or, and giving a name to subproperties, similar to functionality in a port of QuickCheck to Scala: scalacheck
  • Model based testing for stateful types, another shameless scalacheck rip-off. A surprisingly small extension to FsCheck that allows you to check stateful object types.
  • Thanks to some excellent feedback from Ganesh (from the Crédit Suisse team, which has been generous with contributions; thanks!) property combinators have become more general, in particular any Lazy<property> can now be tested (as opposed to only Lazy<bool> in 0.4).
Minor
  • New property combinators: throws (expect an exception), within (expect a result within some time). The latter is from QuickCheck 2.
  • New generator combinators: listOf, constant, suchThat, suchThatMaybe, again from QuickCheck 2.
  • Pretty printing and shrinking of function values. You guessed it, ported from QuickCheck 2.
  • Added support to TypeClass.fs to define typeclasses of arrays (this needs to be special cased since an array is not an ordinary generic type. Hurray for the consistencies of language design). FsCheck now also knows how to generate and shrink arrays.
  • Added makefile for Mono users, thanks to toyvo. I've tried to keep it up to date, but have not tested it so let me know if you have any problems.
  • Various bug fixes and smaller improvements.Notably the generator for discriminated unions should now produce "bigger" values, and the float generator now generates NaN, Infinity and Epsilon fairly often.

With these changes, I can safely say that FsCheck 0.5 has reached near feature parity with the combination(!) of both QuickCheck 2.1.0.1 and Scalacheck 1.5. Not bad for half a release on a technology preview platform.

I plan to let this stabilize over the next few months, and finally adding those FsChecks to FsCheck itself - FsCheck is fornicating priestware at the moment (*). I have however used FsCheck fairly frequently in private projects and I must say it has helped me a lot. Hopefully I'll have time to blog more about my experiences soon.

As usual, let me know if you have any feedback, always a pleasure to hear from you!

(*) i.e. it does not practice what it preaches