10 May 2008

Parsing Dot with F#: Part 4 - Conclusion

In this very last part, I'll just show how to put everything together: the types, the lexer and the parser. First, we need to generate some F# files from the dotlexer.fsl  and the dotparser.fsy files. This is easily accomplished by running fslex and fsyacc, respectively. This creates dotlexer.fs, dotparser.fs and dotparser.fsi. If, like me, you're working in Visual Studio, you can add these files to your project, but take care that you put everything in the right order or it won't compile (I actually lost quite a bit of time with this): first the definition of the abstract syntax type, then the parser files, then the lexer file. Finally, you can call the parser:

let parseFromString(text) =
    let lexbuf = Lexing.from_string text
    try
        DotParser.start DotLexer.token lexbuf
    with e ->
        let pos = lexbuf.EndPos
        printf "error near line %d, character %d\n%s\n" pos.Line pos.Column (e.ToString())
        Graph (new List<_>())

This is fairly straightforward: the Lexing module defines functions to create a lexing buffer from a string. Feed this buffer into the DotParser.start function generated by fsyacc. This function will return the expected type, as defined in the .fsy file.

That's really all there is to it - the rest is some exception catching and printing an error for debugging purposes. One thing I'd like to warn you about is that debugging this stuff can be quite tedious, due to the fact that first the F# files are generated by the fsyacc and the fslex tools, and afterwards compiled. Although the compiler does a fairly good job of showing where the actual errors are in the fsy and fsl files, it is sometimes hard to track down what exactly is the problem.

Finally, you can download the example project.

Technorati: ,,,

09 May 2008

Parsing Dot with F#: Part 3 - The parser

The parser is produced by fsyacc (yacc stands for Yet Another Compiler Compiler). Basically it turns the tokens produced by the lexer into a nice set of types that you can easily manipulate from within your program. Fsyacc can generate a parser based on a definition that looks pretty much the same as the simplified grammar for Dot that I discussed in part 1.

Our dotparser.fsy file starts off with some preliminaries:

%{

open Ast
open System.Collections.Generic

%}

Same as the lexer, this code will eventually  be transformed into regular F# code. So here we can open some modules for use later, and potentially do other preliminaries.

Then, some parser-related startup:

%start start

%token <string> ID
%token NODE EDGE DIGRAPH GRAPH DIEDGE UNEDGE QUOTE LACC RACC LPAREN RPAREN SEMI COMMA ASSIGN EOF 

%type < Ast.Graph > start

The first and last line indicate to fsyacc where to start parsing (at which rule, to be defined later), and the  type the complete parsing process will construct; if you remember our abstract syntax tree from part 1, we're trying to make a Graph type.

The lines in between are just an enumeration of all the possible terminal tokens (i.e. tokens that needn't be parsed further), along with the type of data that they carry. You should recognize these from the lexer. In our parser, only the ID token carries information of type string.

And finally, the part that makes this all tick:

start: Graph { $1 }

Graph: DIGRAPH ID LACC StmtList SEMI RACC    { ($4:List<_>).Reverse(); Graph $4 }

StmtList: Stmt            { let l = new List<Element>() in l.Add($1); l }
    | StmtList SEMI Stmt  { $1.Add($3); $1 }

Stmt: 
    | ID AttrListOpt             { Node($1,$2) } 
    | ID DIEDGE ID AttrListOpt   { Edge($1,$3,$4) }
    | GRAPH AttrListOpt          { GraphAttributeList($2) }
    | NODE AttrListOpt           { NodeAttributeList($2) }
    | EDGE AttrListOpt           { EdgeAttributeList($2) }

AttrListOpt:
    |                            { new Dictionary<string,string>() }
    | LPAREN AttrList RPAREN     { $2 }

Attr: ID ASSIGN ID               { new KeyValuePair<string,string>($1,$3) }

AttrList: Attr              { let attr = new Dictionary<string,string>() in attr.Add($1.Key,$1.Value); attr }
    | AttrList COMMA Attr   { $1.Add($3.Key, $3.Value); $1 }

Notice that this is very similar to the grammar defined earlier. I guess with some experience one can derive this fsyacc spec from the grammar in a fairly straightforward way. Expect some bumps on the road the first time around though. Notice that with each rule we can again associate the transformation that needs to occur, similar as we did in the lexer. Only this time we're not producing tokens, we're producing the actual F# types that we need. The $n variables in the right part refer to the data that is carried by the nth token on the left.

Some things to be aware of:

  • If you need to parse a list of things, like the StmtList above, you need to do this "backwards", i.e. the second option in StmtList is 'StmtList SEMI Stmt', not 'Stmt SEMI StmtList'. This also means I reverse the list once read in the Graph rule. This is simply a consequence of how the parser produced by fsyacc works under the hood - in this case a minor inconvenience.
  • I chose here to use regular .NET types as much as possible, since I intended to use this code from C#. If you use F# types like list, the code actually becomes a lot shorter. For example, 'let l = new List<Element>() in l.Add($1); l" just becomes '[$1]'.
  • I deliberately reversed the order of the rules Attr and AttrList, so that F# could infer the types of the arguments better. This is a result of the fact that F# parser top to bottom, left to right. If you get a lot of type inference errors, it pays off to experiment with the order of your statements, if you can reverse them. I try to avoid adding type annotations as much as possible, they make the code harder to read and to maintain. (in C# my use of the C# 'var' keyword has increased exponentially)

That's it; next time, I'll show you how to put everything together and call the parser.

02 May 2008

FsCheck: Random testing for F#

[Updated for version 0.2]

[Update: this documentation is no longer kept up to date as of FsCheck 0.3. Please check out FsCheck on GitHub for up to date releases, information and what not.]

FsCheck is an as-literal-as-possible port of Haskell's QuickCheck 0.1. Using FsCheck, you define properties that should hold for all your functions. FsCheck checks that those properties hold by generating random values for the parameters of the properties. While writing the properties, you are actually writing a testable specification of your program.

This release is still a bit rough around the edges; basic things work, but there's for example no easy way to check all properties in a given file. Also parts of FsCheck itself are still untested (ah, the irony). I wrote FsCheck mainly as an exercise  - learn from the best, or suck like the rest, so Haskell seems like a good place to start.

The next part of this post is  a manual for using FsCheck; most of it is almost literally "adapted" from the QuickCheck manual and paper. Any errors and omissions are entirely my responsibility.

What is FsCheck?

FsCheck is a tool for testing F# programs automatically. The programmer provides a specification of the program, in the form of properties which functions should satisfy, and FsCheck then tests that the properties hold in a large number of randomly generated cases. Specifications are expressed in F#, using combinators defined in the FsCheck library. FsCheck provides combinators to define properties, observe the distribution of test data, and define test data generators.

FsCheck is currently best used to test the functions in your F# programs. I may be experimenting in the future with using it to test classes and methods, in combination with existing .NET unit testing frameworks. The generator combinators can be used already in any testing framework to easily generate a number of random values for many types. Mostly these are F# specific at this point (tuples, option values, function values), but that should be easy to remedy.

A Simple Example

A simple example of a property definition is

let prop_RevRev (xs:list<int>) = List.rev (List.rev xs) = xs  |> propl

This property asserts that the reverse of the reverse of a list of integers is the list itself. (the propl function comes with FsCheck and is explained shortly) To check the property, we load this definition in F# interactive and then invoke

> qcheck (Gen.List(Gen.Int)) prop_RevRev;;
Ok, passed 100 tests.
val it : unit = ()

As the example shows, you indicate to FsCheck the type to generate for the property (in this case, a list of ints).

When a property fails, FsCheck displays a counter-example. For example, if we define

let prop_RevId xs = List.rev xs = xs |> propl

then checking it results in

> qcheck (Gen.List(Gen.Int)) prop_RevId;;
Falsifiable, after 8 tests:
["[-2; -4; -1]"]
val it : unit = ()

Using FsCheck

To use FsCheck, you download the latest FsCheck solution at the very end of this post. Build and reference the assembly in any projects containing specifications or test data generators. You can then test properties by loading the module they are defined in into F# interactive, and calling
qcheck <generator> <property_name>;;
or by running and writing a small console application that calls the qcheck function.

What do I do if a test loops or encounters an error?

In this case we know that the property does not hold, but qcheck does not display the counter-example. There is another testing function provided for this situation. Repeat the test using

vcheck <generator> <property_name>;;

which displays each test case before running the test: the last test case displayed is thus the one in which the loop or error arises.

Properties

Properties are expressed as F# function definitions, with names beginning with prop_. Properties can be defined together with their generators, by universal quantification over their parameters using the function forAll, which takes the form

forAll <generator> (fun arg -> <property>)

The first argument of forAll is a test data generator. The second argument is a function that takes as argument one value generated by the test data generator, and returns the property that needs to be checked. The values generated by the test data generator must be of the correct type for the argument, which is checked at compile time. For example,

let prop_RevRev = 
    forAll (Gen.List(Gen.Int)) (fun xs -> 
        List.rev (List.rev xs) = xs |> propl)

will generate a number of int lists for the argument xs, and evaluate the function using the generated value.

The core of a property is an assertion function that returns a bool. Such a function should be made into a Property type by calling the function propl. This allows you to use property combinators explained below.

`Polymorphic' properties, such as the one above, are restricted to a particular type to be used for testing, which is done by specifying a generator. The anonymous function given to forAll can take only one argument: if a property needs more, they should be tupled using Gen.Tuple.

Note: thanks to Haskell's type classes, QuickCheck does not force you to use forAll or needs you to use prop. I do realize it makes the properties in F# harder to read - at the moment I see no elegant ways around this. But in FsCheck you can separate the generators from the properties using qcheck and vcheck. In the examples you can find examples of both styles. For clarity, I'll use the forAll function in the following.

Conditional Properties

Properties may take the form

<condition> ==> <property>

For example,

let rec ordered xs = match xs with
                     | [] -> true
                     | [x] -> true
                     | x::y::ys ->  (x <= y) && ordered (y::ys)
let rec insert x xs = match xs with
                      | [] -> [x]
                      | c::cs -> if x<=c then x::xs else c::(insert x cs)
let prop_Insert = forAll (Gen.Tuple(Gen.Int, Gen.List(Gen.Int))) (fun (x,xs) -> 
                    ordered xs ==> propl (ordered (insert x xs)))

Such a property holds if the property after ==> holds whenever the condition does.

Testing discards test cases which do not satisfy the condition. Test case generation continues until 100 cases which do satisfy the condition have been found, or until an overall limit on the number of test cases is reached (to avoid looping if the condition never holds). In this case a message such as
Arguments exhausted after 97 tests.
indicates that 97 test cases satisfying the condition were found, and that the property held in those 97 cases.

Since F# has eager evaluation by default, the above property does more work than necessary: it evaluates the property at the right of the condition no matter what the outcome of the condition on the left. While only a performance consideration in the above example, this may limit the expressiveness of properties as well - consider:

let prop_CheckLazy2 = forAll (Gen.Int) (fun a -> a <> 0 ==> propl (1/a = 1/a)))

Will throw an exception. However, FsCheck also provides the function prop which makes a lazy property, and so requires the F# keyword lazy to force the lazy evaluation of the condition:

let prop_CheckLazy2 = forAll (Gen.Int) (fun a -> a <> 0 ==> prop (lazy (1/a = 1/a)))

Works as expected. propl is actually a shorthand for prop (lazy ...)), but beware: the argument of propl is evaluated eagerly!

Generators

There are number of generators for some common types, but by supplying a custom generator, instead of using the default generator for that type, it is possible to control the distribution of test data. In the following example, by supplying a custom generator for ordered lists, rather than filtering out test cases which are not ordered, we guarantee that 100 test cases can be generated without reaching the overall limit on test cases:

let prop_Insert = forAll (Gen.Tuple(Gen.Int, orderedList(Gen.Int))) (fun (x,xs) -> 
                    propl (ordered (insert x xs)))

Combinators for defining generators are described below.

Observing Test Case Distribution

It is important to be aware of the distribution of test cases: if test data is not well distributed then conclusions drawn from the test results may be invalid. In particular, the ==> operator can skew the distribution of test data badly, since only test data which satisfies the given condition is used.

FsCheck provides several ways to observe the distribution of test data. Code for making observations is incorporated into the statement of properties, each time the property is actually tested the observation is made, and the collected observations are then summarized when testing is complete.

Counting Trivial Cases

A property may take the form

trivial <condition> <property>

For example,

let prop_Insert = forAll (Gen.Tuple(Gen.Int, Gen.List(Gen.Int))) 
                    (fun (x,xs) -> 
                    ordered xs ==> propl (ordered (insert x xs))
                    |> trivial (List.length xs = 0)) 

Test cases for which the condition is true are classified as trivial, and the proportion of trivial test cases in the total is reported. In this example, testing produces

> quickCheck prop_Insert;;
Ok, passed 100 tests (46% trivial).
val it : unit = ()
Classifying Test Cases

A property may take the form

classify <condition> <string> <property>

For example,

let prop_Insert2 = 
    forAll (Gen.Tuple(Gen.Int, Gen.List(Gen.Int))) (fun (x,xs) -> 
        ordered xs ==> propl (ordered (insert x xs))
        |> classify (ordered (x::xs)) "at-head"
        |> classify (ordered (xs @ [x])) "at-tail")

Test cases satisfying the condition are assigned the classification given, and the distribution of classifications is reported after testing. In this case the result is

> quickCheck prop_Insert2;;
Ok, passed 100 tests.
48% at-tail, at-head.
25% at-head.
21% at-tail.
val it : unit = ()

Note that a test case may fall into more than one classification.

Collecting Data Values

A property may take the form

collect <expression> <property>

For example,

let prop_Insert3 = 
    forAll (Gen.Tuple(Gen.Int, Gen.List(Gen.Int))) (fun (x,xs) -> 
        ordered xs ==> propl (ordered (insert x xs))
        |> collect (List.length xs))

The argument of collect is evaluated in each test case, and the distribution of values is reported. The type of this argument is printed using any_to_string. In the example above, the output is

> quickCheck prop_Insert3;;
Ok, passed 100 tests.
46% 0.
36% 1.
11% 2.
6% 3.
1% 4.
val it : unit = ()
Combining Observations

The observations described here may be combined in any way. All the observations of each test case are combined, and the distribution of these combinations is reported. For example, testing the property

let prop_Insert3 = 
    forAll (Gen.Tuple(Gen.Int, Gen.List(Gen.Int))) (fun (x,xs) -> 
        ordered xs ==> propl (ordered (insert x xs))
        |> classify (ordered (x::xs)) "at-head"
        |> classify (ordered (xs @ [x])) "at-tail"
        |> collect (List.length xs))   

produces

> quickCheck prop_Insert3;;
Ok, passed 100 tests.
46% 0, at-tail, at-head.
18% 1, at-head.
17% 1, at-tail.
6% 2, at-head.
3% 3.
2% 2.
2% 3, at-tail.
2% 2, at-tail.
1% 4.
1% 3, at-head.
1% 2, at-tail, at-head.
1% 1, at-tail, at-head.
val it : unit = ()

 
from which we see that insertion at the beginning or end of a list has not been tested for lists of four elements.

Test Data Generators: The Type Gen

Test data is produced by test data generators. FsCheck defines default generators for some often used types, but you can use your own, and will need to define your own generators for any new types you introduce.

Generators have types of the form 'a Gen; this is a generator for values of type a. The type Gen is a computation expression called gen, so F#'s computation expression syntax can be used to define generators.

Generators are built up on top of the function

val choose : (int * int -> int Gen)

which makes a random choice of a value from an interval, with a uniform distribution. For example, to make a random choice between the elements of a list, use

let chooseFromList xs = gen { let! i = choose (0, List.length xs) 
                              return (List.nth xs i) }

Note that choose does not include the upper bound of the interval, so you can use List.length without deducting 1.

Choosing Between Alternatives

A generator may take the form

oneof <list of generators>

which chooses among the generators in the list with equal probability. For example,

oneof [ gen { return true }; gen { return false } ]

generates a random boolean which is true with probability one half.

We can control the distribution of results using the function

val frequency: (int * 'a Gen) list -> 'a Gen 

instead. Frequency chooses a generator from the list randomly, but weights the probability of choosing each alternative by the factor given. For example,

frequency [ (2, gen { return true }); (1, gen { return false })]

generates true two thirds of the time.

The Size of Test Data

Test data generators have an implicit size parameter; FsCheck begins by generating small test cases, and gradually increases the size as testing progresses. Different test data generators interpret the size parameter in different ways: some ignore it, while the list generator, for example, interprets it as an upper bound on the length of generated lists. You are free to use it as you wish to control your own test data generators.

You can obtain the value of the size parameter using

val sized : ((int -> 'a Gen) -> 'a Gen)

sized g calls g, passing it the current size as a parameter. For example, to generate natural numbers in the range 0 to size, use

sized <| fun s -> choose (0,s)

The purpose of size control is to ensure that test cases are large enough to reveal errors, while remaining small enough to test fast. Sometimes the default size control does not achieve this. For example, towards the end of a test run arbitrary lists may have up to 50 elements, so arbitrary lists of lists may have up to 2500, which is too large for efficient testing. In such cases it can be useful to modify the size parameter explicitly. You can do so using

val resize : (int -> 'a Gen -> 'a Gen)

resize n g invokes generator g with size parameter n. The size parameter should never be negative. For example, to generate a random matrix it might be appropriate to take the square root of the original size:

let matrix gn = sized <| fun s -> resize (s|>float|>sqrt|>int) gn
Generating Recursive Data Types

Generators for recursive data types are easy to express using oneof or frequency to choose between constructors, and F#'s standard computation expression syntax to form a generator for each case. There are also liftGen functions for arity up to 4 (more are easy to add) to lift constructors and functions into the Gen type. For example, if the type of trees is defined by

type Tree = Leaf of int | Branch of Tree * Tree

then a generator for trees might be defined by

let rec unsafeTree = 
    oneof [ liftGen (Leaf) Gen.Int; 
            liftGen2 (fun x y -> Branch (x,y)) unsafeTree unsafeTree]

However, there is always a risk that a recursive generator like this may fail to terminate, or produce very large results. In any case, the F# compiler generates an error for the above function for exactly this reason: the function calls itself. To avoid this, recursive generators should always use the size control mechanism. For example,

let tree =
    let rec tree' s = 
        match s with
          | 0 -> liftGen (Leaf) Gen.Int
          | n when n>0 -> 
            let subtree = tree' (n/2) in
            oneof [ liftGen (Leaf) Gen.Int; 
                    liftGen2 (fun x y -> Branch (x,y)) subtree subtree]
    sized tree'

Note that

  • We guarantee termination by forcing the result to be a leaf when the size is zero.
  • We halve the size at each recursion, so that the size gives an upper bound on the number of nodes in the tree. We are free to interpret the size as we will.
  • The fact that we share the subtree generator between the two branches of a Branch does not, of course, mean that we generate the same tree in each case.
Useful Generator Combinators

If g is a generator for type t, then

  • two g generates a pair of t's,
  • three g generates a triple of t's,
  • four g generates a quadruple of t's,
  • vector n g generates a list of n t's.
  • If xs is a list, then elements xs generates an arbitrary element of xs.
Default Generators

FsCheck uses static members of the type Gen to define default test data generators for some types. FsCheck defines instances for the types (), bool, int, float, pairs, triples, quadruples, lists, and functions. You can define new ones by making new instances of 'a Gen.

To generate random functions of type 'a->'b, you need to provide an instance of type 'a-> 'b Gen -> 'b Gen, which we'll call the cogenerator. The implementation of ('a->'b) Gen uses a cogenerator for type a. If you only want to generate random values of a type, you need only to define a generator for that type, while if you want to generate random functions over the type also, then you should define a cogenerator as well.

A cogenerator function interprets a value of type a as a generator transformer. It should be defined so that different values are interpreted as independent generator transformers. These can be programmed using the function

val variant : (int -> 'a Gen -> 'a Gen)

For different natural numbers i and j, variant i g and variant j g are independent generator transformers. The argument of variant must be non-negative, and, for efficiency, should be small. Cogenerators can be defined by composing together generator transformers constructed with variant.

For example,for the type Tree defined above, a suitable instance of a cogenerator, allowing you to define functions Tree -> 'a, can be defined as follows

let rec cotree t = 
    match t with
       | (Leaf n) -> variant 0 << Co.Int n
       | (Branch (t1,t2)) -> variant 1 << cotree t1 << cotree t2
Properties of Functions

Since FsCheck can generate random function values, it can check properties of functions. For example, we can check associativity of function composition as follows:

let treegen = (Gen.Tuple(tree,three (Gen.Arrow(cotree,tree))))
let prop_Assoc gen = forAll gen (fun (x,(f,g,h)) -> 
    propl ( ((f >> g) >> h) x = (f >> ( g >> h)) x ))

where we use the generator and cogenerator for trees defined above; we thus generate functions Tree -> Tree. If a counter-example is found, function values will be displayed as "<func>".

FsCheck internals

FsCheck follows the Haskell implementation of QuickCheck almost literally. For some explanation, read the original QuickCheck paper.

In the implementation, there are still a few issues that I'm unhappy with:

  • I ported Haskell's random implementation, but I'm still not 100% satisfied with it.
  • decodeFloat is somewhat amateurish. Might even be wrong. I need to test the generator for floats more thoroughly.
  • It would be cool to have more standard generators, e.g. for strings and more .net oriented types, like Func<>, TimeSpan, DateTime and so on.

Download

Finally, download the latest Visual Studio solution from my Skydrive folder: FsCheck 0.2

Older releases:

Enjoy!