Forty Six and Two: January 2009

It's been a month and a half since the last release of FsCheck. Time for an update!

Major changes:

Rewrote the back end using typeclasses. The result: much less reflection, cleaner code, and improved flexibility in the property combinators. In particular, the 'prop' and 'propl' combinators can now be omitted. Properties can take any nunber of arguments, that do not need to be tupled. Generators can now be more easily defined, and function generators are also derived based on type.
Two major new features were contributed generously by Howard Mansell and team at Credit Suisse. Most of the implementation was done by Neil Mitchell, who has also helped me a lot in integrating the changes in the main FsCheck branch. Thank you both!
The features are:
- Automatic generation of records, discriminated unions (including recursive ones), arrays and tuples. So once you define a generator for a type, FsCheck can now automatically generate lists, arrays, record types, functions, discriminated unions and options involving that type.
- Shrinking. Especially with recursive datatypes, counter examples can become quite large. FsCheck now automatically shrinks these examples to a smaller one, making bug finding lots easier.

Minor changes:

Some performance improvements
Improvements in FsCheck's output.
Factored out the function that prints the result of a test, so you can use it in your IRunner implementation if necessary
Because FsCheck uses less reflection, stack traces of exceptions when a test case fails are shorter and clearer

Breaking changes:

quickCheck no longer checks types; instead use quickCheckAll, verboseCheckAll or checkAll. quickCheck is the preferred way to check properties: it is faster, clearer and more flexible.
You'll need to redefine some of your custom generators; the mechanism to define and register generators has changed. See the manual for details, an don't hesitate to ask questions.

This is a bit of an intermediary release - for v0.5 I'd like to integrate shrinking better, so you can define your own shrinkers for a given datatype. Should be straightforward now that I've reworked the back-end, but I wanted to get this release out first - I think shrinking and the reflective datatype generators by themselves are worth a new release. Most of the churn should be over now, I'm finally happy now with the way a user can define new properties and generators.

Enjoy!

Technorati: FsCheck,quickcheck,F#

This post contains a surprisingly short library which allows you to emulate Haskell's typeclasses in F#, except the type-safety. You might think that takes a lot away from the concept, but there is still plenty usefulness left. Not in the least it allows you to use an almost one-to-one translation of Haskell code including typeclasses to F#. At runtime both behave identically. In addition, since this is a library, its functionality can be extended beyond Haskell's typeclasses. Finally, if nothing else, you might be interested in how typeclasses work: this post shows you an implementation.

What is a typeclass, really?

If you don't know what typeclasses are, I recommend reading this chapter in the excellent Real World Haskell book. Let's look at a simple example of a typeclass from that chapter:

class BasicEq a where
isEqual :: a -> a -> Bool

This defines a typeclass BasicEq. All instances (which are defined later) will need to provide a function isEqual, which compares two values of the instance a. An example of such an instance is:

instance BasicEq Bool where
    isEqual True True = True
    isEqual False False = True
    isEqual _     _     = False

This example provides an instance of BasicEq for the type Bool. A lot of people coming from an OO background compare this with an abstract class BasicEq with a subclass Bool, when in fact it is nothing like that. There is absolutely no subtyping of any kind going on. The only relevant type here is Bool; after the instance definition Haskell just knows what isEqual function to call when applied to type Bool. There is absolutely no abstract class. Even more, although Bool is an already defined type, we are adding it as an instance of BasicEq after the fact, something which is not possible in any mainstream statically typed OO language with subtyping.

Typeclasses are a form of so called ad-hoc polymorphism. It has much more in common with operator overloading as it is known in F#, than with subclassing. In fact, the type of isEqual is:

isEqual :: (BasicEq a) => a -> a -> Bool

This indicates that isEqual can be called on any instance a of the typeclass BasicEq. In fact, Haskell only knows the specific function to call when it knows the type a: then, it just looks up the functions associated with that type. So, in fact, the (BasicEq a) => in front of the arguments can (besides as a type annotation) be seen as a dictionary that maps types a to functions implementing the BasicEq typeclass functions for a. The great thing about this is that because of type inference, the actual type a is inferred without any programmer annotation. So Haskell can choose the correct function to call automatically.

So, in the end, what is a typeclass? It's just a dictionary from type to a record of functions, where on each application of a typeclass function Haskell passes along the dictionary to it's calling function implicitly (i.e. without requiring the caller to pass it in). Because of type inference, this sometimes looks like pulling a function or value out of thin air.

The idea

If a typeclass at runtime is nothing more than using type information to pass in an implicit parameter to a function, then in F# we can use respectively reflection and side effects to emulate much the same thing:

Using reflection, we can obtain the type arguments with which a function is called
Using side effects, we can build a globally accessible dictionary of types to functions, that can be accessed "transparently", without needing the caller to thread it in calls.

The rest is some syntactic sugar.

Let's first look at how we can use the library, before diving into it's internals. Usage is remarkably similar to that of typeclasses. First, we define the typeclass:

type IListOf<'a> =
    abstract ListOf : list<'a>

This looks like a traditional class or interface type definition, that represents the functions in the typeclass. This type should have exactly one type parameter, which represents the type that we'll be defining instances for.

Now, let's register this new typeclass with our library:

newTypeClass<IListOf<_>>

newTypeClass takes one generic parameter, of which only the generic type definition part is important; this registers IListOf<'a> as a typeclass with which instances for certain types 'a can be registered.

Now, let's make some instances:

type IListOfInstances() =
    static member Unit() = 
        { new IListOf<unit> with
            override x.ListOf = [()]}
    static member Bool() = 
        { new IListOf<bool> with
            override x.ListOf = [true;false] }
    static member Int() = 
        { new IListOf<int> with
            override x.ListOf = [1..20] }

That defined instances for our typeclass IListOf with three instances for types unit, bool and int. As you can see, the corresponding interface implementations should be returned as the result of static members.

These instances must now be registered with the typeclass:

registerInstances<IListOf<_>,IListOfInstances>()

A function that we need to give two type arguments: the typeclass we're registering instances for, and the the type with the static members which return the instances.

Now we can get a specific IListOf implementation for a specific type:

getInstance (typedefof<IListOf<_>>,typeof<unit>)

However, this returns an obj - not nice. We'd like a typed listOf function that can be used whenever we want a list of some typeclass instance. Let's define such a function:

let listOf<'a> = getInstance (typedefof<IListOf<_>>,typeof<'a>) |> unbox<IListOf<'a>> |> (fun l -> l.ListOf)

The function takes only a type argument - which is used to get the instance, unbox the result (if the implementation of the typeclass library is correct, the unbox should always succeed), and then call the ListOf property which should return the correct list. This function can now be used just like an overloaded function:

printfn "%A" (listOf |> List.map (fun x -> x+5))
printfn "%A" (listOf |> List.map (fun x -> if x then "pos" else "neg"))

What is going on? The first listOf returns a list of ints, while the second returns a list of bools, although it seems we're not passing it any argument or hint as to what list to produce. Magic? Of course not.

We are passing listOf information - in its type argument. The reason we don't have to explicitly mention this argument is because F# infers it for us. So this is a great way to let type inference do some work at runtime.

The implementation

The library maintains an internal dictionary of typeclass names to instances of that typeclass:

let private typeClasses = new Dictionary<_,_>()

Defining a new typeclass prepares the one value of that dictionary to hold another dictionary which will map the actual instance types to their implementation of the typeclass functions:

let newTypeClass<'typeClass> = typeClasses.Add((typeof<'typeClass>).GetGenericTypeDefinition(), new Dictionary<_,_>())

The following method then uses some reflection to find all the instances of a given typeclass, which should be defined as static members on a given class type:

let private findInstances (typeClass:Type) = 
    let addMethods l (t:Type) =
        t.GetMethods((BindingFlags.Static ||| BindingFlags.Public)) |>
        Seq.fold (fun l m ->
            match m.ReturnType with
                | GenericTypeDef typeClass args -> 
                    if (args.Length <> 1) then
                        failwithf "Typeclasses must have exactly one generic parameter. Typeclass %A has %i" typeClass args.Length
                    else
                        let instance = args.[0]
                        if instance.IsGenericType 
                            && (instance.GetGenericArguments() |> Array.for_all (fun t -> t.IsGenericParameter)) then
                            (args.[0].GetGenericTypeDefinition(), m) :: l   
                        else
                            (args.[0], m) :: l
                | _ -> l
            ) l
    addMethods []

///Register instances in a given class as instances of the given type class. 
let registerInstancesByType typeClass instance =
    findInstances typeClass instance |> Seq.iter (typeClasses.[typeClass].Add)

///Register instances in a given class as instances of the given type class.
let registerInstances<'typeClass,'instance>() = 
    registerInstancesByType (typedefof<'typeClass>) (typeof<'instance>)

The final ingredient is the lookup function, to get the implementation of a typeclass given a type:

let getInstance   =
    memoize (fun (typeClass:Type, instance:Type) ->
        let instances = typeClasses.[typeClass]
        let mi =
            match instances.TryGetValue(instance) with
            | (true, res) -> res
            | _ when instance.IsGenericType -> //exact type is not known, try the generic type
                match instances.TryGetValue(instance.GetGenericTypeDefinition()) with
                | (true, mi') -> if mi'.ContainsGenericParameters then (mi'.MakeGenericMethod(instance.GetGenericArguments())) else mi'
                | _ -> failwithf "No instances of class %A for type %A" typeClass instance 
            | _ -> failwithf "No instances of class %A for type %A" typeClass instance 
        mi.Invoke(null, Array.empty))

There is some heavy reflection going on here, which I'll let you figure out in your own time. The memoize function I took from the Expert F# book. Full implementation is downloadable below.

Conclusions

This posts shows how to get all the advantages of typeclasses in F#, except the static checking. I've found these advantages considerable:

The use of reflection is heavy, but it is very nicely encapsulated: all the un-typesafeness goes on strictly inside the library and definition. What goes in and what comes out is strongly typed, and from a user point of view, the reflection is completely behind the scenes.
Although reflection is involved, it is actually fast: there is some manipulation of Types, but this is highly optimized in .NET. There is one reflective invocation per instance that is looked up, but the result is memoized so this can be considered startup cost. Finally, there are two constant time dictionary lookups.
To give a real world example, in FsCheck 0.4 (which will be out real soon) I have been able to use this simple library to translate the Arbitrary and Testable typeclasses from QuickCheck directly into F#, making property combinators more powerful. For one, there is no longer a need for the 'prop' and 'propl' functions, a feat which I was not able to accomplish using reflection alone (and I did try!). So the typeclass abstraction is a nice way to structure your code around overloading.
It has also allowed me to reduce the amount of reflection code in FsCheck and other projects - which is a Good Thing as reflection is always a pain to use.
It's a great way to let type inference work for you, also at run time.

Finally, some things for the future:

This library is not thread safe, although it could be made thread safe at the cost of some locking.
Are multi-parameter type classes possible using this technique?
...

Anyway, I hope that this post has shown that with some imagination and a little work, you can do great things with F#, not in the least thanks to a lot of infrastructure in the form of .NET libraries, without which this would not have been possible.

Download the full implementation. It will also be distributed with FsCheck 0.4 (soon, I promise :).

Technorati: F#,typeclass,haskell

Forty Six and Two

24 January 2009

How to deal with out parameters in F#

14 January 2009

Announcing FsCheck 0.4

13 January 2009

A poor man's typeclass

What is a typeclass, really?

The idea

The implementation

Conclusions

Subscribe

Search This Blog

Labels

Blog Archive

About me...

Quotes