13 January 2009

A poor man's typeclass

This post contains a surprisingly short library which allows you to emulate Haskell's typeclasses in F#, except the type-safety. You might think that takes a lot away from the concept, but there is still plenty usefulness left. Not in the least it allows you to use an almost one-to-one translation of Haskell code including typeclasses to F#. At runtime both behave identically. In addition, since this is a library, its functionality can be extended beyond Haskell's typeclasses. Finally, if nothing else, you might be interested in how typeclasses work: this post shows you an implementation.

What is a typeclass, really?

If you don't know what typeclasses are, I recommend reading this chapter in the excellent Real World Haskell book. Let's look at a simple example of a typeclass from that chapter:

class BasicEq a where
    isEqual :: a -> a -> Bool

This defines a typeclass BasicEq. All instances (which are defined later) will need to provide a function isEqual, which compares two values of the instance a. An example of such an instance is:

instance BasicEq Bool where
    isEqual True  True  = True
    isEqual False False = True
    isEqual _     _     = False

This example provides an instance of BasicEq for the type Bool. A lot of people coming from an OO background compare this with an abstract class BasicEq with a subclass Bool, when in fact it is nothing like that. There is absolutely no subtyping of any kind going on. The only relevant type here is Bool; after the instance definition Haskell just knows what isEqual function to call when applied to type Bool. There is absolutely no abstract class. Even more, although Bool is an already defined type, we are adding it as an instance of BasicEq after the fact, something which is not possible in any mainstream statically typed OO language with subtyping.

Typeclasses are a form of so called ad-hoc polymorphism. It has much more in common with operator overloading as it is known in F#, than with subclassing. In fact, the type of isEqual is:

isEqual :: (BasicEq a) => a -> a -> Bool

This indicates that isEqual can be called on any instance a of the typeclass BasicEq. In fact, Haskell only knows the specific function to call when it knows the type a: then, it just looks up the functions associated with that type. So, in fact, the (BasicEq a) => in front of the arguments can (besides as a type annotation) be seen as a dictionary that maps types a to functions implementing the BasicEq typeclass functions for a. The great thing about this is that because of type inference, the actual type a is inferred without any programmer annotation. So Haskell can choose the correct function to call automatically.

So, in the end, what is a typeclass? It's just a dictionary from type to a record of functions, where on each application of a typeclass function Haskell passes along the dictionary to it's calling function implicitly (i.e. without requiring the caller to pass it in). Because of type inference, this sometimes looks like pulling a function or value out of thin air.

The idea

If a typeclass at runtime is nothing more than using type information to pass in an implicit parameter to a function, then in F# we can use respectively reflection and side effects to emulate much the same thing:

  • Using reflection, we can obtain the type arguments with which a function is called
  • Using side effects, we can build a globally accessible dictionary of types to functions, that can be accessed "transparently", without needing the caller to thread it in calls.

The rest is some syntactic sugar.

Let's first look at how we can use the library, before diving into it's internals. Usage is remarkably similar to that of typeclasses. First, we define the typeclass:

type IListOf<'a> =
    abstract ListOf : list<'a>

This looks like a traditional class or interface type definition, that represents the functions in the typeclass. This type should have exactly one type parameter, which represents the type that we'll be defining instances for.

Now, let's register this new typeclass with our library:

newTypeClass<IListOf<_>>

newTypeClass takes one generic parameter, of which only the generic type definition part is important; this registers IListOf<'a> as a typeclass with which instances for certain types 'a can be registered.

Now, let's make some instances:

type IListOfInstances() =
    static member Unit() = 
        { new IListOf<unit> with
            override x.ListOf = [()]}
    static member Bool() = 
        { new IListOf<bool> with
            override x.ListOf = [true;false] }
    static member Int() = 
        { new IListOf<int> with
            override x.ListOf = [1..20] }

That defined instances for our typeclass IListOf with three instances for types unit, bool and int. As you can see, the corresponding interface implementations should be returned as the result of static members.

These instances must now be registered with the typeclass:

registerInstances<IListOf<_>,IListOfInstances>()

A function that we need to give two type arguments: the typeclass we're registering instances for, and the the type with the static members which return the instances.

Now we can get a specific IListOf implementation for a specific type:

getInstance (typedefof<IListOf<_>>,typeof<unit>)

However, this returns an obj - not nice. We'd like a typed listOf function that can be used whenever we want a list of some typeclass instance. Let's define such a function:

let listOf<'a> = getInstance (typedefof<IListOf<_>>,typeof<'a>) |> unbox<IListOf<'a>> |> (fun l -> l.ListOf)

The function takes only a type argument - which is used to get the instance, unbox the result (if the implementation of the typeclass library is correct, the unbox should always succeed), and then call the ListOf property which should return the correct list. This function can now be used just like an overloaded function:

printfn "%A" (listOf |> List.map (fun x -> x+5))
printfn "%A" (listOf |> List.map (fun x -> if x then "pos" else "neg"))

What is going on? The first listOf returns a list of ints, while the second returns a list of bools, although it seems we're not passing it any argument or hint as to what list to produce. Magic? Of course not.

We are passing listOf information - in its type argument. The reason we don't have to explicitly mention this argument is because F# infers it for us. So this is a great way to let type inference do some work at runtime.

The implementation

The library maintains an internal dictionary of typeclass names to instances of that typeclass:

let private typeClasses = new Dictionary<_,_>()

Defining a new typeclass prepares the one value of that dictionary to hold another dictionary which will map the actual instance types to their implementation of the typeclass functions:

let newTypeClass<'typeClass> = typeClasses.Add((typeof<'typeClass>).GetGenericTypeDefinition(), new Dictionary<_,_>())

The following method then uses some reflection to find all the instances of a given typeclass, which should be defined as static members on a given class type:

let private findInstances (typeClass:Type) = 
    let addMethods l (t:Type) =
        t.GetMethods((BindingFlags.Static ||| BindingFlags.Public)) |>
        Seq.fold (fun l m ->
            match m.ReturnType with
                | GenericTypeDef typeClass args -> 
                    if (args.Length <> 1) then
                        failwithf "Typeclasses must have exactly one generic parameter. Typeclass %A has %i" typeClass args.Length
                    else
                        let instance = args.[0]
                        if instance.IsGenericType 
                            && (instance.GetGenericArguments() |> Array.for_all (fun t -> t.IsGenericParameter)) then
                            (args.[0].GetGenericTypeDefinition(), m) :: l   
                        else
                            (args.[0], m) :: l
                | _ -> l
            ) l
    addMethods []
///Register instances in a given class as instances of the given type class. 
let registerInstancesByType typeClass instance =
    findInstances typeClass instance |> Seq.iter (typeClasses.[typeClass].Add)

///Register instances in a given class as instances of the given type class.
let registerInstances<'typeClass,'instance>() = 
    registerInstancesByType (typedefof<'typeClass>) (typeof<'instance>)

The final ingredient is the lookup function, to get the implementation of a typeclass given a type:

let getInstance   =
    memoize (fun (typeClass:Type, instance:Type) ->
        let instances = typeClasses.[typeClass]
        let mi =
            match instances.TryGetValue(instance) with
            | (true, res) -> res
            | _ when instance.IsGenericType -> //exact type is not known, try the generic type
                match instances.TryGetValue(instance.GetGenericTypeDefinition()) with
                | (true, mi') -> if mi'.ContainsGenericParameters then (mi'.MakeGenericMethod(instance.GetGenericArguments())) else mi'
                | _ -> failwithf "No instances of class %A for type %A" typeClass instance 
            | _ -> failwithf "No instances of class %A for type %A" typeClass instance 
        mi.Invoke(null, Array.empty))

There is some heavy reflection going on here, which I'll let you figure out in your own time. The memoize function I took from the Expert F# book. Full implementation is downloadable below.

Conclusions

This posts shows how to get all the advantages of typeclasses in F#, except the static checking. I've found these advantages considerable:

  • The use of reflection is heavy, but it is very nicely encapsulated: all the un-typesafeness goes on strictly inside the library and definition. What goes in and what comes out is strongly typed, and from a user point of view, the reflection is completely behind the scenes.
  • Although reflection is involved, it is actually fast: there is some manipulation of  Types, but this is highly optimized in .NET. There is one reflective invocation per instance that is looked up, but the result is memoized so this can be considered startup cost. Finally, there are two constant time dictionary lookups.
  • To give a real world example, in FsCheck 0.4 (which will be out real soon) I have been able to use this simple library to translate the Arbitrary and Testable typeclasses from QuickCheck directly into F#, making property combinators more powerful. For one, there is no longer a need for the 'prop' and 'propl' functions, a feat which I was not able to accomplish using reflection alone (and I did try!). So the typeclass abstraction is a nice way to structure your code around overloading.
  • It has also allowed me to reduce the amount of reflection code in FsCheck and other projects - which is a Good Thing as reflection is always a pain to use.
  • It's a great way to let type inference work for you, also at run time.

Finally, some things for the future:

  • This library is not thread safe, although it could be made thread safe at the cost of some locking.
  • Are multi-parameter type classes possible using this technique?
  • ...

Anyway, I hope that this post has shown that with some imagination and a little work, you can do great things with F#, not in the least thanks to a lot of infrastructure in the form of .NET libraries, without which this would not have been possible.

Download the full implementation. It will also be distributed with FsCheck 0.4 (soon, I promise :).

Technorati: ,,

8 comments:

  1. Hi Kurt,

    Once again a very interesting post. Keep looking for the limits of the language.

    P.S. where can i sign the petition for adding type classes to F#?

    ReplyDelete
  2. Hi Felix,

    Thanks. As for the petition: I can't deny I am still hoping for a late Christmas present from the F# team...

    ReplyDelete
  3. Horrible and yet good stuff!

    Horrible that a NEW language drive us to this madness.

    Good stuff, this is a pratical work around. Thanks.

    If the petition where achieve its goals then we could believe that Microsoft has reinvented themselves. Hmmm...

    ReplyDelete
  4. It's possible without reflection, see

    Type Classes in F#

    http://codebetter.com/blogs/matthew.podwysocki/archive/2009/01/20/functional-programming-unit-testing-using-type-classes.aspx

    ReplyDelete
  5. Hi Wirbel,

    Yes, Matthew's post outlines basically the same approach (it even links to this post...). Anyway, the reflection is only used to automate the discovery of type class instances - see the implementation of ApproxEqAssociations.Get in Matthew's post. This code would have to be repeated for each type class because it is impossible to abstract over the IApproxEq part in IApproxEq<'a>. Using reflection, we can get around this (by being statically type unsafe, of course).

    ReplyDelete
  6. I, of course, a newcomer to this blog, but the author does not agree

    ReplyDelete
  7. Is it possible to use this technique with something that is a kind such as Functor or Monad? I'm trying at the moment but it isn't working out.

    ReplyDelete
  8. I imagine it would, but it doesn't solve the problem of how to define the signatures that contain higher kinds. I'm guessing that's where your problem is, not in any issues with typeclasses themselves.

    ReplyDelete