27 June 2009

More fun with data structures, continuations and call/cc

Last time I showed a few ways of implementing insert into an unbalanced binary tree, as an implementation of a set data structure. This post adds an implementation that is similar in functionality to the example that uses call/cc, but using the more familiar exceptions to avoid allocating elements needlessly:

let insertExc x t =
    try
        let rec insertRec x t =
            match t with
            | Empty ->  cont { return Elem(empty,x,empty) }
            | Elem (left,e,right) when x < e -> 
                cont {  let! t' = insertRec x left
                        return Elem(t',e,right) }
            | Elem (left,e,right) when x > e -> 
                cont {  let! t' = insertRec x right
                        return Elem(left,e,t') }
            | Elem _ -> failwith "alreadyExists"
        insertRec x t id
    with e –> t

Performance shootout

A comment on the last post indicated that exceptions should be much slower (on .NET, at least) than the callCC approach. Furthermore, I said that callCC was necessary to avoid re-allocating elements needlessly on the search path to an already existing element. Let’s put these claims to the test with a simple benchmark:

let longList = 
    let r = new System.Random()
    let l = 100000
    [ for i in 1..l -> r.Next(0,l) ]

let insertALot f nbRuns =
    for i in 1..nbRuns do
        longList |> List.fold (fun t elem -> f elem t) empty

How many duplicates are in the long list (i.e. how much can we expect to gain by using callCC/exceptions):

> longList |> Seq.distinct |> Seq.length;;
val it : int = 63362
So about 35% duplicates. The results:
> insertALot insert 100;;
Real: 00:01:38.717, CPU: 00:01:37.516, GC gen0: 2247, gen1: 520, gen2: 8
val it : unit = ()
> insertALot insert' 100;;
Real: 00:01:44.583, CPU: 00:01:37.251, GC gen0: 4050, gen1: 1444, gen2: 18
val it : unit = ()
> insertALot insertM 100;;
Real: 00:02:35.101, CPU: 00:02:29.792, GC gen0: 12698, gen1: 1875, gen2: 24
val it : unit = ()
> insertALot insertCallCC 100;;
Real: 00:01:09.410, CPU: 00:01:02.540, GC gen0: 12149, gen1: 7, gen2: 0
val it : unit = ()
> insertALot insertExc 100;;

- Interrupt

Observe that:

  • the readability of computation expressions comes with a cost;
  • even with that cost, the callCC code is still significantly faster;
  • Exceptions are indeed horribly slow – I interrupted after about an hour.
Share this post : Technet! del.icio.us it! del.iri.ous! digg it! dotnetkicks it! reddit! technorati!

24 June 2009

Fun with data structures, continuations and call/cc

Prompted by exercise 2.3 in Purely Functional Data Structures, by Chris Okasaki.

type Tree<'a> = Empty | Elem of 'a Tree * 'a * 'a Tree

let empty = Empty
Recursive
let rec insert x t =
    match t with
    | Empty ->  Elem(empty,x,empty)
    | Elem (left,e,right) when x < e -> Elem (insert x left,e,right)
    | Elem (left,e,right) when x > e -> Elem (left,e, insert x right)
    | Elem _ –> t
Tail Recursive
let insert' x t =
    let rec cont x t k =
        match t with
        | Empty -> k (Elem(empty,x,empty))
        | Elem (left,e,right) when x < e -> cont x left (fun t' -> k <| Elem (t',e,right))
        | Elem (left,e,right) when x > e -> cont x right (fun t' -> k <| Elem (left,e,t'))
        | Elem _ -> k t
    cont x t id
Using a continuation monad
type Cont<'a,'r> = ('a -> 'r) -> 'r

type ContBuilder() =
    member x.Return (a):Cont<'a,'r> = fun k -> k a
    member x.Bind (c:Cont<'a,'r>, f:'a -> Cont<'b,_>) =
        (fun k -> c (fun a -> f a k))
    member this.Delay(f) = f()

let cont = ContBuilder()
let insertM x t =
    let rec insertRecM x t =
        match t with
        | Empty ->  cont { return Elem(empty,x,empty) }
        | Elem (left,e,right) when x < e -> 
            cont {  let! t' = insertRecM x left
                    return Elem(t',e,right) }
        | Elem (left,e,right) when x > e -> 
            cont {  let! t' = insertRecM x right
                    return Elem(left,e,t') }
        | Elem _ -> cont { return t }
    insertRecM x t id
Using call with current continuation
let callCC (f:('a -> Cont<'b,'r>) -> Cont<'a,'r>) : Cont<'a,'r> = fun k -> f (fun a _ -> k a) k
let insertCallCC x t =
    callCC (fun alreadyExists ->
        let rec insertRecCallCC x t =
            match t with
            | Empty ->  cont { return Elem(empty,x,empty) }
            | Elem (left,e,right) when x < e -> 
                cont {  let! t' = insertRecCallCC x left
                        return Elem(t',e,right) }
            | Elem (left,e,right) when x > e -> 
                cont {  let! t' = insertRecCallCC x right
                        return Elem(left,e,t') }
            | Elem _ -> alreadyExists t
        insertRecCallCC x t
    ) id

Please note that the complexity of the last implementation is not obfuscation – it avoids recreating nodes on the path to an already existing element (exercise 2.3 actually asks to do it using exceptions, which works just as well).

Share this post : Technet! del.icio.us it! del.iri.ous! digg it! dotnetkicks it! reddit! technorati!

13 June 2009

Traversing and transforming F# quotations: A guided tour

There seems to be some lack of information about quotations – most is either short, a specific example, or is a bit outdated. The most comprehensive example is probably Tomáš Petříček’s Quotations Visualizer in the  F# samples. His site is also one of the better resources about quotations that I’ve found. Another one with some recent activity is Alex Pedenko’s blog. Furthermore, even the otherwise excellent Expert F# book has little to say about it, and predates the major changes that were made in the September CTP.

Here I’ll focus on the important general patterns of using quotations. That is, I’m going to show how to traverse and transform quotations, independent of any specific example. I’m going to focus more on the API and programming aspects, than on introductory concepts, so this assumes some intermediate-level knowledge of F#. Basic knowledge about quotations doesn’t hurt either, as I‘ll make explanations terse.

Either way, this post represents some knowledge that took me a while to figure out on my own, piecing together scattered sources.

I would like to point out that once you get over the learning curve, working with quotations is actually enjoyable. I’ve been refraining from using them because I was put off by some bad experiences with System.Reflection (I think System.Reflection truly sucks as an API) and because I felt that to do anything useful, you’d almost be forced to support (i.e. at the very least provide default implementations of) every single construct in F#. It turns out that the quotations API is very enjoyable to work with and that there are built-in means to start small and focus on the things that are interesting to you.

The basics

In short, a quotation is metadata that represents the code of a particular function or code snippet, that gets compiled into the dll (along with the actual executable code, in some cases) by the F# compiler. To those of you who are familiar with LINQ Expression trees, it’s the same concept only F#’s quotations cover the entire reach of the F# language.

There are three ways to obtain a quotation:

1) Use the <@  @> operation around a code snippet to get a typed quotation of type Expr<’a>, where ‘a is the type of the actual value you put into the snippet. The Expr “wrapper” essentially indicates that we’re not dealing with an actual value of type ‘a, but instead with a reified representation of that value – the quotation:

let typedExpr = <@ 1 + 5 @>

let typedExpr2 = <@ fun i -> i + 5 @>

Sending these to FSI gives:

val typedExpr : Quotations.Expr<int> =
  Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
      [Value (1), Value (5)])
val typedExpr2 : Quotations.Expr<(int -> int)> =
  Lambda (i,
        Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
              [i, Value (5)]))

First notice the types: the first is Expr<int> because 1 + 5 of course is an int value. The second example illustrates a function value, of type Expr<int->int>.

FSI has printed the values of the quotations: a representation of the code! In fact, as we’ll soon see, the Expr type is nothing more than a discriminated union with one case for each possible language construct in F# (it’s not actually a discriminated union under the covers, but that’s how it’s abstracted, and the abstraction is very clean. It doesn’t hurt to think about it that way.)

2) Use the <@@  @@> operator around a code snippet to get an untyped quotation, of type Expr. Expr<’a> is a subtype of Expr, and every Expr<’a> can be converted to an Expr by calling .Raw:

let untypedExpr = <@@ 1 + 5 @@>

let untypedExpr2 = <@@ fun i -> i + 5 @@>

Sending to FSI gives:

val untypedExpr : Quotations.Expr =
  Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
      [Value (1), Value (5)])
val untypedExpr2 : Quotations.Expr =
  Lambda (i,
        Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
              [i, Value (5)]))

Pretty much exactly the same information, except for the type. For library writers, the Expr type does all the heavy lifting of inspecting and transforming quotations: since in a library you almost always want to write some code for any quotation, you cannot write this code in a generic way with typed quotations.

3) Using the ReflectedDefinition attribute on a top level function gives you the best of both worlds: the function can be called and executed as normal, but it can also be inspected as a quotation:

[<ReflectedDefinitionAttribute>]
let quotation i = i + 5

Based on a MethodInfo, the Expression.TryGetReflectedDefinition method allows you to get to the Expr associated with the function. The necessary metadata is actually marshaled into the dll when this function is compiled.

The Microsoft.FSharp.Quotations namespace

This namespace in the FSharp core libraries contains three modules and three types. The types are Expr, Expr<’a> and Var. The first two we’ve discussed already, the third one is just the representation of a variable.

The meat of quotations processing is in the module Quotations.Patterns and the static methods on Expr. If you compare these two, you’ll notice that for every active pattern in the Patterns modules, there is a static method on Expr. For example, you have Patters.Call(…) and Expr.Call(…), with equivalent signatures. This is no coincidence: the Patterns module contains all the patterns you’ll need to deconstruct a quotation expression. Together, these patterns can traverse any F# abstract syntax tree. It contains 32 active patterns (that’s a lot, but don’t despair yet: it gets better. ).

Its dual, the static methods on the Expr type, construct quotation expressions programmatically.

If you know F#, it shouldn’t be too hard to figure out what the arguments mean. For example:

Expr.Call : obj:Expr * methodInfo:MethodInfo * arguments:Expr list –> Expr

constructs an expression that represents a method call (defined by a methodInfo) on a target object (i.e. it’s an instance method), taking the given  arguments. Conversely,

( |Call|_| ) : Expr -> (Expr option * MethodInfo * Expr list) option

is an active pattern that matches a method call, with an (optional) target, methodinfo and a list of arguments.

This duality makes it fairly easy to get started with constructing and deconstructing quotations. Try to make some quotations and print them using fsi – it will give you a good idea of what code constructs the patterns represent. Alternatively, use the quotation visualizer mentioned above.

Another module, Quotations.DerivedPatterns, contains additional active patterns that are not needed in the strict sense, but that represent some common use cases for deconstructing a quotation. An example is

( |SpecificCall|_| ) : Expr -> (Expr -> (Type list * Expr list) option)

which is a variant of Call that takes an expression and matches if the quotation is the given method or function call. The beauty of this particular active pattern is that you can use quotations to match on the given quotation:

let recognizePlus quotation =
    match quotation with
    | SpecificCall <@ (+) @> (types,exprs) -> printfn "Plus! types=%A  exprs=%A" types exprs
    | _ -> printfn "something else"
> recognizePlus <@ 1 + 2 @>;;
Plus! types=[System.Int32; System.Int32; System.Int32]  exprs=[Value (1); Value (2)]

Come on, this is still a lot of work, isn’t it?

That last example has one major shortcoming, which will hit you pretty hard when you’re trying to do anything useful with quotations. Even when you’re only interested in one particular kind of node (say, you’re interested in doing something with all (+) calls), you still have to walk the whole expression tree in order to be sure that you’ve seen all the method calls. Consider:

> recognizePlus <@ 1 * (2 + 3) @>;;
something else

Why? Because the tree looks like:

Call (None, Int32 op_Multiply[Int32,Int32,Int32](Int32, Int32),
      [Value (1),
       Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
             [Value (2), Value (3)])])

so we’re calling multiply first – and our function doesn’t do anything with that. In order to find all plus calls, we need to look through the whole tree recursively – including the multiply call’s children. But doesn’t that mean that we’ll have to rewrite our recognizePlus function to incorporate a match on all of the 32 active patterns in Quotations.Patterns (or at least those that return child expressions, which is pretty much all of them) just to be able to traverse the whole expression tree looking for a plus call?

Luckily, no. This is were the third module in the Quotations namespace comes in.

The secret sauce: Quotations.ExprShape

This module is really the workhorse of pretty much anything you’ll do with quotations. Now, I believe this will come as a surprise to most people – in fact, it took me a bit of luck (when I saw it being used in passing, in a post on hubfs) to really understand what this module was all about. It was one of those moments for which a beautiful German word exists: Aha-Erlebnis.

This module’s mystery is in no small part thanks to its“terse” documentation. In fact, it’s so short I’ll reproduce it completely here:

val RebuildShapeCombination : obj * Expr list -> Expr Re-build combination expressions. The first parameter should be an object returned by the ShapeCombination case of the active pattern in this module.

val ( |ShapeVar|ShapeLambda|ShapeCombination| ) :
  Expr -> Choice<Var,(Var * Expr),(obj * Expr list)>

An active pattern that performs a complete decomposition viewing the expression tree as a binding structure.

First, look at the active pattern. The key thing is that this active pattern is complete: every quotation will match with one of the three cases of the active pattern: either it is a variable (ShapeVar), a lambda (ShapeLambda)  or a “combination” (ShapeCombination). So, we now have a general and complete decomposition of any quotation, and it is short:

let workhorse quotation =
    match quotation with
    | ShapeVar v -> ()
    | ShapeLambda (v,expr) -> ()
    | ShapeCombination (o, exprs) -> ()

Notice if you paste this in an .fs file that the compiler does not complain about any missing match cases, reinforcing my point. So, now we can do this recursively, which gives us:

let rec traverse quotation =
    match quotation with
    | ShapeVar v -> ()
    | ShapeLambda (v,expr) -> traverse expr
    | ShapeCombination (o, exprs) -> List.map traverse exprs |> ignore

Here’s the beauty: this simple function traverses all the expressions and sub-expressions in any quotation. You don’t have to deal with all 32 cases in order to do something with one node. This function should be the basis of all your work with traversing quotations: anything you want to do, as far as inspecting the tree goes, can be built on top of this simple structure. Let’s revisit our earlier example:

let rec recognizePlus' quotation =
    match quotation with
    | SpecificCall <@ (+) @> (types,exprs) -> 
        printfn "Plus! types=%A  exprs=%A" types exprs
        List.map recognizePlus' exprs |> ignore
    | ShapeVar v -> ()
    | ShapeLambda (v,expr) -> recognizePlus' expr
    | ShapeCombination (o, exprs) -> List.map recognizePlus' exprs |> ignore

Let’s run this on our failing example of earlier:

> recognizePlus' <@ 1 * (2 + 3) @>;;
Plus! types=[System.Int32; System.Int32; System.Int32]  exprs=[Value (2); Value (3)]
It sure found the plus now.

Transforming quotations

So far, we’ve just been inspecting the quotation. Now let’s actually do something with it. Again, the ExprShape module will be a big win: the workhorse here is the ShapeCombination and RebuildShapeCombination duality.

You might have noticed the o parameter in the ShapeCombination; this is an opaque marker that stores how the sub-expressions in a shape combination can be put back together , i.e. whether it is a function call, a let binding, or anything else that has child expressions. This is were the function RebuildShapeCombination comes into play: given an o marker and the right number and type of child expressions, it can put an expression deconstructed by ShapeCombination back together . So, whereas above we wrote a general traversal function for quotations, a general transformation function looks like:

let rec transform quotation =
    match quotation with
    | ShapeVar v -> Expr.Var v
    | ShapeLambda (v,expr) -> Expr.Lambda (v,transform expr)
    | ShapeCombination (o, exprs) -> RebuildShapeCombination (o,List.map transform exprs)

This function takes any quotation and transforms it into itself, visiting, deconstructing and reconstructing every node in the expression tree along the way.

If you’re familiar with a zipper data structure, this reminds me of it: the active patterns “open the zipper” and the RebuildShapeCombination function “closes the zipper”. (Actually, I’m not sure that the sub-expressions get reused or are made anew, but it seemed familiar anyway.)

So, let’s do some transformation: let’s replace all calls to (+) with calls to (-). (Maniacal laugh and organ playing):

let rec evilTransform quotation =
    let minus l r = <@@ %%l - %%r @@>
    match quotation with
    | SpecificCall <@ (+) @> (types,l::r::[]) -> 
        minus (evilTransform l) (evilTransform r)
    | ShapeVar v -> Expr.Var v
    | ShapeLambda (v,expr) -> Expr.Lambda (v,evilTransform expr)
    | ShapeCombination (o, exprs) -> RebuildShapeCombination (o,List.map evilTransform exprs)

Ignoring the specifics of the actual transformation, which I’ll get to in a moment, notice that the structure of the transformation is exactly the same as above, except now we’ve added one pattern to deal with the case we’re interested in: a call to addition.  Essentially, this is saying: do something special for addition, and just traverse the tree, leaving it unchanged, for all the rest.

Splicing

Now, I know I said earlier that the Expr.Call static method is the dual of the Call active pattern, so you’d expect that I would have used that in the result of the SpecificCall match above. I could have done that, but instead I chose to demonstrate another useful technique for constructing expressions: splicing, denoted by the % or %% operator, for typed and untyped splicing respectively. All this operator does is insert an expression in a certain place (or places) into a quotation. So the minus function constructs a quotation that represents the subtraction of two given expressions (this doesn’t always make sense – in that case we’ll get a runtime exception with untyped quotations). Splicing is a very readable and versatile way of constructing expressions, because you can actually write your expressions in F# instead of using Expr.Call and friends.

The proof of the pudding

In the F# powerpack, there are some useful (but somewhat experimental) functions to actually compile and evaluate any F# quotation. The process is somewhat convoluted: the quotation trees are first translated to .NET Expressions, and then that API is used to compile them. This comes with a significant performance hit: code compiled using that route can be a factor of 5 slower. So don’t use it for optimization. So far, I haven’t run into any limitations yet, and it’s a nice way to check whether everything works:

let before = <@ 1 + 2 * 3 + 5 @>
let after = evilTransform <@ 1 + 2 * 3 + 5 @>
printfn "%A" <| before.EvalUntyped()
printfn "%A" <| after.EvalUntyped()

Shows:

12
-10

val evilTransform : Expr -> Expr
val before : Expr<int> =
  Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
      [Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
             [Value (1),
              Call (None, Int32 op_Multiply[Int32,Int32,Int32](Int32, Int32),
                    [Value (2), Value (3)])]), Value (5)])
val after : Expr =
  Call (None, Int32 op_Subtraction[Int32,Int32,Int32](Int32, Int32),
      [Call (None, Int32 op_Subtraction[Int32,Int32,Int32](Int32, Int32),
             [Value (1),
              Call (None, Int32 op_Multiply[Int32,Int32,Int32](Int32, Int32),
                    [Value (2), Value (3)])]), Value (5)])

Pretty cool, and it opens up all kinds of interesting possibilities. For example, you can write a lazy transformer, which inserts lazy and force in the appropriate places to simulate call-by-need semantics in F#. You can add side-effects to print the result of each call as it is evaluated. Or you can transform the expression in something completely different – SQL, PHP, FPGA instructions, whatever you want.

Conclusions

The three main things you should know about the quotations API:

  1. Quotations.Expr and Quotations.Patterns are each other’s dual: they are used for constructing and deconstructing Expr types respectively;
  2. Quotations.ExprShape contains the workhorse of any quotations-using implementation: a zipper-like pattern for traversing and transforming any quotation in a straightforward way;
  3. Splicing is used to effectively and readably construct parameterized expressions as an alternative to Expr.Whatever calls.

Like I said before, I think the API strikes an excellent balance between simplicity and abstraction. F#’s features come together nicely to provide a clean development experience. I was somewhat afraid of quotations before, because I had an image in my mind that it would be messy, lots of work, and lots of special cases. Turns out it isn’t, and I hope this post will do its part in getting some more attention to this aspect of F#.

Download the example .fs file. Don’t forget to add references to FSharp.PowerPack and FSharp.PowerPack.Linq, if you want to run the dynamic compilation examples.

Technorati:

Share this post : Technet! del.icio.us it! del.iri.ous! digg it! dotnetkicks it! reddit! technorati!