04 January 2012

API design: Record types and backwards compatibility

If you’re designing an API in F#, be very careful when exposing any of the public types as record types. Record types, as they stand in F# 2.0, by default are impossible to change while keeping backwards binary compatibility.

Let’s look at a couple of changes that you might want to make to a record type:

  • Adding a new field.
  • Changing the type of a field.
  • Changing the order of the fields.

None of these changes is backwards compatible. F# compiles a record type to a normal .NET class with a single constructor that takes the fields as arguments, so the order, number and types of the fields all matter. Each field is read through its own getter. So if your clients stick to a very limited usage pattern (basically only reading fields of an existing record through the getters) you may be alright with adding a field or reordering the fields, and even with changing the type of a field if they don’t happen to access that particular field. Anything else, including with syntax, is a no-no.

To make this abundantly clear – a record type definition and usage:

type MyRecord =
    { Field : int
      SecondField : string }

let instance = { Field = 3; SecondField = "3" }

is translated by the F# compiler to roughly the following:

type MyRecordTranslated(field:int,secondfield:string) =
    member this.Field = field
    member this.SecondField = secondfield

let instanceTranslated = new MyRecordTranslated(3,"3")

In the translated case it’s intuitively clear that changing the order of the constructor arguments is not backwards compatible. The record syntax, however, makes the record type look like a bag of fields. Under the hood the compiler looks up the one and only constructor for the type and calls it explicitly. So if you change the constructor in any way, clients that are not recompiled first are going to fail at runtime.
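The compiled shape described above can be checked with reflection. This is only a sketch for poking at the generated class in a script; the binding names (constructors, parameterTypes, propertyNames) are made up for illustration.

```fsharp
type MyRecord =
    { Field : int
      SecondField : string }

// The record compiles to a class with a single public constructor
// whose parameters follow the declaration order of the fields.
let constructors = typeof<MyRecord>.GetConstructors()

let parameterTypes =
    constructors.[0].GetParameters()
    |> Array.map (fun p -> p.ParameterType.Name)

// Each field is exposed as a read-only property (the getter).
let propertyNames =
    typeof<MyRecord>.GetProperties()
    |> Array.map (fun p -> p.Name)

printfn "constructors: %d" constructors.Length
printfn "constructor parameter types: %A" parameterTypes
printfn "properties: %A" propertyNames
```

Swapping the two fields in the definition swaps the constructor parameters as well, which is exactly what an already compiled client does not know about.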

Two solutions (sort of)

The first solution is not to use record types as part of a public API that needs to be backwards compatible – use class types instead.

If record types are still handy, say because they have automatic value-based comparison and equality, then with some planning you can still use them. To your clients they won’t look much like record types anymore, though, because we’re going to prevent clients from accessing the constructor and getters (unfortunately there’s no way to set accessibility on those two separately). There is also a lot of tedious code involved. Here’s the first version of a “record type” that can be kept backwards compatible:

module A_v0 =
    type MyRecord =
        internal { _Field : int } with
        static member Create(field:int) = { _Field = field }
        member this.Field = this._Field
        member this.With(?Field:int) = { this with _Field = defaultArg Field this.Field }  

The most important change is that the constructor is now internal (private does not make much sense, as everything in the assembly of the record type itself should be trivial to update in concert with any change to the record type). To create a record from outside of the assembly there is the factory method ‘Create’. Using F# method call syntax with named arguments, we can still make this look much like a record construction: ‘MyRecord.Create(field=3)’ for example.

Then we need to provide a getter for each field ourselves (because we’ve made them internal...). I’ve chosen to start the actual field names with an underscore above, just so that the explicit getters can use the original names without clashing with the fields.

Finally, using optional arguments, we can regain some sort of with syntax. Here’s an example of usage:

let a = A_v0.MyRecord.Create(2).With(Field = 3).Field
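A slightly longer usage sketch, repeating the A_v0 definition so it can be run on its own in a script. The bindings original, updated and untouched are hypothetical names; the point is that named arguments keep the record feel and that value-based equality survives hiding the representation.

```fsharp
module A_v0 =
    type MyRecord =
        internal { _Field : int } with
        static member Create(field:int) = { _Field = field }
        member this.Field = this._Field
        member this.With(?Field:int) = { this with _Field = defaultArg Field this.Field }

// Named argument: reads almost like a record expression.
let original = A_v0.MyRecord.Create(field = 2)

// Only the fields passed to With change; the original is left alone.
let updated = original.With(Field = 3)

// Omitting every optional argument yields an equal copy.
let untouched = original.With()

printfn "%d %d %b" original.Field updated.Field (original = untouched)
```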

Now, suppose we want to add a new field. We can do that as follows, without breaking any clients:

module A_v1 =
    type MyRecord =
        internal { _Field : int 
                   _Foo   : string } with
        static member Create(field:int) = { _Field = field; _Foo = "default" }
        static member Create(field:int,foo:string) = { _Field = field; _Foo = foo }
        member this.Field = this._Field
        member this.Foo = this._Foo
        member this.With(?Field:int) = { this with _Field = defaultArg Field this.Field }
        member this.With(?Field:int,?Foo:string) = { this with _Field = defaultArg Field this.Field
                                                               _Foo = defaultArg Foo this.Foo }

Note how we can now use overloading of both Create and With to our advantage.

In this way, we prevent clients from using the record constructor directly, but pay the cost of re-implementing most of the useful functionality for record types ourselves. It’s not much less work than basically doing the same with a class type.

Can F# v_next solve this?

The problem is that the syntax is deceiving: it looks like the order of the fields does not matter, both in the definition syntax and in the construction syntax. Also, if you use ‘with’ it looks like the call is resilient to adding a new field. And in fact it is, as long as you recompile, which makes the problem worse in some sense, as a programmer will see her intuitions confirmed with every compilation.
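To see why ‘with’ is no more resilient than plain construction, it helps to write out what a copy-and-update expression amounts to. The sketch below shows the equivalence at the source level; viaWith and spelledOut are illustrative names.

```fsharp
type MyRecord =
    { Field : int
      SecondField : string }

let original = { Field = 3; SecondField = "3" }

// What the programmer writes: looks like it only touches Field.
let viaWith = { original with Field = 4 }

// What it amounts to: a full construction that passes every field,
// so the call site depends on the complete field list and its order.
let spelledOut = { Field = 4; SecondField = original.SecondField }

printfn "%b" (viaWith = spelledOut)
```

After recompiling against a definition with an extra field, the first form picks up the new field silently, which is why the source never hints at the binary dependency.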

Given that the syntax is pretty much set in stone at this point, I don’t see a good way around it. If you want to keep the illusion of order-independence of the fields, one thing to do is make the compiler generate some kind of discovery phase at runtime, which would incur an unreasonable performance cost. Another approach would be to have a Set method per field that returns a new record instance, and make the constructor private. But then setting many fields would involve as many object creations, which is again unreasonable for performance reasons.
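The per-field Set approach can be imitated by hand today, which makes the allocation cost easy to see. This is a hand-written sketch, not something the compiler generates; the module name Sketch and the members SetField and SetSecondField are invented for the example.

```fsharp
module Sketch =
    type MyRecord =
        internal { _Field : int; _SecondField : string } with
        static member Create(field:int, secondField:string) =
            { _Field = field; _SecondField = secondField }
        member this.Field = this._Field
        member this.SecondField = this._SecondField
        // One setter per field; every call allocates a fresh record.
        member this.SetField(value:int) = { this with _Field = value }
        member this.SetSecondField(value:string) = { this with _SecondField = value }

let start = Sketch.MyRecord.Create(1, "one")

// Changing two fields creates two records: the result of SetField is
// an intermediate that is discarded as soon as SetSecondField returns.
let result = start.SetField(2).SetSecondField("two")

printfn "%d %s" result.Field result.SecondField
```

Clients compiled against the setters never touch the constructor, so adding a field later only requires adding one more setter.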

In fact, I don’t think this is such a big problem at all, as 95% of users will probably not encounter it, and for them the abstraction is valid. What v_next could address, though, is a way for the other 5% to control backwards compatibility of record types better.

A first idea would be to allow the declaration of “optional fields”, and compile those as overloaded constructors – say you could write:

type MyRecord =
    { Field : int
      SecondField : string
      ?OptionalField = 5 }

which would be compiled to:

type MyRecordTranslated(field:int,secondfield:string, optionalField:int) =
        new(field:int,secondfield:string) = MyRecordTranslated(field,secondfield, 5)
        member this.Field = field
        member this.SecondField = secondfield
        member this.OptionalField = optionalField

I.e. make an overloaded constructor per optional field, and the F# compiler can enforce that optional fields should always be last in the definition. This shouldn’t be too much of a surprise, as the same restriction holds for optional arguments, and also helps somewhat to counter the wrong intuition that order of record fields does not matter. This would at least allow people to add new optional fields without breaking backwards compatibility.

Another option is to explicitly allow overloaded constructors – similar to the implicit class definition syntax, it could be enforced that all the overloads call into the same constructor. Syntax may be a bit of a pain, but I’m sure something can be worked out.

Finally, it would be good to be able to control the visibility of the constructor and the getters separately. In fact it would be nice for other reasons too to control accessibility of the getter for each field separately anyway. internal/private could be allowed in front of the field definitions to control visibility of the getter, while the visibility in front of the curly brace, as now, would only influence visibility of the constructor.

In retrospect, I think there’s something to be said for having the compiler emit a warning or even error when constructing a record type with the fields in a different order from the record type definition, if only to counter the wrong intuition.

I believe the best option here is the second one, allowing overloaded constructors. The third and fourth (separate visibility for constructor and getters, and the warning on field order) are not backwards compatible as far as the F# compiler is concerned. And although the optional fields are nice, they are probably best left as some syntactic sugar over real overloaded constructors, as the latter are more flexible.

Conclusion

Binary compatibility may not be a big issue for you. It’s certainly not an issue if you don’t expose any programmatic API as part of your F# projects. In that case, live happily ever after.

On the other hand, you may want to think about giving yourself some flexibility in keeping your API backwards compatible. In that case, hopefully this post has given you some tools to come up with an appropriate strategy. Note that if you can reasonably expect that your clients will recompile whenever you release a new version, the whole problem is moot too.

Overall many F# programmers don’t need to consider this at all. However it deserves a bit more attention than it’s getting, and might catch some people unaware (it certainly caught me out at some point...).


9 comments:

  1. Great article, I know I should probably give this some more attention.

    For getters I'd use lenses, they're composable and you get updates for free. In fact, lenses isolate the client from the actual data representation (it could be a record, a class, a tuple or even a dictionary).

  2. This is a great idea, I would love to see optional/default values on record fields.

  3. Nice article.

    You could solve most of these issues by adding a Create method or Default property to your record:

    type MyRecord =
        { Field : int
          SecondField : string } with
        static member Default =
            { Field = 0
              SecondField = "" }

    Clients can then create a new record like:

    let instance =
        { MyRecord.Default with Field = 2 }

    This has similar advantages to using a class as you describe, without some of the drawbacks. But of course, the big problem with this is there's currently no way to force clients to always go through the Default property, as there's no way to stop clients creating instances of your record directly (or at least I don't think there is).

  4. I do think changing F# here is overkill, it is already over-complicated. Note that exposing data representation (records or unions or class fields) in public API is bad practice anyway, with the same problems as you describe. Wherever it does make sense to do that, as in exposing configuration records for some algorithm options, I do like the Default field or the constructor solutions mentioned above.

  5. Robert: no no noooo! :) The approach with Default does not work, because 'with' syntax is compiled to the record constructor again. Try defining your record type in assembly A, and calling it from assembly B like you did, using 'with'. Then change the order of the fields in the definition in A. Now run B without recompiling. You'll get a runtime error.

  6. toyvo: indeed, the F# component design guidelines say "Do hide the representations of record and union types if the design of these types is likely to evolve". However, it does not go into much details as to why exactly, and you do give up some niceties (like 'with' syntax) in the process. Thought it was useful to write that up.

    I also think this reeks of "we don't currently support a way out of this situation so let's call it bad practice". A record type's fields are not necessarily implementation details. If we had something like overloaded constructors, using them e.g. for configuration options and such would make perfect sense. As it stands, the only other option is to re-code using class types, which is tedious and verbose.

    Note that union types suffer somewhat less from this problem: as long as you don't change the existing union cases, you're alright.

  7. Mauricio: nice idea, but you do end up paying a performance penalty. Using 'with' syntax or the With method in my example, you create one new record for each invocation no matter how many fields you set. Using lenses, you'd create n-1 useless intermediate records if you "set" n fields.

  8. @Kurt, interesting discussion.

    Concerning unions: if you want binary compatibility, you raise the bar very high. Adding a union case will break it, the same problem you point out for the "with" construct. Also, when adding a union case you *want* to recompile the code that uses it, to check that pattern matching is complete.

    Overloaded constructors makes me think of overloaded static "Create" methods on the record with a private representation, another viable design in today's F#.

  9. Re unions: adding a union case does not break binary compatibility per se, unless I'm missing something. Each case is compiled to a separate subclass. You can't change an existing case, of course, and you also can't return a new case to a client that doesn't know about it. The point is, you can theoretically keep that under control.

    Agree that binary compatibility is hard - I don't advocate changing the language in such a way that it is guaranteed no matter what you do. But record types are a bit concerning at this point, in the sense that F# in effect makes it impossible to change without forcing clients to recompile. The programmer doesn't have the necessary control. You may be right though that it doesn't warrant any extra complexity in the language.

    Overloaded static Create methods work, but you need a separate mechanism to enforce that clients only create records using the Create methods...
