Another polymorphism blind spot

Yesterday I detailed a simple fragment of code that is untyped in traditional Hindley-Milner type systems (such as that employed by Elm), but which is a perfectly reasonable bit of code. I think of this as a minor irritation and quite easy to live with. I term it a blind-spot of polymorphism, because in general I find that polymorphic Hindley-Milner style type systems map pretty well on to the types of programs that I wish to write. Aside from the issue of meta-programming, but that's quite a separate issue.

Anyway, today there is another blind-spot and this is more than a minor-irritation, although it doesn't come up all that often.

Suppose you're writing a code editor, you can have different kinds of buffer representing different kinds of source code. Each different kind has its own mode. So let's suppose you have an Elm kind of buffer and a Json type of buffer, and those has related states, and also related modes. How do you represent the list of open buffers?

So first we might define an 'Elm buffer' and a 'Json buffer':

1type alias ElmBuffer =
2    { text : String
3    , definedNames : List String
4    }
5
6type alias JsonBuffer =
7    { text : String
8    , indentation : Int
9    }

Then the modes, obviously there would be a lot more

1type alias Mode a  =
2    { format : a -> a
3    , highlight : a -> [( Int, Color) ]
4    }
5type alias ElmMode =
6    Mode ElmBuffer
7
8type alias JsonMode =
9    Mode JsonMode

So we would like to represent the list of open buffers as a list of buffers and their associated modes. But you cannot do this, even though you would like to be able to write the type:

1type Buffers =
2    List (a, Mode a)

Unfortunately if you try this, you will have the problem that Elm does not allow a 'free' type variable. Note that there is a missing type variable between Buffers and =, so you have to write:

1-- type Buffers =
2+ type Buffers a =
3    List (a, Mode a)

But this of course means that all the buffers must be of the same type. So I cannot write a function such:

1formatBuffers : List (a, Mode a) -> List (a, Mode a)
2formatBuffers buffers =
3    let
4        formatBuffer (b, m) =
5            m.format b
6    in
7    List.map formatBuffer buffers

The Hindley-Milner system can type this, but the input must be a list of uniform buffers. Even though we can see that giving a list of heterogenous buffers would be perfectly fine, because we only ever apply a mode's format function to its own buffer.

Now, you can get around this, but it has a significant drawback:

1type Buffer =
2    Elm ElmBuffer
3    Json JsonBuffer

Then your formatBuffer becomes easy enough:

 1formatBuffers : List Buffer -> List Buffer
 2formatBuffers buffers =
 3    let
 4        formatBuffer buffer =
 5            case buffer of
 6                Elm elmBuffer ->
 7                    Elm (ElmMode.format elmBuffer)
 8                Json jsonBuffer ->
 9                    Json (JsonMode.format jsonBuffer)
10    in
11    List.map formatBuffer buffers

Notice here that we just represent the mode with a module, because we know longer have to carry the 'mode' information around with us. And that's the major problem. This system is not extensible. If you release the editor as a program that's perhaps expected, but if you release it as a library even then it's not possible for the library consumer to add their own mode. They would have to fork the library (or contribute back to the original one).

So the blind-spot is that you cannot enclose a parcel of data that is internally self-consistent and prevent some of the type being leaked into the broader value. So more generally you cannot write the following:

1zipApply : List (a, a -> Int) -> List Int
2zipApply pairs =
3    List.map (\(f,a) -> f a) pairs
4
5numbers = zipApply [ (1, identity), (2.0, round) ]

Even though, we know there is nothing wrong with it.

#programming # #elm

last updated: 2022-08-23