Pragmatic Type Systems

What do static type systems provide and what do they cost? It’s tough to know what to make of the dynamic vs. static typing arguments going on right now on sites like Artima without having solid answers to these two questions.

The Benefits

Static type systems have been developed mainly by the academic community, and they focus on soundness and provability. This is understandable: academic computer science is largely a subset of mathematics, and so adopting the rigor of the mathematical community is natural. This rigor leads to the conclusion that the primary goal and benefit of a static type system is that it provides (or *should* provide) proof that your program functions as you expect.

A less extreme version of this view holds that the main benefit of a static type system is that, even if it doesn’t prove your program is correct, it will help you catch errors at compile time rather than run time. The common rejoinder to this argument is that we should all be writing tests anyway, and the tests will catch the errors for us rather than the compiler. That’s true to a point, but it is nice to have the compiler point out exactly where the trouble spots are after you’ve merged in someone else’s changes to your local machine. Running a suite of tests and sifting through the output can be tedious and slow, and compiler errors tend to be clearer and more local to the actual error than test failures. Sure, the errors compilers catch are often trivial, but they find them quickly and easily.

A third, and very different answer is that static type systems make it easier to provide excellent tool support. Things like code-completion, code analysis and refactor tools. Dynamic languages can provide some low-rent versions of these features, but they are either complicated (smalltalk) or poor implementations (ruby IDEs.)

This third benefit is easily the most compelling. IDE’s and other tools help average, fallible humans get code out the door and clean up after themselves on a day-to-day basis. It’s more important that the editor tell me what I can do with a variable than it verify what I’ve done is correct after the fact. The tools provide discoverability and allow developers to leverage up their memory banks, with the tools themselves remembering all the details about types. Once you really start ripping through code with a good IDE like IntelliJ, it’s hard to think about coding without it, and it’s hard to think about how to write an IDE as good as IntelliJ without static typing.

The Costs

The most obvious cost of a static type system is syntactic: you simply have to write more code since you have to declare the types of things. Java is a worst-case example of this: you end up writing types out over and over again on the left and right hand side of assignments.

A second cost of static type systems is conceptual complexity. A good example of this is java’s generics implementation. Josh Bloch made headlines a while ago[1] when he came out against his longtime friend Neal Gafter’s closures proposal. In his talk he cited wildcards as the primary source of complexity in java’s generics. (Note that wildcards were added to preserve type-safety: they aren’t generally of much use to tools.)

Type inference, which helps with the syntactic cost of static typing, can also add to the conceptual complexity if the inference is too aggressive: it can cause incomprehensible, difficult to resolve compilation errors. You can see this in java occasionally with generic method inference. If you’ve ever used an ML-derived language for a non-trivial project, you’ve probably seen this as well.

The GScript Approach: A Pragmatic Type-System

GScript aims to maximize the benefit of static typing while minimizing the costs. Since there is no mathematical formula tying these two opposing goals together, we simply had to go with our guts. (Not that there is no guidance available. Scott McKinney, the creator of GScript, often cites a paper[2] from Microsoft as a source of inspiration when making these tradeoffs. He also follows Anders Hejlsberg’s work on C# 3.0, a very pragmatic language.)

Below are two examples of pragmatic decisions we’ve made with respect to GScript’s type system. Both involve tradeoffs that we feel maximize the benefits of the static type system while minimizing the burden of it.

Pragmatic Type Inference

GScript provides local type-inference:

  var myList = new ArrayList()

You do not need to declare that myList is a List, the GScript compiler will determine that for you. This is especially nice with generics:

  var myMapOfLists = new HashMap<String, List<String>>()

However, unlike some ML-derived languages, GScript does not infer the types of function arguments or return types:

  function joinList( lst : List ) : String {
    var str = new StringBuilder()
    for( elt in lst index i ) {
      if( i > 0 ) str.append( ", " )
      str.append( elt )
    }
    return str.toString()
  }

Note that both the parameter type, List, and return type, String, are annotated.

This is conceptually simpler for both the compiler writer and the GScript coder, since all type information is local to the function. It also allows for a decoupling of types that I will discuss in a later article.

So GScript steers the middle course between no type-inference (java) and total type-inference (ML-like languages.)

Pragmatic Generics

As you can tell from the code above, GScript, like java 1.5 and greater, has a generics implementation, where you can declare a list composed of a certain type:

  var listOfStrings = new ArrayList<String>()

GScript does not have wildcards, however. Instead, we adopted covariance of generic types, so a List<String> is assignable to a List<Object>. This hugely simplifies generics for the end user: generic classes behave just like arrays, which everyone is used to. We do sacrifice some type safety and a bit of expressiveness (see my note on the Microsoft paper below) but we don’t sacrifice anything with respect to tools, and we lop off a huge chunk of complexity.

Static Typing with a Light Touch

With the two features outlined above, GScript manages to have a reasonably type safe static type system that supports tools quite well, while having a relatively small syntactic and conceptual overhead. With this light touch it competes well syntactically and conceptually with dynamic languages such as ruby or python for average, day-to-day development. The type system, for the most part, gets out of the way of the developers except when they need it.

So the costs of a pragmatic static type system are pretty low and the benefits are pretty high, as long as you have access to a language like GScript.

We’re working on that.


[1] – The Closures Controversy – Chapters 12-16
[2] – Static Typing Where Possible, Dynamic Typing When Needed:
The End of the Cold War Between Programming Languages

One interesting note on this paper is that that the authors want two incompatible features: they mention an IComparer interface as their example of generics, but then advocate covariance of generic types. Unfortunately, IComparer is a contravariant type! That is, you can safely assign an IComparer<Object> to a variable of type IComparor<String>, but you can’t safely assign an IComparor<String> to a variable of type IComparor<Object>.

Tradeoffs, tradeoffs, tradeoffs.


2 Comments on “Pragmatic Type Systems”

  1. Ovid says:

    Just a quick comment on “The most obvious cost of a static type system is syntactic: you simply have to write more code since you have to declare the types of things.”

    This isn’t true. You’re referring to something referred to as “type annotations”, “manifest typing” or “explicit typing”. However, many static languages (ML or Haskell, for example) use type inference, thus allowing annotations to be dropped. So you could (pseudocode) say “foo = 3”, and the variable “foo” becomes typed as an integer, because the type inference algorithm knows that “3” is an integer.

  2. Carson Gross says:

    Ovid,

    You are right, of course. I was being sloppy with my language. What I should have said is “The most obvious cost of a static type system in most statically typed langauges is syntactic.” And even that probably isn’t precise enough, maybe I should have said “in java, C++, C, and, before 3.0, C#”.

    GScript is statically typed and does away with a lot of type annotations although it doesn’t go as far as either ML, Haskell, or Scala does so that we can support other features (namely, on-demand lazy compilation of classes where the demand is deferred as far as possible.)

    Cheers,
    Carson


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s