I could not post this here…

so I decided to post it there.

Gosu’s Secret Sauce: The Open Type System

Last week we finally released Gosu, the JVM language we use for configuring our enterprise applications at Guidewire, to the general public. Gosu has been in the works here for several years, beginning at the dawn of our engineering history in 2002. Over the years the language has evolved to meet the growing demands of our applications. Initially we created Gosu because we needed an embeddable, statically typed scripting language for business rules. We needed static typing so that we could 1) statically verify business logic and 2) build deterministic tooling around the language via static analysis, e.g., for code completion, navigation, usage searching, and refactoring. Tooling was, and is now more than ever, a vital part of our technology offering. For instance, we don’t want to burden our users with having to commit our entire API to memory in order to be productive with our platform. But it doesn’t end there.

Our platform, like that of most large-scale enterprise application vendors, consists of a relatively tall stack of technologies, most of which are developed in-house. The stack comprises the following subsystems, in no particular order: Database Entity/OR layer, Web UI framework, Business Rules, Security/Permissions, Web Services, Localization, Naming services, Java API, Query model, XSD/XML object models, various application-dependent models of data, and Gosu. A primary challenge with any enterprise application is to adapt to customer requirements through configuration of its subsystems. As I’ve stated, Gosu is the language we use to accomplish that, but how does, say, the Gosu code in the Web UI or Rules access the Entity layer and XSD-modeled information? In other words, if a primary goal of Gosu’s static type system is to help users confidently and quickly configure our applications, how can it possibly help with domains such as these, which are external to the type system?

This is where Gosu separates from the pack. Unlike that of most (all?) other programming languages, Gosu’s type system is not limited to a single meta-type or type domain. For instance, Java provides one, and only one, meta-type, namely java.lang.Class. If you want to directly represent some other type domain to Java, you’re out of luck. Your only options are to write a class library to access information about the domain or resort to code generation. Neither option is desirable or even adequate in many cases, but as Java programmers we don’t know any better. Dynamic languages such as Ruby provide varying degrees of meta-programming, which can be a powerful way of dynamically representing other type domains. But with dynamic languages we give up type safety, deterministic tooling, and other advantages of static typing, which defeats our primary goal in exposing other type domains to the language. Our customers, for instance, would be at a terrible disadvantage without the ability to statically verify usages of various domains in their application code. Thus, Gosu’s type system must provide a level of flexibility similar to (or better than) that of dynamic meta-programming, but without sacrificing static typing, which is a tall order. How did we do it?

Gosu’s type system consists of a configurable number of type domains. Each domain provides a type loader as an instance of Gosu’s ITypeLoader interface. A type loader’s primary responsibility is to resolve a type name in its domain and return an implementation of the IType interface. This is what sets Gosu apart: its type system is open, so other domains can participate with first-class representation. For instance, Gosu does not discriminate between a Gosu class, a Java class, an entity type, an XSD type, or what have you; they’re all just types to Gosu’s compiler. This unique and powerful feature extends all the benefits of static typing to a potentially huge range of otherwise inaccessible domains.
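To make the shape of this concrete, here is a minimal sketch in Java of a type system that knows only about loaders, not about domains. It is not Gosu's actual API: SketchType, SketchTypeLoader, TableBackedLoader, and SketchTypeSystem are hypothetical names standing in for the roles described above, and the rule that the first loader to answer wins is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the role the text gives IType: what the compiler sees.
interface SketchType {
    String getName();
    List<String> getPropertyNames();
}

// Hypothetical stand-in for the role the text gives ITypeLoader:
// resolve a name within one domain, or report that the name is not ours.
interface SketchTypeLoader {
    Optional<SketchType> resolve(String fullyQualifiedName);
}

// One simple SketchType implementation shared by every example domain.
final class SimpleType implements SketchType {
    private final String name;
    private final List<String> propertyNames;

    SimpleType(String name, String... propertyNames) {
        this.name = name;
        this.propertyNames = Arrays.asList(propertyNames);
    }

    @Override public String getName() { return name; }
    @Override public List<String> getPropertyNames() { return propertyNames; }
}

// A domain whose types come from a fixed table; a real loader would read
// schemas, database metadata, class files, and so on.
final class TableBackedLoader implements SketchTypeLoader {
    private final Map<String, SketchType> types = new HashMap<>();

    TableBackedLoader add(SketchType type) {
        types.put(type.getName(), type);
        return this;
    }

    @Override public Optional<SketchType> resolve(String fullyQualifiedName) {
        return Optional.ofNullable(types.get(fullyQualifiedName));
    }
}

// The "open" part: the type system holds a list of loaders and nothing else.
final class SketchTypeSystem {
    private final List<SketchTypeLoader> loaders = new ArrayList<>();

    void register(SketchTypeLoader loader) { loaders.add(loader); }

    // Ask each domain in registration order; the first one to answer wins
    // (an assumption of this sketch).
    Optional<SketchType> getByName(String fullyQualifiedName) {
        for (SketchTypeLoader loader : loaders) {
            Optional<SketchType> type = loader.resolve(fullyQualifiedName);
            if (type.isPresent()) {
                return type;
            }
        }
        return Optional.empty();
    }
}
```

A caller registers one loader per domain and then asks for types by name; nothing in getByName() cares which domain produced the answer, which is the property the paragraph above describes.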

Before we consider a specific domain, it is important to understand that not all type loaders in Gosu are required to produce bytecode. Types that are backed by bytecode, like Gosu classes, implement an interface for getting at the corresponding Java bytecode. Types that aren’t instead provide call handlers and property accessors for reflective MethodInfo and PropertyInfo evaluation. Note that all types provide TypeInfo; see IType.getTypeInfo(). The parser, for instance, works against the TypeInfo, MethodInfo, etc. abstractions, which put disparate types on a level playing field. At runtime, however, unless a type provides a Java bytecode class, its MethodInfos and PropertyInfos are also responsible for handling calls. Not requiring bytecode at runtime accommodates a potentially much wider range of type loaders and makes it possible to quickly prototype loaders.
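As a rough illustration of the bytecode-free case, the following Java sketch models a type whose instances are plain maps, so no class is ever generated for it. The names SketchPropertyInfo, SketchTypeInfo, and MapBackedTypes are hypothetical and far simpler than Gosu's real TypeInfo and PropertyInfo abstractions; the point is only that the property description itself carries the accessor that does the work at runtime.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the PropertyInfo role: a name for the parser to
// check against, plus an accessor that handles the read at runtime.
final class SketchPropertyInfo {
    private final String name;
    private final Function<Object, Object> accessor;

    SketchPropertyInfo(String name, Function<Object, Object> accessor) {
        this.name = name;
        this.accessor = accessor;
    }

    String getName() { return name; }

    Object getValue(Object target) { return accessor.apply(target); }
}

// Hypothetical stand-in for the TypeInfo role: the set of known features.
final class SketchTypeInfo {
    private final Map<String, SketchPropertyInfo> properties = new HashMap<>();

    SketchTypeInfo add(SketchPropertyInfo property) {
        properties.put(property.getName(), property);
        return this;
    }

    // Unknown names are rejected here, which is where a compiler would
    // report an error instead of throwing.
    SketchPropertyInfo getProperty(String name) {
        SketchPropertyInfo property = properties.get(name);
        if (property == null) {
            throw new IllegalArgumentException("No such property: " + name);
        }
        return property;
    }
}

final class MapBackedTypes {
    private MapBackedTypes() {}

    // Builds type information for a type whose instances are java.util.Map
    // objects; each property accessor simply looks its own name up in the map.
    static SketchTypeInfo typeInfoFor(String... propertyNames) {
        SketchTypeInfo info = new SketchTypeInfo();
        for (String propertyName : propertyNames) {
            info.add(new SketchPropertyInfo(
                    propertyName,
                    target -> ((Map<?, ?>) target).get(propertyName)));
        }
        return info;
    }
}
```

With this in place, MapBackedTypes.typeInfoFor("Name", "State").getProperty("State").getValue(row) reads the State entry of a map called row, while a request for an undeclared property fails at lookup time.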

With that understanding, we can jump to some specific examples. Let’s take a look at the Gosu Class type loader. Its job is to resolve names that correspond to the domain of Gosu classes, interfaces, enumerations, enhancements, and templates (primarily the .gs* family of files). The result is an instance of GosuClass, which implements IType, etc. One of its methods is getBackingClass(). Essentially, this method cooperates with Gosu’s internal Java classloader to compile the parse tree from the Gosu class to a conventional Java class. So it follows that Gosu’s compiler performs dynamic compilation directly from source; there is no separate build process and there are no persisted classfiles. This does not imply that the Java class loader is transient, however. It is not. Since Gosu is a statically typed language, its classes are compiled to conventional bytecode and thus don’t require any classloader shenanigans, as most dynamically typed languages do. But I digress.

What connects the Java bytecode class to its Gosu type is the IGosuObject implementation; all Gosu classes and most other types implicitly implement this. The getIntrinsicType() method is instrumental, for example, in making reified generics work in Gosu.  An instance of a Gosu class, or any type in the entire type system for that matter, can produce its underlying type via getIntrinsicType().  And from there Gosu’s type system takes over. For instance, you can reflectively access Gosu-level type information from Java code via TypeSystem.getFromObject( anyObject ).
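The idea of an object that can report its own full type, including type arguments, can be sketched in plain Java. This is only an analogy and says nothing about how Gosu implements it: HasIntrinsicType and TypedList are hypothetical names, and the element type is passed in explicitly as a Class token, which Gosu does not ask you to do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical analogue of an object that knows its own type. Here the type
// is reported as a display name rather than as a type-system object.
interface HasIntrinsicType {
    String getIntrinsicTypeName();
}

// A list that remembers its element type at runtime. java.util.ArrayList
// cannot do this, because Java erases type arguments after compilation.
final class TypedList<T> implements HasIntrinsicType {
    private final Class<T> elementType;
    private final List<T> items = new ArrayList<>();

    TypedList(Class<T> elementType) {
        this.elementType = elementType;
    }

    void add(T item) {
        // cast() re-checks the element type at runtime.
        items.add(elementType.cast(item));
    }

    int size() { return items.size(); }

    @Override public String getIntrinsicTypeName() {
        return "TypedList<" + elementType.getSimpleName() + ">";
    }
}
```

A TypedList created with String.class reports TypedList<String> and one created with Integer.class reports TypedList<Integer>, whereas an ArrayList<String> and an ArrayList<Integer> share the same runtime class and cannot be told apart.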

It’s worth repeating that Gosu classes are no more important to Gosu’s type system than any other type.  For instance, XSD types resolve and load in the same manner as Gosu classes.  Given an XSD type loader (we have one internally), Gosu code can import and reference any element type defined in an XSD file by name.  No code generation, no dynamic meta-programming, and all with static name and feature resolution.  The XSD type defines constructors, properties, and methods corresponding with elements from the schema.  In addition it provides static methods for parsing XML text, files, readers, etc.  The result is an instance of the statically named XSD type corresponding with the element name, which can be used anywhere a Gosu type can be used, because it *is* a Gosu type!  Here’s a very simple example that demonstrates the use of an XSD type:

From a sample driver.xsd file:

<xs:element name="DriverInfo">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="DriversLicense" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="PurposeUse" type="String" minOccurs="0"/>
      <xs:element name="PermissionInd" type="String" minOccurs="0"/>
      <xs:element name="OperatorAtFaultInd" type="String" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID" use="optional"/>
  </xs:complexType>
</xs:element>
<xs:element name="DriversLicense">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="DriversLicenseNumber" type="String"/>
      <xs:element name="StateProv" type="String" minOccurs="0"/>
      <xs:element name="CountryCd" type="String" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID" use="optional"/>
  </xs:complexType>
</xs:element>

Sample Gosu code:

uses xsd.driver.DriverInfo
uses xsd.driver.DriversLicense
uses java.util.ArrayList

// Builds a DriverInfo element with a single DriversLicense child
function makeSampleDriver() : DriverInfo {
  var driver = new DriverInfo(){:PurposeUse = "Truck"}
  driver.DriversLicenses = new ArrayList<DriversLicense>()
  driver.DriversLicenses.add(new DriversLicense(){:CountryCd = "US", :StateProv = "AL"})
  return driver
}

Notice the fully qualified names of the XSD types begin with xsd. That’s just the way the XSD loader exposes the types; a type loader can name them any way it likes, but it should do its best to avoid colliding with other namespaces by using relatively unique names. Next, you may have noticed that the name of the XSD file, driver, serves as the namespace containing all the defined element types. That’s also a convention the loader chooses to define. What’s most interesting, however, is how seamlessly the XSD types integrate with Gosu. You can’t distinguish them from Gosu classes or Java classes, and that’s the whole idea! All type domains are created equal. Notice you can even parameterize Java’s ArrayList with the DriversLicense XSD type. Isn’t that sweet? And of course all the powerful static analysis tooling applies, including code completion, feature usage, refactoring, etc.
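The naming convention described above is easy to state as code. The helper below is a hypothetical Java sketch, not taken from the XSD loader itself; XsdTypeNames and qualifiedName() are invented names, and the sketch covers only the simple case of a schema file name and an element name.

```java
// Hypothetical helper showing the convention: xsd.<schema file name>.<element name>.
final class XsdTypeNames {
    private static final String ROOT_NAMESPACE = "xsd";
    private static final String SCHEMA_SUFFIX = ".xsd";

    private XsdTypeNames() {}

    // Derives a fully qualified type name from a schema file name and an
    // element name, dropping the .xsd extension when present.
    static String qualifiedName(String schemaFileName, String elementName) {
        String namespace = schemaFileName.endsWith(SCHEMA_SUFFIX)
                ? schemaFileName.substring(0, schemaFileName.length() - SCHEMA_SUFFIX.length())
                : schemaFileName;
        return ROOT_NAMESPACE + "." + namespace + "." + elementName;
    }
}
```

Under this convention, qualifiedName("driver.xsd", "DriverInfo") yields xsd.driver.DriverInfo, the name used in the uses statement of the sample Gosu code.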

Another interesting tidbit: Type domains aren’t limited to tangible resources like files on disk. For instance, one evening I decided to see what it would take to provide a dynamic type in Gosu, similar to C#’s dynamic type. In theory, I thought, the type system could handle a dynamic type as just another type domain. And wouldn’t you know, it could. Well, in the spirit of full disclosure I did have to tweak the type system internals a bit to better handle the “placeholder” concept, but just a little. Aaanyway, unlike most type domains this one consists of just a single type name, namely dynamic.Dynamic. Note Gosu requires that a type live in a namespace, so we can’t name it just Dynamic. It was surprisingly simple to implement the type loader; you can download the source from our website here (http://gosu-lang.org/examples.shtml). In a future blog post I’ll cover the implementation of the Dynamic type loader (or some other one) step by step.

Another super cool type loader is the one from the RoninDB project by Gus Prevas (http://code.google.com/p/ronin-gosu/). Here Gus provides a zero-configuration JDBC O/R layer via a Gosu type loader. The tables are types, the columns are properties, and there is no code generation; it just works. Pretty cool, eh? Download it and give it a spin.

So far I’ve only touched on the basic concepts underlying Gosu’s open type system. But understanding the core idea is critically important to appreciating Gosu’s full potential, all the more so if you plan to use Gosu in your personal or professional software projects. Once you see the light with respect to type domains and the benefits they provide, it’s hard to think of developing enterprise applications without them. To that end I promise to blog more and shed more light on building type loaders and the type system in general.

Why Gosu?

The most common question we’ve gotten following the release of Gosu as a programming language is pretty simple: Why? Why did you create your own language? Why not use an existing language like Scala, Groovy, Javascript, Clojure, Ruby, Python, C#, or basically anything else at all? Why does the world need yet-another programming language, especially one that doesn’t seem to have any ground-breaking features? Why should anyone care?

There are kind of two sides to the answer. The first is the Guidewire-specific, historical part of the story: we needed a language with particular characteristics to use for configuring our applications (statically typed, dynamically compilable, some metaprogramming capabilities, fairly simple syntax and easy learning-curve for people familiar with Java or other similar imperative languages), and at the time that we started working on Gosu (back in 2002) there wasn’t really anything close to what we needed. We ended up creating it almost accidentally out of necessity.

But of course, that reasoning applies only to us, at Guidewire, and why we need something that we couldn’t find off-the-shelf. Why should you, if you’re someone outside Guidewire, care about Gosu?

As we worked on Gosu, we started to realize that we could actually turn it into a language that we, the language authors, liked, and that there’s currently a vacuum in the programming language landscape for the sort of thing that we were creating: a fairly “simple” (I realize that’s a massively loaded term, but bear with me) language that’s statically typed, dynamically compilable, with some metaprogramming capabilities, syntactic sugar in the most needed places, and with language features like closures, type inference, and enhancements that address some of the most glaring deficiencies and pain points in Java. We want to build something that, at least eventually, is unequivocally better than Java: a language that retains all the strengths of Java but has strictly fewer weaknesses.

Why target Java as the baseline? Because Java is, these days, essentially the lowest common denominator language for a lot of people, especially within the business community. It’s also the first language a lot of people learn in school these days. I have plenty of issues with Java myself, but it does a lot of things right as well, in my opinion: it’s statically typed, that static typing enables a huge array of tools for working with the Java language (e.g. excellent IDEs with refactoring support and error highlighting as you go), it can perform basically as fast as C or C++ for long-enough-running processes, it has garbage collection and a reasonable object-passing model (as compared to, say, worrying about pass-by-reference versus pass-by-value semantics) and a reasonable scoping model, the syntax is familiar enough to users of most other imperative languages that it’s not too much of a shock to transition from C or C++ or similar, and the lack of “powerful” language features also means that most people’s Java code looks pretty similar, so once you learn to read Java and understand the idioms you’re rarely thrown for a loop (with the glaring exception of generics).

Basically, Java manages to be a lowest common denominator language at this point that, while its lack of language features is really annoying, largely manages to avoid any absolute deal-breakers like poor performance, a lack of tools, or an inability to find (or train) people to program in it.

Now, that’s all speculation/opinion on my part as to why Java is where it is currently. You, the reader, are free to disagree with it. If you do agree in large part with that, though, it becomes clear what the imperative is for the next lowest-common-denominator language: it can’t screw up any of the things that Java got right, but it needs to improve on all the places where Java is weak.

So what are the deal-breakers to be avoided? The first is dynamic typing, in my opinion. Saying that is essentially flame-bait, I know, so I’m not trying to say that static typing is strictly better than dynamic typing, but merely that static typing enables certain things that people actually like about Java. Static typing enables all kinds of static verification, of course, which a lot of people find useful, especially on larger projects. When moving people between projects and forcing them to get up to speed on a new code base, for example, static typing can help make it obvious if you’re using a library even remotely correctly, or if your changes are going to clearly break someone else’s code. More important at this point, I think, is that static analysis enables amazing tooling: automatic error-detection as you type, auto-completion that actually lets you usefully understand and explore the code you’re working with (instead of, say, suggesting all functions anywhere with that name), automated refactoring, the ability to quickly navigate through the code base (for example by quickly going to the right definition of the “getName()” function you’re looking at, rather than one of the other 50 functions with that name somewhere in your code base).

Static typing also tends to be an important factor in execution speed, though dynamic languages are catching up there; you might argue that execution speed doesn’t matter anymore, but I’d argue that there’s always the possibility that it might matter in the future even if it doesn’t matter now, so at least in the business world people doing core systems work are often scared away by anything that they think has a performance disadvantage that might, at some point in the future, require them to purchase a bunch more hardware. You might disagree and attribute it to a failure of imagination on my part, but I find it hard to imagine the next LCD language being dynamically typed.

The second deal-breaker is what I’ll call “unfamiliarity.” There are lots and lots of people out there who know Java, or C, or C++, or C#, or even Visual Basic, and they’re so used to the standard imperative syntax of those languages that something too foreign to that simply isn’t going to fly. It doesn’t matter how good the language is, or how expressive it is; something that doesn’t fit roughly into that mold simply won’t become the next LCD language, at least not any time soon.

The last deal-breaker is what I’ll call “complexity,” another obvious flame-bait term. Everyone’s got a different definition of the term, but here I’m equating it to roughly two related things: first of all, how hard is it to fully learn and understand all features of a language, and secondly, for any two programmers A and B, how different is their code likely to look, and how easy is it for them to write code that the other one doesn’t understand. Again, I’m not trying to start a flame war, and opinions seem to vary greatly on the relative complexity of languages, so hopefully we can all at least agree that if a language includes monads, a concept which historically many people have struggled to understand, that ups the complexity bar a fair bit, while a language like Ruby that doesn’t have them is probably a bit “simpler.” Likewise, languages like C and C++ with explicit pointers and memory management are more complex than languages that abstract out those details. Languages like Python where there’s one standard way to do things are also less “complex” by this metric than languages like Perl, where there are multiple ways to do everything, not so much because the individual features of Perl are complex but simply because it’s generally easier for Python Programmer A to read Python Programmer B’s code than it is for Perl Programmer A to read Perl Programmer B’s code. (Again, that doesn’t mean you can’t write readable, awesome code in Perl, I’m just talking about the presumed statistical average difference between two people’s coding styles.)

So to sum that all up: my theory (and I think I speak for the other Gosu designers as well) is that the next LCD language will be statically typed, imperative with a familiar syntax, and will avoid, shall we say, “more advanced” language features.

So what can we add to Java to make it better? Well to start with we can add simple type inference with simple rules to make the language less verbose and make static typing impose less of a tax on your code. We can add in simple closures to make data structure manipulation or basic FP coding possible. We can add in enhancement methods so that people can improve APIs that they have to use without resorting to un-discoverable, ugly static util classes. We can add in first-class properties to avoid all the ugly get/set calls. We can add in syntactic sugar around things like creating lists and maps. We can add in dynamic compilation and first-class programs so that the language can scale down to be suitable for scripting. We can simplify the Java generics model and undo the travesty of wildcards. We can kill off the more-or-less failed concept of checked exceptions. And we can add in some metaprogramming capabilities, ideally in a way that’s relatively transparent to the clients of said capabilities so it doesn’t bump up the complexity factor too much.

If we do that, what we’re left with is a language that’s pretty good for most things, without too many (in my opinion, of course) glaring weaknesses: something fast, that allows for good tools, that scales down to small projects or up to big ones, that has enough expressiveness that you only have to write a little code to solve a little problem, that’s easy for programmers familiar with the existing LCD language to pick up, and that has enough metaprogramming capabilities to let you build good frameworks (because good frameworks, in my opinion, require metaprogramming . . . but that’s a different argument).

So that brings us full-circle back to our original question. Why Gosu instead of some existing language? Well, Java has all the flaws everyone knows and is annoyed about. C# is still controlled by Microsoft and isn’t truly cross-platform. C++ is too complex and easy to screw up, which is why so many people moved to Java. Python, Ruby, and Javascript are all dynamically typed and don’t really have good tools around them. (Frameworks and libraries? Definitely. IDEs that let you navigate and refactor? Not so much). Clojure is dynamically typed, lacks good tools, and any kind of Lisp is a bridge way too far for the LCD crowd. At this point any new language that’s not on the JVM or the CLR will have a massive uphill battle in terms of trying to get library adoption. Scala is statically typed and solves a lot of the problems with Java, but it’s also in my opinion a pretty “complex” language that many Java programmers would have a hard time fully understanding or leveraging. Groovy is perhaps the closest thing to what Gosu is, but it’s dynamically typed (with optional type annotations).

So merely in that sense, we think that Gosu fills a vacuum for a statically-typed JVM language that has a familiar syntax, doesn’t add too much complexity to Java, and which improves on Java in the most critical ways.

The last topic, which I haven’t really touched on at all, is the metaprogramming allowed by Gosu’s type system. That’s worth another blog post in itself; the short version is that since the type system in Gosu is pluggable, types can be generated or modified “on the fly” (i.e. really up front at parse time) to let you do things like take a WSDL and turn it into usable types that can be invoked just like any normal Gosu or Java class, but without having to do code generation. It’s the sort of thing that dynamic languages are incredibly good at but which, in a statically-typed language, has historically required reams of ugly code generation. There are other neat framework tricks you can do given that more runtime type information is available in Gosu than in Java or most other statically-typed languages. That’s what we really think will emerge, over time, as the killer feature of Gosu. For now, though, that’s less apparent, because it will only become a killer feature if people leverage it to create frameworks that other people want to use; it’s not the sort of thing you yourself will want to use every day, it’s something that you want other people to use when building the libraries and frameworks that you use every day.

So there you have it. That’s my (overly-verbose, as usual) explanation for why we think Gosu has a place in a world that already has so many existing language options; you most certainly don’t have to agree with my reasoning or my arguments, but hopefully it’s at least now clear what they are and what we’re trying to do. We’re not trying to push the language envelope in new directions, or to come out with language features no one’s ever thought of before; we’re trying to pick from the best ideas already out there, wrap them up in a pragmatic language that we’ve tried to keep simple and easy to pick up, and create something that will be appealing and useful to the very large number of programmers in this world who just want something relatively familiar that makes their programs easier to write and maintain without having to give up too many of the things that they’re already used to and have come to rely on.

Gosu 0.7.0 Is Now Available

The title pretty much sums it up; the first publicly available version of the Gosu programming language is available for download at the main language site, http://gosu-lang.org/. We’re all pretty excited about this, so we’d love for everyone to try it out and let us know what you think. If you have questions you can contact us via the Gosu-lang Google group and you can report any bugs you find via our Google code group. Please, go give it a whirl!

My Ignite Silicon Valley 2 Talk on How to be Wrong

Last night I gave a talk at the Ignite Silicon Valley 2 event down at Hacker Dojo in Mountain View. Ignite is an interesting idea: everyone presenting gets exactly 5 minutes to deliver exactly 20 slides, and the slides are set to auto-advance every 15 seconds. It definitely keeps the talks moving along, but it certainly makes ad-libbing a lot harder, so I ended up writing out my entire talk and then more or less memorizing it. The talks were recorded, but unfortunately the first half of the talks (which included mine) had no audio. So instead I’ve posted the text of the actual talk that I wrote, which matches what I delivered pretty closely.

Here goes:

If there’s one thing I want everyone listening tonight to come away with, it’s this idea: correctness is a collaborative effort, not an individual one. Being seen to be right by others is not the same as actually getting the right answer, and it’s almost always the right answer that’s the important thing.

I’m an engineer by profession, and this talk is most applicable to endeavors like science and engineering, but many of the ideas apply equally well to other areas, like relationships, where the same truth holds: the “right” answer is most appropriately found through a collective effort.

Most of us, naturally, spend our lives trying to be right, and we hold our beliefs and opinions because we think they’re correct, not because we’ve chosen them randomly or capriciously. But everyone does this, even people we vehemently disagree with. They’re just as convinced that they’re right as we are.

When we’re trying to solve a problem or answer a question, then, we tend to have two subtly different options: we can try to convince everyone else that we’re right, or we can present our arguments and opinions in such a way that we try to move the group as a whole towards finding the correct answer, even if it’s not the one we ourselves proposed.

The first option, and what most of us do instinctively, is to try to convince other people that we’re right. I call that approach “being seen to be right,” because it has nothing to do with actual correctness, and everything to do with other people’s opinions. Trying to be seen to be right often involves tactics such as rhetorical tricks to make things sound better than they are.

Those include strategies like appeals to emotion, false analogies, misrepresentations of alternative positions, logical fallacies such as appeals to authority or ad hominem attacks on people holding other viewpoints, or simply trying to win an argument by sheer force of personality.

Truly committing to trying to find the right answer, in contrast, requires putting aside our egos, recognizing our own fallibility and biases, making our arguments as clearly as possible, and opening ourselves up to the possibility of being wrong.

We can often learn as much from an incorrect answer as we can from a correct one; science is advanced just as much by proving hypotheses incorrect as it is by experiments that confirm what we already think to be true. A few tips, then, on how to work towards the right answer through a collaborative effort.

Tip #1 is to avoid rhetoric. Avoid appeals to emotion, false analogies, and clever soundbites that oversimplify complex problems. Avoid logical fallacies like appeals to authority or ad hominem attacks.

Tip #2: Make it easy for someone else to pinpoint exactly where they disagree with you. Doing that involves clearly identifying your assumptions and facts as well as your reasoning and how you logically proceed from those assumptions and facts through to your conclusion.

Philosophy papers will often go to the length of giving numbers to individual statements that are supposed to follow logically from one another, so that if you disagree you can clearly state that you think assumption 2 is incorrect, or that point 3 doesn’t follow from 1 and 2. If someone disagrees with you, you want them to be able to pinpoint the exact points of disagreement, rather than just saying “I think you’re wrong.”

Tip #3: Be honest if you’re unsure about something. If you think an argument is weak, or if you’re not sure about a fact, say so. Doing so helps highlight the issues that are most in need of further discussion or enlightenment.

Tip #4: Anticipate and reason through criticisms of your argument and potential counter-proposals. Do it as objectively and as fairly as you can, and don’t gloss over them. Try to break your own argument to find where it’s weak, and legitimately try to adopt contrary viewpoints.

Tip #5: This is perhaps one of the hardest ones to do, but be willing to change your mind. You may find that someone else makes a convincing counter-argument, or you may find that if you honestly do your best to consider alternative viewpoints you like one of them better than your original opinion.

On a personal note, one of the hardest things for me to learn to do as a philosophy undergraduate was to throw out a paper that was 80% written when I found a counter-argument that I simply couldn’t refute. In that case I’d simply have to start over, building an argument that was completely counter to my original position.

Tip #6: Don’t bully people. Especially if you’re someone who’s used to being right, or someone in a position of authority or respect, it can be easy to steamroll people unintentionally. If people are overly-deferential to your opinions, you should do everything you can to make sure they feel like they can disagree with you.

Tip #7: Don’t take it personally. Everyone is wrong at times, and the smartest scientists, philosophers, or engineers you can think of have all been incredibly mistaken about very fundamental things. It happens. Being wrong doesn’t mean that there’s something wrong with you.

Tip #8: Avoid a culture of blame. If the overall culture of a group or organization is one where people are punished or blamed for any small kind of incorrectness, it will encourage people to pursue being seen to be right at the expense of actual correctness.

Tip #9: You can’t win if you don’t play. Even if you’re not 100% sure you’re right, if you’ve got an opinion about something be willing to put it out there. You can still make a valuable contribution to a discussion without having the ultimate right answer.

Tip #10: Keep your eyes on the prize. The prize is almost certainly not simply having people think you’re smart. It’s more likely something like the overall advancement of human knowledge, the proper functioning of some system, or merely a happy relationship. Whatever it is, it’s a goal that’s best achieved collectively.