Sorting A List


Note: I’ve rewritten this post based on feedback from Neal and Ted in the comments. I’ve left the inflammatory comments in (this is a blog, after all), but tried to do a more apples-to-apples comparison between GScript and Java. The original article can be viewed here.


My first post showing some GScript on this blog compared sorting in Java with sorting in GScript. Today I spent a lot of time going over the psychosis of Comparators in java, and it got me thinking again about how differently Java and GScript approach the problem of sorting lists. I’ll split the discussion into two parts: the code developers must write to sort a list and the actual signatures of the sorting methods used.

The “Client Side”

Comparators, as everyone knows, are java’s way of defining orderings for collections of objects. In Java 1.5 and greater, they are parameterized on T: if you want to sort a List<T>, you pass in a Comparator<T>. Sort of. We’ll get to that. First, let’s look at a simple (!!!) example of sorting in java:

   List<Employee> someEmployees = getEmployees();
   Collections.sort( someEmployees, new Comparator<Employee>(){
     public int compare( Employee e1, Employee e2 ) {
       return e1.getSalary().compareTo( e2.getSalary() );
     }
   });
   return someEmployees;

What that code does is sort a collection of employees by Salary. You are forgiven if you can’t tease that fact out of that mess.

As I wrote in the previous post, the GScript to accomplish the same task is:

  return Employees.sortBy( \ e -> e.Salary )

GScript uses closures to boil the operation down to a single line of code and type inference to minimize the syntax of that line. What do we want to do? Sort the list by something. What is that something? The employee’s Salary. We know that Employees is a list of Employees, so why should we have to specify the type of the “e” parameter to the closure? A GScripter can let the compiler take care of a lot of the details and still get nice code completion, since everything is statically typed.

The “Server Side”

Now let’s take a look at the other side of the fence: the signatures of the methods used for sorting.

The signature of Collections#sort(list, c) is:

static <T> void sort(List<T> list, Comparator<? super T> c)

What the hell does that mean? Basically, it means you can pass in any comparator that is parameterized on T or a supertype of T. This is because the comparator is going to be invoked with objects of type T, so only comparators that take that type or *higher* in the inheritance chain will work. In particular, Comparator<? extends T> will *not* work, because the comparator might be expecting a subtype of T. This is a case of contra-variance, which is fairly rare in type systems. We are used to the opposite, covariance, where a subtype of T is acceptable in place of T. Some programming languages, notably Scala, allow you to annotate type variables with the particular variance they allow, usually using a ‘+’ or ‘-‘ sign.
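To make that concrete, here’s a minimal Java sketch (the Person/Employee/Manager types are invented for illustration): a comparator parameterized on a supertype of the list’s element type is accepted, while one parameterized on a subtype is rejected at compile time.

import java.util.*;

// Hypothetical types, just for illustration
class Person { int getId() { return 0; } }
class Employee extends Person { }
class Manager extends Employee { }

class VarianceDemo {
  static void demo( List<Employee> employees ) {
    // Compiles: Comparator<? super Employee> accepts a Comparator<Person>,
    // since every Employee can be handed to code that expects a Person.
    Comparator<Person> byId = new Comparator<Person>() {
      public int compare( Person p1, Person p2 ) {
        return p1.getId() - p2.getId();
      }
    };
    Collections.sort( employees, byId );

    // Does NOT compile: a Comparator<Manager> may rely on Manager-only
    // behavior, but the list can hold plain Employees.
    // Collections.sort( employees, new Comparator<Manager>() { ... } );
  }
}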

(As a side note, as I’ve said elsewhere, any non-trivial application of generics requires me to stop and put on my generics hat, and, ten minutes after I’ve understood and solved the task at hand, *poof*, it’s gone.)

Interestingly, since Java lacks any way to specify variance, internally Sun engineers have to cast the Comparators to the generic type to do comparisons. I’m not lying. Check out the implementation. The implementation of TreeSet is even better, with the developer putting in little /*-*/ comments to indicate the variance more correctly in the places where he has to cast to the generic type.

So here we have a lot of syntax flying around to make sure sorting lists is absolutely, positively, 100% type safe. And, despite all that, the Sun engineers still have to cross-cast to the generic type to make the damned thing work. All to prevent what has to be one of the rarest programming errors I can imagine: accidentally sorting a List with a bad Comparator. I’ve never seen that happen. I’ve never heard of that happening. I’ve never even heard it mentioned that it might happen. And yet all that complexity has been thrown at this problem in Java.

The signature of sortBy() is:

  function sortBy( value(T):Comparable ) : List

and it is injected onto List via GScript Enhancements.

The parameter is declared to be a block that takes an argument of type T and returns a Comparable object, which will be used to order the list. (We eventually delegate to Collections.sort() internally so that the sort occurs in java, rather than GScript. Hey, we still want it to be fast, right?)
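For the curious, here’s a rough Java sketch of what that delegation could look like; the Block interface and the method body are assumptions for illustration, not the actual GScript runtime code.

import java.util.*;

// Hypothetical stand-in for a GScript block that extracts the sort key
interface Block<T, R> {
  R invoke( T arg );
}

class SortBySketch {
  // Roughly what an enhancement-style sortBy could do under the hood:
  // build a Comparator from the key-extracting block, then hand the
  // actual sorting off to Collections.sort().
  static <T> List<T> sortBy( List<T> list, final Block<T, Comparable> key ) {
    Collections.sort( list, new Comparator<T>() {
      public int compare( T o1, T o2 ) {
        return key.invoke( o1 ).compareTo( key.invoke( o2 ) );
      }
    } );
    return list;
  }
}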

Note that Comparable is left unparameterized because there is almost zero chance someone will screw up a call to sortBy() from a typing perspective, and if they did, the incomprehensible generics error message would be far harder to understand than the inevitable runtime exception that would occur. It just isn’t worth the complexity to specify it.

Yeah, so?

Well, to me, GScript throws complexity at the right part of the problem. It takes a common operation on Lists, sorting by some attribute of their component elements, and boils it down to its essence using closures. Closures, like generics, are somewhat complicated and can take a bit to get your head around. But once you grok closures in GScript you end up having to know less, not more, about the complexities of sorting. This is in stark contrast with the generics in Collections.sort(). That application of complexity doesn’t help developers much, if at all. The only time you will ever notice it is when it gets in your way.

The overriding theme of GScript, formally spelled out by Scott at the start of our Bedrock release: “GScript features are about making developers’ lives easier.” I’m convinced that if static language designers made pragmatic concessions to real-world developer needs and acknowledged the practical limitations of static typing, the current excitement around dynamic languages would temper dramatically.


Software Is Not A Term Paper

I’ve been thinking a lot lately, and having lots of discussions, about what our development process should be for the next release of PolicyCenter. I’ve taken the potentially-controversial position that date targets and deadlines are detrimental to developing a software project, and it’s a position that I feel many non-developers don’t really understand.

The intuitive position that I’m arguing against is what I think of as the term paper philosophy; having the deadline there forces you to get it together to crank through the thing, whereas without any firm deadline you’d keep working on it forever (or keep putting off working on it), and moreover you wouldn’t really work very hard on it either. Some people need that deadline pressure and the rush of adrenaline that accompanies the fear of not getting things done, but even people that don’t thrive on it often benefit from having a deadline to focus their efforts.

What I’m saying is that that theory doesn’t apply very well to long-lived software projects. It might apply to toy projects in college that you never work on again, but the two fundamental differences between software and a term paper (or nearly anything else you try to build on a deadline) are that you keep having to work on the software project and there are far more corners to cut in software development.

It’s hard to over-emphasize the significance of that statement, and I think that non-developers really don’t have a very good analog of what that means. There just aren’t that many other fields or endeavors in which something is built up successively over years or even decades of work. There are even fewer where you can release an intermediate product off of that tree periodically that people will actually use (and expect support and upgrades for), and fewer still where you can “successfully” cut as many corners as you can in software development without having things completely fall apart in the short term.

The only analogy I can think of might be construction work, where if you rush the foundations of a building then the future stories won’t fare so well, but it might not be obvious that’s the case until you start on those future stories or until an earthquake or some other disaster hits. But most sorts of projects or tasks that people do are kind of one-time things; if you keep using them, you use them as they were when they were finished rather than attempting to build more and bigger things on top of your initial work. If you rush building a chair, maybe the chair’s a little off-balance, but it’s not going to affect future chairs that you build.

And of course, that’s all combined with the fact that software estimation for anything other than the immediate task at hand is basically impossible, meaning it’s hard to avoid deadline pressure by just estimating accurately, since you’re always trying to estimate something that’s fundamentally pretty unknowable.

What sorts of corners can you cut that will make the product look “done” but will come back to haunt you later? Here are a few examples:

  • No or incomplete tests
  • Ignoring or not considering edge cases or error conditions
  • Ignoring or not considering feature interactions
  • Poor UI design
  • Poor data modeling
  • Poor API design
  • Improper encapsulation, decomposition, or otherwise messy code
  • Inconsistency between different areas of the code
  • No, poor, or inaccurate documentation and specs

All of those things will inevitably bite you later down the line; some of them will bite your customers too, some of them will make future changes nearly impossible or far more difficult, some will cause your development effort to grind to a halt in the future, and others will be fixable but will require more work to fix later than they would have taken to do right initially. Nearly all of them will eventually slow down development, putting even more pressure on future deadlines and leading to a vicious cycle where even more short-term hacks enter the system.

So let’s imagine a real-world scenario. You’ve got what you think is 3 months’ worth of work to do, but one month in, less than 1/3 of it (as far as you can tell) is done. What do you think will happen? People will feel behind, and they’ll pull whatever strings they can to try to make up the lost time, either consciously or unconsciously. The same thing will happen with more short-term deadlines: if you ask someone to try to get something done by the end of the week, they’ll find a way to try to make it happen, even if they should really take more time. In the long run, you’re going to pay dearly for those sorts of decisions.

So what’s the solution? I’m honestly not sure, though I have some theories I want to try out. There are two obvious problems with not having deadlines at all. First of all, some people really do need deadline pressure to get things done, and others simply will go on working on less important things forever in that case. My personal opinion is that those issues can be dealt with without formal deadlines, by constantly re-estimating and re-prioritizing work instead.

The second problem is harder to escape, at least if you’re selling your products to people: everyone you’re selling to expects you to tell them what you’ll have done and when it will be done, so it’s inescapable that you’ll end up making some level of date-based feature commitments that then have the potential to put deadline pressure on your team and cause them to cut corners. Some types of agile methodologies can hopefully mitigate those problems for at least part of your development cycle, but most of those work better for consulting-style projects that actually have an end date at some point, and I honestly don’t feel like it’s a solved problem yet for our type of development work.

As a company, we do a very good job of (and take immense pride in) standing behind the commitments to our customers, but we can definitely do a better job internally of mitigating the pressure that such commitments put on the development team. I’m hopeful we’ll be trying some new things out in the next release cycle; whether those experiments work out or not, hopefully I’ll get a chance to write about them and tell everyone whether they worked out and why.


It’s a little thing, but . . .

I just found myself writing the following tiny snippet of GScript:

print("Evaluating group with forms " + group.map( \ f -> f.Code ).join( ", " ))

It’s a minor thing, but in Java the map plus the join would look something like this:

String result = ""
for (int i = 0; i < group.size(); i++) {
  if (i > 0) {
    result += ", "
  }
  result += group.get(i).getCode();
}

. . . or some variant thereof using StringBuilders, or possibly some ListUtils.map() function with an anonymous inner class followed by ListUtils.join() or something; either way it’s ugly and involves a lot of typing.  So ugly that I probably wouldn’t bother just for the sake of a debug message that I’ll delete in a day or two anyway. Being able to write it in one compact line of code is a tiny thing, but it’s a tiny thing that makes my life as a developer a tiny bit better.

This is really my first time writing an entire, complicated subsystem in GScript; it’s weird to say given that we’ve put so much work into the language, but the reality is that our system is largely structured as Java on the bottom and GScript on the top where we need configurability. Most of us bounce back and forth between the two languages, and most infrastructure is done in Java (and I mainly do infrastructure work here), but this time around I’ve managed to push an entire, complicated subsystem out into GScript, both because we can make it more configurable that way and because I think GScript is a better language to develop in.

One thing I’m finding is another obvious advantage to having less code: when I’m developing something new, a given line of code has a lifespan of about 48 hours before I refactor or rewrite it, so being able to do things in as little code as possible lets me iterate much, much faster; there’s less to delete when I change my mind, and building up the new version goes faster as well.

Just another example of how all those little things really add up.


Being Wrong

Since my blog posts largely consist of short follow-ups to Keef’s magisterial posts, where I try to say essentially what he said and hope that, by association, I appear smart, I’ll follow up his “Getting it Right vs. Being Right” post with a practical piece of advice for senior developers who want to foster the type of environment he outlines:

Admit you’ve screwed up. Often. And Loudly.

Even the best coders inflict horrors on the code base from time to time. It is cathartic and, perhaps, even crucial that the best developers admit this openly and enthusiastically about themselves in front of other developers and, especially, in front of management. This does two things.

Most importantly, it freaks management out.

At first.

Then, if they are reasonable, they come to realize that, despite the fact that their ace programmer has admitted all of these systems he has designed are screwed up, things are actually limping along reasonably well. So maybe, despite the fact that imperfect humans are implementing this software, they’ll get something usable and useful in the end. And, if they listen closely, maybe they can even make a good guess where the technical debt that is going to eat up 50% of the next release is. (NB: when good developers screw up, they often screw up AWESOMELY on some of the core parts of the system. Loads of fun unwinding that sort of stuff.)

Secondly, it allows the other developers to relax with respect to their own limitations in the face of complexity. It allows them (or teaches them) to be humble, without being humiliated. People admit when they are going off the rails and when they need help, and bad ideas don’t get as far into the system. The flow of information about the state of the system becomes less clogged with egos. And it fosters a sense of community, where we are all in it together against our own fallibility.

Screwing up software sucks. But, if you are a developer, you have. And so have all your coworkers. Maintaining a sense of humor and brutal honesty about it is the best way to deal with this universal (and hilarious) fact.


Getting It Right versus Being Right

My undergraduate degree was in philosophy rather than computer science, and while many people might feel that a philosophy major is incredibly impractical, in many ways it helped me hone a lot of skills that are tremendously useful as a software developer. One of the more important things it taught me was to learn how to be wrong; to do philosophy well, you have to stop caring about being right and start caring about finding the right answer. Trying to be right implies that you care about convincing other people that your position is right, and about having other people agree with you; your goal when writing or discussing things is persuasion. On the other hand, trying to find the right answer means that you don’t care about where the right answer comes from so long as it’s found, or at least you get closer to it; your goal when writing or discussing things then becomes to improve the state of knowledge and the state of the debate, even if it means having people disagree with you or having your arguments shot down.

It might seem like an obvious or silly distinction, but it’s a difficult one to internalize: we all naturally want for other people to agree with us, and we all naturally hold the views we hold because we think they’re the right ones, so the natural tendency is to dismiss contrary viewpoints out of hand and to attempt to, by force of will or rhetoric, convince other people that our beliefs are correct. Truly trying to find the right answer requires not just humility but also self-awareness.

So . . . given that we’re a software development company, what does this have to do with software? In my experience, the same principle is applicable to technical decisions: in an ideal world, developers should keep their ego in check, present alternatives as clearly and fairly as possible despite their own leanings, and be willing to accept criticism and to admit when better alternatives are proposed.

There’s a second half to the “getting it right” bit, though: whereas in philosophy the chance of actually settling on a real answer to a debate is effectively zero, in software there often is, given a problem and a certain set of constraints, a small set of fairly superior choices, and you do have some chance of actually landing on one of them. And once you do get there, you’re effectively done with that question: you’ve nailed it well enough that you don’t have to deal with it again until things change enough to render the decision no longer the best one. I think of that as “getting it right eventually.” An important property of software development is that getting that right answer is far, far more valuable than the kind-of-working answer; something implemented in 1000 lines of straightforward, consistent, flexible code is far more valuable than that same thing implemented in 5000 lines of hacked up, inconsistent, hard to understand code. That matters less at the start of a project, but as a system ages and accumulates code those little differences start to add up, and they can eventually be the difference between getting out version 5.0 and having the entire product grind to a halt under the weight of a thousand tiny bad decisions.

It’s also worth noting that, in my experience, no one remembers how many tries it takes you to get something right. No one remembers the three designs you coded up and threw away, what they remember is that you eventually got to the right answer. Getting to the right answer is hard enough, valuable enough, and persistent enough (i.e. once something’s right you stop having to fuss with it) that it’s all anyone will remember.

That all, of course, presumes that you have an organization that can actually function properly; a common, debilitating organizational pathology is to punish people for being wrong. Naturally, that encourages people to “be right” and try hard to convince everyone else that they’re right even when they’re not. That, naturally, pretty quickly leads to bad decisions and failing products.

So let’s assume that you work at an organization where failure is acceptable and that understands the need to fail a few times in order to really get the right answer. Here’s what you, as a developer, can do to take that idea to heart.

Present Your Ideas Clearly And Alternatives Fairly

Avoid the instinct to present positions you disagree with as some caricature that no reasonable person could possibly take seriously; do your best to give them a fair shake. If you’re arguing with someone and they hold a reasonable position that they just happen to explain or defend poorly, do your best to construct a reasonable argument for their position before attempting to explain why it’s wrong, rather than picking on their poor explanation or defense. Likewise, avoid glossing over flaws in your own proposals or using rhetorical tricks to convince people you’re right; be honest about the drawbacks and point them out yourself, then explain why the idea is still worth pursuing anyway.

Your guiding principle should be that, if all possible positions are given the clearest possible explanation and defense, the right one will be blindingly obvious to all parties, so you don’t need to persuade people or attack straw-men. (In some ways, that’s kind of the same idea as the US legal system, though how well that works in the legal system is an entirely different question). It might not actually work that way in practice, but ideally you should act as if it did.

Don’t Steamroll People

This one also might fall under the “well, duh” category, but it’s worth pointing out that some people have stronger personalities than others; if you’re one of those people, you need to be very careful not to win debates simply by tiring everyone else out or metaphorically beating them into submission. While being right might be satisfying, having people agree with you merely because they don’t have the energy to fight with you doesn’t do anyone any good.

Develop Your Wrong-Detection Instincts

Having a taste for what’s a good idea or a bad idea is a critically important, and quite difficult, skill for a software developer to have. You need to know when you’ve reached the best answer you’re going to get to versus when you’re sure your idea isn’t right but you just don’t know what a better solution is. When the answer feels wrong, keep pushing; maybe it’ll come to you in a month, maybe someone else will think of it, maybe it’ll never come and things will always be difficult. It can take a long time (years, even decades) to develop the instinct, but work on it: if things seem too hard, or too messy, or too inconsistent, keep pushing for a better answer, and be willing to recognize it if someone else presents it.

Don’t Be Afraid To Be Wrong

Developing a taste for what designs/implementations/architectures/etc. are better or worse takes time, and the only way to really get there is to just throw your ideas out there and see what happens. If people tear them apart, try to learn from it. If people don’t, and they agree with you, congratulations; you’ve just made a valuable contribution to the development effort! Even if you’re wrong 95% of the time at first, the 5% of the time that you suggest a better alternative is incredibly valuable, and the rest of the time you’ll be learning a lot. The worst thing you can do is to keep your ideas to yourself because you’re afraid you’ll be shot down. And the worst thing an organization can do is create an environment where it’s not okay to be wrong.

Be Wrong Gracefully

When your ideas are attacked, and the criticisms are valid, accept it gracefully. This one is hard because of the emotions involved; no one likes being wrong or feeling stupid, no one likes to be wrong publicly in front of their peers, and there’s a definite ego and self-esteem hit that results from it. Learn to be secure in your skills and understand that by being willing to put out ideas that could be attacked you’re actually helping to get to the right answers faster.

Optimize For Being Wrong

Getting things right in software is hard, so your development process should try to optimize given the reality of needing to potentially iterate on something multiple times to get it right. That can mean everything from agile development practices and rapid iterations to keeping APIs private until you’re sure they’re right. Assume you’ll do things wrong a few times and plan accordingly.


Advanced Enhancements

In Keef’s post pointing out some problems with generics in Java, he made a quick mention of enhancements of generic types. It’s an interesting feature of enhancements in GScript, so I thought people would enjoy a more in-depth explanation.

Often you want methods associated with a particular parameterization of a class. This is especially true of Collections. A canonical example is that, rather than typing:

  Collections.sort( myListOfComparables )

You would rather type:

  myListOfComparables.sort()

The rub is that sort() only makes sense for Lists of Comparable objects. It makes sense on a List<String>, but not on List<Object>. You can imagine other methods that are sensitive to the type of values that a Collection holds: min() and max() on Collections of Comparables, sum() on Collections of numbers, etc.

It turns out that GScript Enhancements let you put these methods where they belong. As an example, here is a simple definition of that sort() method:

enhancement GWBaseListOfComparablesEnhancement<T extends Comparable> : List<T> {
  function sort() : List<T> {
    Collections.sort( this )
    return this
  }
}

So what is this saying? We are defining an enhancement that applies to all Lists of T, where T is bounded by the extends Comparable clause. The sort() method simply passes the enhanced object through to java’s standard Collections.sort() method. This type checks because we know T extends Comparable.

With that enhancement defined, any list that is parameterized on a Comparable class will have a sort() method on it. Everything is nice and type-safe.

Nice.


Some Ways To Improve On Java’s Generics

Plenty of ink has already been spilled over the issue of Java generics, so perhaps I’m just adding to the noise with this; hopefully I’ll manage to add something useful instead. When I first started diving into generics, I honestly didn’t think they were that bad: in simple cases they’re clearly a huge improvement in both clarity and safety over just casting all over the place. There are also plenty of cool things you can do with generics to make reflective programming more typesafe, and plenty of other uses you can put them to as well.

Now that I’ve used them fairly extensively and pushed the limits in pretty much every direction, I’m starting to learn when to back off and avoid using them. Generics, like static typing in general, provide a tool to help you reduce the number of errors in your program. If the cost of using the tool outweighs the benefit you get from the tool, you just shouldn’t bother using it. Too often I’ve found myself spending a long time puzzling over complicated generics statements that, in reality, will help me prevent one really obvious bug that will be caught by our automated tests and fixed in about 5 minutes; spending 2 hours to lower the chance from 3% to 0% that I’ll have to spend 5 minutes fixing something isn’t a win.

For example, a few months back I was rewriting the part of our system that defines how policy products are defined. It’s a highly customizable area of the application, and to generate new configurations on the fly for test purposes we use a builder pattern. I found myself writing code with method signatures like:

protected <L extends ProductModelObjectWithCode, O extends ProductModelObjectBase>
    ProductModelFKPopulator<L, O> createFKPopulator(
        ProductModelLinkField<L, O> field,
        ProductModelBuilderBase<L, ? extends ProductModelBuilderBase> builder) {
  return new ProductModelFKPopulator<L, O>(field, builder);
}

What in the world was I thinking? Here we have a foreign key-like field generified on the linkee object type (L) and the owning object type (O), and our builders are generified on what they build and on the builder type itself (for covariant returns). So this is saying that you can create a field populator based on a nested builder if the builder you pass in builds objects that are the same type as the linkee of the field you want to populate. I’m not sure that declaration is even correct, since it should probably be ? extends L in the ProductModelBuilderBase argument declaration. The point is that clearly I was temporarily insane when I wrote this: it’s trying to prevent an error that I’m unlikely to make and that, if I make it, will be easy to detect and fix. In order to prevent it, I’ve managed to make the method signature completely incomprehensible and introduced the risk that someone will suffer a seizure merely from looking at it. And on top of that, it’s probably not even strictly correct and will prevent some valid calls from being made. Writing that was not exactly the most productive use of my time.

So certainly it’s helpful to know when generics just aren’t worth the trouble. Some parts of generics could be made more useful or lightweight, however, and with that in mind in GScript we support Java’s generics but with some changes that we think make generics easier to work with, more useful, and less heavyweight. There are also a few other directions we’ve considered going in but haven’t yet decided are worth the additional complexity.

Wildcards Aren’t Worth It

My understanding of this is that wildcards exist primarily to plug the array type hole. Suppose you have Shape with two subclasses, Circle and Square. In Java, Circle[] is assignable to Shape[], but you can’t actually treat a Circle[] as a Shape[], since you can store a Square in a Shape[] but not in a Circle[]. If you try to do that, you’ll get an ArrayStoreException at runtime, but it can’t be caught at compile time.

The need to catch that statically with generics was probably even greater because generics aren’t reified the way array types are, so there’s not even a way to catch that you’re storing a Square in a List<Circle> in Java. You’ll only get a ClassCastException when yanking the Square out and treating it like a Circle.
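Here’s a quick Java illustration of both failure modes, using the hypothetical Shape, Circle, and Square classes from above:

import java.util.*;

class Shape { }
class Circle extends Shape { }
class Square extends Shape { }

class ArrayHoleDemo {
  public static void main( String[] args ) {
    // Arrays are covariant, so this compiles...
    Shape[] shapes = new Circle[1];
    // ...but blows up at runtime with an ArrayStoreException.
    shapes[0] = new Square();

    // Generics aren't reified, so if List<Circle> were assignable to
    // List<Shape>, a bad add() would go completely undetected:
    List<Circle> circles = new ArrayList<Circle>();
    // List<Shape> asShapes = circles;     // compile error in Java
    // asShapes.add( new Square() );       // would silently corrupt 'circles'
    // Circle c = circles.get( 0 );        // ClassCastException much later
  }
}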

To combat that, in generics List<Circle> is not a subtype of List<Shape>. Instead, it’s a subtype of List<? extends Shape>. If you have a variable of type List<? extends Shape> you can read from it (you get back a Shape), but you can’t add to it, because you don’t know what the actual concrete type is. More generally, you can’t call a method that takes a wildcard type, but you can have the wildcard type as a return value and it’ll be inferred to the bounding type.

The problem is that the wildcards tend to filter through your code in all sorts of horrific ways, especially with interfaces. You have to be really, really careful to make sure that interface methods return wildcard types everywhere. Having an interface method that returns List<Shape> means that someone building up a List<Square> for a local variable can’t return that as the value of the method, so you almost always have to be careful to make it List<? extends Shape>. That’s confusing and verbose, and it also tends to ripple through the code, since all the other methods relating to lists of shapes need to be properly wildcarded as well in case someone wants to pass the return value from one method as an argument to the next method. Fixing a situation where an interface was written to return a List<Shape> instead of List<? extends Shape> can quickly balloon into an exercise that requires changes to dozens of files that really shouldn’t be related.
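A hedged sketch of that ripple effect, with a made-up ShapeSource interface:

import java.util.*;

class Shape { }
class Square extends Shape { }

// Declared without a wildcard...
interface ShapeSource {
  List<Shape> getShapes();
}

class SquareSource implements ShapeSource {
  public List<Shape> getShapes() {
    List<Square> squares = new ArrayList<Square>();
    // ...so an implementation that naturally builds up a List<Square>
    // can't return it directly:
    // return squares;                        // compile error
    return new ArrayList<Shape>( squares );   // forced to copy (or cast)
  }
}

Changing the interface to return List<? extends Shape> fixes this one spot, but then every method that consumes or forwards that return value has to be wildcarded too, which is exactly the ripple described above.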

In GScript, we’ve basically relaxed that constraint, such that List<Square> is, in fact, assignable to List<Shape> so you don’t have to use wildcards. It might mean that we don’t statically catch as many problems, but in my experience those array store type problems come up so infrequently and are so easy to fix that it’s just not worth the cost of putting wildcards everywhere. Generics work how you intuitively think they should, you don’t have to think about it too much, and the price of letting a few array-store-type exceptions through every now and then isn’t really very high.

Generics Need Variable Type Inference

Typing generic type names can get pretty old after a while. The most obvious, annoying case is something like:

Map<String, List<String>> myMap = new HashMap<String, List<String>>();

I shouldn’t have to repeat the information on both sides. You can avoid that in Java by just not including the generics on the right hand side (in which case the compiler will warn you) or by calling a helper method, thanks to the way that generic parameters are type inferred. In those cases, you can at least omit them on the right-hand side. But none of that helps you get out of this situation:

Map<String, List<String>> myMap = someMethod();

someMethod has to be declared to return a Map<String, List<String>>, but I still have to declare my variable of that type. You’d better hope your IDE has good refactoring tools if you want to change the return type of that method, or else you’ll be doing a lot of typing or cutting and pasting. A lack of type inference is already painful in a strongly-typed language; generics just makes it worse by making the type declaration strings even longer.
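For completeness, the helper-method trick mentioned above looks roughly like this in Java; the Maps.newHashMap() name is just an example (similar to what Google Collections provides), and it only helps with the construction case, not the someMethod() one:

import java.util.*;

class Maps {
  // K and V are inferred from the assignment target, so callers don't
  // have to repeat the type arguments on the right-hand side.
  static <K, V> HashMap<K, V> newHashMap() {
    return new HashMap<K, V>();
  }
}

class InferenceDemo {
  Map<String, List<String>> myMap = Maps.newHashMap();
}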

GScript deals with this by letting you omit the type declaration for variables that have an assignment statement, so you could do:

var myMap = new HashMap<String, List<String>>();
var myMap = someMethod();

which makes the generics overhead more bearable.

Self-Parameterization Should Be Easier

We haven’t done this one yet, but a common pattern in a lot of our more reflectively-driven code is to parameterize a type on itself. In the builder example above, we parameterize all of our data builders on the type of the builder, so we can have methods like:

B create();

where the method will automatically return the right thing. You could accomplish the same goal by covariantly overriding methods, which is the easier route to go if you only need to do it a few times. In our builder case, where there are methods on supertypes that return the builder back out, self-parameterization is much more convenient than covariantly overriding every possible method on every superclass in the chain. Self-parameterization can also be useful for writing reflective libraries, though it can also be overkill as I’ve demonstrated above.
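Here’s a stripped-down Java sketch of the pattern with invented names, showing how a fluent method on the base class can still return the concrete builder type:

// Base builder parameterized on the concrete builder type, so fluent
// methods declared here return the subtype rather than the base type.
abstract class BuilderBase<T, B extends BuilderBase<T, B>> {
  @SuppressWarnings("unchecked")
  B withName( String name ) {
    // ...store the name somewhere...
    return (B) this;    // the one unchecked cast, hidden in the base class
  }
  abstract T create();
}

class Coverage { }

class CoverageBuilder extends BuilderBase<Coverage, CoverageBuilder> {
  CoverageBuilder withLimit( int limit ) { return this; }
  Coverage create() { return new Coverage(); }
}

class BuilderUsage {
  // withName() comes from the base class but still returns a
  // CoverageBuilder, so the chain keeps compiling.
  Coverage c = new CoverageBuilder().withName( "x" ).withLimit( 100 ).create();
}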

Unfortunately, self-parameterization is a pain; you have to type it in everywhere, leading to declarations like:

CoveragePatternBuilder<B extends CoveragePatternBuilder> extends ProductModelBuilderBase<CoveragePattern, B>

On leaf types, you have to specify the type explicitly in the extends clause, whereas for non-leaf types you have to declare the type variable and then pass it through in the extends clause. It’s doable, it’s just cumbersome.

One thing we’d kicked around is the idea of having a special SELF parameter type declaration that indicates that the type variable should be bound to the value of that type; i.e. it would implicitly be treated like T extends Foo when you used the SELF type variable on class Foo and T would automatically be bound appropriately on subtypes. We haven’t really nailed down exactly how it would work or decided to implement it, but it’s something that’s on the table for the future that might make it much easier to deal with something that, at least in our code, has turned out to be a reasonably common pattern.

Runtime Generic Information Is Helpful

The Java type erasure horse has been beaten to death by this point, so I’m not going to harp on it: it’s a complicated decision, they did it for certain reasons, and it’s not going to change. I’ll merely say that in GScript we can do some amount of un-erasure and reification (though not completely, since Java is still managing the bytecode for those classes and any Java code that creates or uses them); basically we just have first-class types in our system for the different generified versions of a type, and enhancement methods can be added specifically to those types. For example, our base set of enhancements provides sum() and average() methods on Collection<Number>. In Java, there is no such type, so there’s nowhere to hang those sorts of methods. So the types exist statically in our system enough to let you enhance them, but they’re still essentially lost at runtime.
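To illustrate what gets erased in Java (and why there’s nothing there to hang a sum() method on), here’s a small sketch:

import java.util.*;

class ErasureDemo {
  // Both of these erase to sum(Collection), so Java rejects the overload pair:
  // static double sum( Collection<Integer> ints ) { ... }
  // static double sum( Collection<Double> doubles ) { ... }

  static void inspect( Collection<Number> numbers ) {
    // The element type is gone at runtime; you can only ask about the raw
    // Collection, never about Collection<Number> itself.
    boolean isCollection = numbers instanceof Collection;         // fine
    // boolean isNumbers = numbers instanceof Collection<Number>; // compile error
  }
}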