Feature Literals

In the current open source release of Gosu there is a new feature called, er, feature literals. Feature literals provide a way to statically refer to the features of a given type in the Gosu type system. Consider the following Gosu class:

  class Employee {
    var _boss : Employee as Boss
    var _name : String as Name
    var _age : int as Age

    function update( name : String, age : int ) {
      _name = name
      _age = age
    }
  }

Given this class, you can refer to its features using the '#' operator (inspired by the Javadoc @link syntax):

  var nameProp = Employee#Name
  var ageProp = Employee#Age
  var updateFunc = Employee#update(String, int)

These variables are all various kinds of feature references. Using these feature references, you can get to the underlying Property or Method Info (Gosu’s equivalents to java.lang.reflect.Method), or use the feature references to directly invoke/get/set the features.

Let’s look at using the nameProp above to update a property:

  var anEmp = new Employee() { :Name = "Joe", :Age = 32 }
  print( anEmp.Name ) // prints "Joe"
  
  var nameProp = Employee#Name

  nameProp.set( anEmp, "Ed" )
  print( anEmp.Name ) // now prints "Ed"

You can also bind a feature literal to an instance, allowing you to say “Give me the property X for this particular instance”:

  var anEmp = new Employee() { :Name = "Joe", :Age = 32 }
  var namePropForAnEmp = anEmp#Name

  namePropForAnEmp.set( "Ed" )
  print( anEmp.Name ) // prints "Ed"

Note that we did not need to pass an instance into the set() method, because we bound the property reference to the anEmp variable.
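
To make the bound/unbound distinction concrete, here is a rough sketch in Python rather than Gosu. PropertyRef and BoundPropertyRef are hypothetical names invented for this illustration, not part of any Gosu API. The sketch looks properties up by string at runtime, so it has none of the compile-time checking that a real feature literal gives you.

```python
class PropertyRef:
    """Hypothetical unbound property reference: needs an instance for get/set."""

    def __init__(self, name):
        self.name = name

    def get(self, instance):
        return getattr(instance, self.name)

    def set(self, instance, value):
        setattr(instance, self.name, value)

    def bind(self, instance):
        # Fix the instance now so callers no longer have to pass it.
        return BoundPropertyRef(self, instance)


class BoundPropertyRef:
    """Hypothetical bound property reference: the instance is already fixed."""

    def __init__(self, ref, instance):
        self.ref = ref
        self.instance = instance

    def get(self):
        return self.ref.get(self.instance)

    def set(self, value):
        self.ref.set(self.instance, value)


class Employee:
    def __init__(self, name, age):
        self.name = name
        self.age = age


an_emp = Employee("Joe", 32)

name_prop = PropertyRef("name")            # roughly Employee#Name
name_prop.set(an_emp, "Ed")                # unbound: pass the instance

name_prop_for_an_emp = name_prop.bind(an_emp)   # roughly anEmp#Name
name_prop_for_an_emp.set("Fred")                # bound: no instance needed
```

The string "name" is exactly the weak spot here: rename the attribute and nothing complains until runtime, which is the problem feature literals are meant to remove.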

You can also bind argument values in method references:

  var anEmp = new Employee() { :Name = "Joe", :Age = 32 }
  var updateFuncForAnEmp = anEmp#update( "Ed", 34 )
  
  print( anEmp.Name ) // prints "Joe", we haven't invoked 
                      // the function reference yet
  updateFuncForAnEmp.invoke()
  print( anEmp.Name ) // prints "Ed" now

This allows you to refer to a method invocation with a particular set of arguments. Note that the second line does not invoke the update function; rather, it gives you a reference that you can use to evaluate the function later.
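
If you know Python, the closest everyday analogue is probably functools.partial, which also captures a method plus its arguments without calling it. This is only an analogy under that assumption: the Employee class below is a stand-in written for the sketch, and Python checks none of it statically.

```python
from functools import partial


class Employee:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def update(self, name, age):
        self.name = name
        self.age = age


an_emp = Employee("Joe", 32)

# Capture the method and its arguments; nothing is invoked yet.
update_for_an_emp = partial(an_emp.update, "Ed", 34)
name_before = an_emp.name

# Invoke the deferred call.
update_for_an_emp()
name_after = an_emp.name
```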

Feature literals support chaining, so you could write this code:

  var bossesNameRef = anEmp#Boss#Name

Which refers to the name of anEmp’s boss.

You can convert method references to blocks quite easily:

  var aBlock = anEmp#update( "Ed", 34 ).toBlock()

Finally, feature references are parameterized on both their root type and the feature’s type, so it is easy to say “give me any property on type X” or “give me any function with the following signature”.

So, what is this language feature useful for? Here are a few examples:

  1. It can be used in mapping layers, where you are mapping between properties of two types
  2. It can be used for a data-binding layer
  3. It can be used to specify type-safe bean paths for a query layer

Basically, any place you need to refer to a property or method on a type and want it to be type safe, you can use feature literals.

Ronin, an open source web framework, is making heavy use of this feature. You can check it out here:

  http://ronin-web.org

Enjoy!


I hate to beat a dead horse and all . . .

. . . but this is pretty egregious.  I got so sick of writing my own partition code in Java (since I’m so used to being able to do it easily in GScript) that I pushed it out into a utility method so I wouldn’t have to rewrite the same code over and over.  Thanks to the overhead of Java generics, anonymous inner classes, and type declarations, I’m not sure it was even a win.  Now my code looks like:

  Map<String, List<FormPatternLookup>> partitionedLookups =
    CollectionUtil.partition(lookups,
    new CollectionUtil.Partitioner<FormPatternLookup, String>() {
      public String partitionKey(FormPatternLookup formPatternLookup) {
        return formPatternLookup.getState() +
          ";;" + formPatternLookup.getUWCompanyCode();
      }
  });

Ugh.  If I had written this code in GScript, it would have been:

  var partitionedLookups = lookups.partition(\l -> l.State +
    ";;" + l.UWCompanyCode)

I mean, honestly . . . that’s pretty brutal.
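
For anyone who hasn't met partition before, here is a minimal sketch of what such a helper does, written in Python. The partition function and the dictionary records are assumptions made up for the illustration (the real lookups are FormPatternLookup objects); the point is only the shape of the operation: group the elements of a list under whatever key the supplied function returns.

```python
from collections import defaultdict


def partition(items, key_func):
    """Group items into a dict mapping key_func(item) to the items with that key."""
    groups = defaultdict(list)
    for item in items:
        groups[key_func(item)].append(item)
    return dict(groups)


# Hypothetical records standing in for FormPatternLookup objects.
lookups = [
    {"state": "CA", "uw_company_code": "A1"},
    {"state": "CA", "uw_company_code": "A1"},
    {"state": "NY", "uw_company_code": "B2"},
]

partitioned_lookups = partition(
    lookups, lambda l: l["state"] + ";;" + l["uw_company_code"]
)
```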


Sorting A List


Note: I’ve rewritten this post based on feedback from Neal and Ted in the comments. I’ve left the inflammatory comments in (this is a blog, after all), but tried to do a more apples-to-apples comparison between GScript and Java. The original article can be viewed here.


My first post showing some GScript on this blog compared sorting in Java with sorting in GScript. Today I spent a lot of time going over the psychosis of Comparators in Java, and it got me thinking again about how differently Java and GScript approach the problem of sorting lists. I’ll split the discussion into two parts: the code developers must write to sort a list and the actual signatures of the sorting methods used.

The “Client Side”

Comparators, as everyone knows, are Java’s way of defining orderings for collections of objects. In Java 1.5 and greater, they are parameterized on T: if you want to sort a List<T>, you pass in a Comparator<T>. Sort of. We’ll get to that. First, let’s look at a simple (!!!) example of sorting in Java:

  List<Employee> someEmployees = getEmployees();
  Collections.sort( someEmployees, new Comparator<Employee>() {
    public int compare( Employee e1, Employee e2 ) {
      return e1.getSalary().compareTo( e2.getSalary() );
    }
  });
  return someEmployees;

What that code does is sort a collection of employees by Salary. You are forgiven if you can’t tease that fact out of that mess.

As I wrote in the previous post, the GScript to accomplish the same task is:

  return Employees.sortBy( \ e -> e.Salary )

GScript uses closures to boil the operation down to a single line of code and type inference to minimize the syntax of that line. What do we want to do? Sort the list by something. What is that something? The employee’s Salary. We know that Employees is a list of Employees, so why should we have to specify the type of the “e” parameter to the closure? A GScripter can let the compiler take care of a lot of the details, and still gets nice code completion since everything is statically typed.
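
The "sort by a key" idea isn't unique to GScript. As a point of reference, here is a small Python sketch of the same shape. sort_by is a hypothetical helper written for this example (it just wraps the built-in sorted), the Employee class is a stand-in, and unlike the GScript version nothing here is statically typed.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    salary: int


def sort_by(items, key_func):
    """Return a new list ordered by key_func(item); the input list is left alone."""
    return sorted(items, key=key_func)


employees = [
    Employee("Joe", 50000),
    Employee("Ed", 42000),
    Employee("Ann", 61000),
]

by_salary = sort_by(employees, lambda e: e.salary)
```

Note that this sketch returns a sorted copy; whether a given sortBy() implementation sorts in place or copies is a separate design choice.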

The “Server Side”

Now let’s take a look at the other side of the fence: the signatures of the methods used for sorting.

The signature of Collections#sort(list, c) is:

static <T> void sort(List<T> list, Comparator<? super T> c)

What the hell does that mean? Basically, it means you can pass in any comparator that is parameterized on T or a supertype of T. This is because the comparator is going to be invoked with objects of type T, so only comparators that take that type or *higher* in the inheritance chain will work. In particular, Comparator<? extends T> will *not* work, because the comparator might be expecting a subtype of T. This is a case of contravariance, which is fairly rare in type systems. We are used to the opposite, covariance, where a subtype of T is acceptable in place of T. Some programming languages, notably Scala, allow you to annotate type variables with the particular variance they allow, usually using a ‘+’ or ‘-’ sign.
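
If the "supertype is fine, subtype is not" argument feels backwards, it may help to watch it fail at runtime. The sketch below is in Python, which has no compile-time generics check, so the unsafe case blows up when the comparator runs instead of being rejected by the compiler. The Person and Employee classes and both comparators are invented for the illustration.

```python
from functools import cmp_to_key


class Person:
    def __init__(self, name):
        self.name = name


class Employee(Person):
    def __init__(self, name, salary):
        super().__init__(name)
        self.salary = salary


def compare_people(a, b):
    """Comparator written against the supertype: only touches .name."""
    return (a.name > b.name) - (a.name < b.name)


def compare_employees(a, b):
    """Comparator written against the subtype: needs .salary."""
    return (a.salary > b.salary) - (a.salary < b.salary)


# Safe: a Person comparator sorting Employees (the "? super T" case).
employees = [Employee("Zed", 10), Employee("Amy", 20)]
by_name = sorted(employees, key=cmp_to_key(compare_people))

# Unsafe: an Employee comparator sorting plain Persons (the "? extends T" case).
people = [Person("Zed"), Person("Amy")]
try:
    sorted(people, key=cmp_to_key(compare_employees))
    subtype_comparator_worked = True
except AttributeError:
    # A plain Person has no salary, so the comparator cannot do its job.
    subtype_comparator_worked = False
```

The Java signature exists to rule out the second case before the program ever runs.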

(As a side note, as I’ve said elsewhere, any non-trivial application of generics requires me to stop and put on my generics hat, and, ten minutes after I’ve understood and solved the task at hand, *poof*, it’s gone.)

Interestingly, since Java lacks any way to declare variance on a type parameter itself, internally Sun engineers have to cast the Comparators to the generic type to do comparisons. I’m not lying. Check out the implementation. The implementation of TreeSet is even better, with the developer putting in little /*-*/ comments in the places to indicate variance more correctly when he has to cast to the generic type.

So here we have a lot of syntax flying around to make sure sorting lists is absolutely, positively, 100% type safe. And, despite all that, the Sun engineers still have to crosscast to the generic type to make the damned thing work. All to prevent what has to be one of the rarest programming errors I can imagine: accidentally sorting a List with a bad Comparator. I’ve never seen that happen. I’ve never heard of that happening. I’ve never even heard it mentioned that it might happen. And yet all that complexity has been thrown at this problem in Java.

The signature of sortBy() is:

  function sortBy( value(T):Comparable ) : List

and it is injected onto List via GScript Enhancements.

The parameter is declared to be a block that takes an argument of type T and returns a Comparable object, which will be used to order the list. (We eventually delegate to Collections.sort() internally so that the sort occurs in Java, rather than GScript. Hey, we still want it to be fast, right?)

Note that Comparable is left generic because there is almost zero chance someone will screw a call to sortBy() up from a typing perspective, and if they did, the incomprehensible generics error message would be far harder to understand than the inevitable runtime exception that would occur. It just isn’t worth the complexity to specify it.

Yeah, so?

Well, to me, GScript throws complexity at the right part of the problem. It takes a common operation on Lists, sorting by some attribute of their component elements, and boils it down to its essence using closures. Closures, like generics, are somewhat complicated and can take a bit to get your head around. But once you grok closures in GScript you end up having to know less, not more, about the complexities of sorting. This is in stark contrast with the generics in Collections.sort(). That application of complexity doesn’t help developers much, if at all. The only time you will ever notice it is when it gets in your way.

The overriding theme of GScript was formally spelled out by Scott at the start of our Bedrock release: “GScript features are about making developers’ lives easier.” I’m convinced that if static language designers made pragmatic concessions to real-world developer needs and acknowledged the practical limitations of static typing, the current excitement around dynamic languages would temper dramatically.


It’s a little thing, but . . .

I just found myself writing the following tiny snippet of GScript:

print("Evaluating group with forms " + group.map( \ f -> f.Code ).join( ", " ))

It’s a minor thing, but in Java the map plus the join would look something like this:

String result = "";
for (int i = 0; i < group.size(); i++) {
  if (i > 0) {
    result += ", ";
  }
  result += group.get(i).getCode();
}

. . . or some variant thereof using StringBuilders, or possibly some ListUtils.map() function with an anonymous inner class followed by ListUtils.join() or something; either way it’s ugly and involves a lot of typing.  So ugly that I probably wouldn’t bother just for the sake of a debug message that I’ll delete in a day or two anyway. Being able to write it in one compact line of code is a tiny thing, but it’s a tiny thing that makes my life as a developer a tiny bit better.
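
For comparison, the same map-then-join pipeline is also a one-liner in other languages with first-class functions. Here is a hedged Python sketch; the Form class and its code attribute are stand-ins I made up for whatever the real form objects look like.

```python
class Form:
    """Hypothetical stand-in for the form objects held in `group`."""

    def __init__(self, code):
        self.code = code


group = [Form("A"), Form("B"), Form("C")]

# Map each form to its code, then join the codes with a separator.
message = "Evaluating group with forms " + ", ".join(f.code for f in group)
```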

This is really my first time writing an entire, complicated subsystem in GScript. It’s weird to say given that we’ve put so much work into the language, but the reality is that our system is largely structured as Java on the bottom and GScript on the top where we need configurability. Most of us bounce back and forth between the two languages, and most infrastructure is done in Java (and I mainly do infrastructure work here). This time around, though, I’ve managed to push an entire, complicated subsystem out into GScript, both because we can make it more configurable that way and because I think GScript is a better language to develop in.

One thing I’m finding is another obvious advantage to having less code; when I’m developing something new a given line of code has a lifespan of about 48 hours before I refactor or rewrite it, so being able to do things in as little code as possible lets me iterate much, much faster since there’s less to delete when I change my mind and building up the new version goes faster as well.

Just another example of how all those little things really add up.


Advanced Enhancements

In Keef’s post pointing out some problems with generics in Java, he made a quick mention of enhancements of generic types. It’s an interesting feature of enhancements in GScript so I thought people would enjoy a more in-depth explanation.

Often you want methods associated with a particular parameterization of a class. This is especially true of Collections. A canonical example is that, rather than typing:

  Collections.sort( myListOfComparables )

You would rather type:

  myListOfComparables.sort()

The rub is that sort() only makes sense for Lists of Comparable objects. It makes sense on a List<String>, but not on List<Object>. You can imagine other methods that are sensitive to the type of values that a Collection holds: min() and max() on Collections of Comparables, sum() on Collections of numbers, etc.

It turns out that GScript Enhancements let you put these methods where they belong. As an example, here is a simple definition of that sort() method:

enhancement GWBaseListOfComparablesEnhancement<T extends Comparable>
                    : List<T>
{
  function sort() : List<T>{
    Collections.sort( this )
    return this
  }
}

NOTE: sorry I had to wrap the enhancement definition. The : List<T> would normally be on the same line as the enhancement’s definition

So what is this saying? We are defining an enhancement that applies to all Lists of T, where T is bounded by the extends Comparable clause. The sort() method simply passes the enhanced object through to Java’s standard Collections.sort() method. This type checks because we know T extends Comparable.

With that enhancement defined, any list that is parameterized on a Comparable class will have a sort() method on it. Everything is nice and type-safe.

Nice.


Language Comparison – Properties

It’s generally accepted wisdom by now that giving client objects direct access to data stored on a class is a bad idea; it violates encapsulation, prevents you from changing implementations in interface-neutral ways, and prevents you from doing other things like lazy-computation or evaluation that might eventually prove necessary for the sake of correctness or performance. In addition, it’s often useful to have operations that look like simple field gets and sets but that actually do additional logic and aren’t tied to any particular data value.

Most modern languages, then, have adopted conventions and/or techniques around using methods to expose private data. In Java, for example, the standard approach is to make all instance variables private and to use get/set/is methods to access and manipulate the data instead. It’s a very useful convention, but it’s such a fundamental pattern that other languages have gone even further by making it possible to expose methods that look like variable accesses, allowing for variable-like accesses and assignments. The names aren’t the same across all languages, but I’ll use the term “property” to refer to get/set methods that are exposed as simple variable accesses, since that seems to be the most common term. So let’s look at how you go about defining and using properties in each language. The use case, for illustration purposes, will be a User object that has a Name property.

Java

Java doesn’t have built-in support for properties, so you have to define the getters/setters yourself:

private String _name;
public String getName() { return _name; }
public void setName(String name) { _name = name; }

Similarly, to access the property you have to call the method:

user.setName("Bob");
System.out.println("Hello, my name is " + user.getName());

The Java introspector, used for JavaBeans, will combine the getName() and setName() methods into a single Name property that is both readable and writable, so for the sake of reflective access using BeanInfo there is some concept of a property. Not at the source code level, though.

GScript

GScript has a first-class notion of properties, allowing you to declare things like so:

var _name : String
property get Name() : String { return _name }
property set Name(name : String) { _name = name }

Since defining a property to wrap a private variable is such a common pattern, there’s also a shorthand notation for that:

var _name : String as Name

Which is equivalent to the above declarations. You can also start by using the “as” syntax and then explicitly define the property getter or setter if you like. Using the properties is what you’d expect:

user.Name = "Bob"
print("Hello, my name is " + user.Name)

There are a couple of other things worth noting here. First of all, GScript’s naming convention is that properties start with a capital letter, which helps make clear that they’re properties and not instance variables. Secondly, GScript exposes Java classes much the same way the Java introspector does, so a Java class with getName() and setName() methods will, from GScript, appear to have a Name property instead of getName() and setName() methods.

Ruby

All instance variables in Ruby are private, and the only way to expose them is via a method. Ruby makes parentheses optional on method calls, allows white space in all sorts of places, and allows all sorts of interesting characters to be used in method names, so ‘=’ is actually a valid character in a method definition. So you could do things in a brute-force manner like:

def name
  @name
end

def name=(val)
  @name = val
end

Note that in the second case, the = character is part of the method name; including = in a method name makes it possible for it to appear on the left-hand side of an assignment statement. I don’t actually know Ruby well enough to say if that’s built in to the parser or just a result of the fact that parentheses are optional and whitespace is permitted in interesting places.

Since this is such a common pattern, Ruby has built-in methods for creating what Ruby calls “attributes,” their name for the property concept. The methods are just like any other Ruby method and use metaprogramming to dynamically create the getter/setter methods. So the above code could be simplified down to:

attr_accessor(:name)

Note that this makes use of Ruby’s concept of symbols (that’s what :name is; the ‘:’ character at the start denotes this as a symbol), which are used all over the place for metaprogramming in Ruby.

Accessing the attributes in Ruby looks like what you’d expect:

user.name = "Bob"
puts "Hello, my name is #{user.name}"

Python

Python’s notion of properties is a bit more complicated; in fact, despite reading through all the chapters on object-orientation in Dive Into Python I didn’t know it existed until I read through the Django Database API documentation and clicked through to see how they managed to make their foreign-key references work.

It turns out that Python’s support for properties is via a general-purpose mechanism called Descriptors. Descriptors are used for much more than just properties, but in the context of properties they’re a bit hard to understand since they turn things kind of on their head. A descriptor is an object that has special methods that define what to do when the object is treated like a variable and is accessed, set, or deleted (since that’s possible in Python). So instead of defining getName/setName methods on the User class, in Python what you do is create a descriptor object that’s bound to the class-level “name” variable, and then when that class-level variable is accessed or assigned to the methods on the “name” descriptor object will be invoked. Don’t feel bad if your response to that is “Huh?”
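
If that was a "Huh?" moment, a hand-rolled descriptor may help. This is a minimal sketch, not how you would normally write it: NameDescriptor is a name invented for the example, and it assumes the value is stashed on the instance under _name. The __get__ and __set__ methods are the special hooks Python calls when the class-level name attribute is read or assigned through an instance.

```python
class NameDescriptor:
    """Hypothetical hand-written descriptor that stores its value as instance._name."""

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself (User.name): return the descriptor.
            return self
        return instance._name

    def __set__(self, instance, value):
        instance._name = value


class User:
    # The descriptor object is bound to the class-level "name" variable.
    name = NameDescriptor()


user = User()
user.name = "Bob"        # calls NameDescriptor.__set__(user, "Bob")
greeting = "Hello, my name is %s" % user.name   # calls NameDescriptor.__get__
```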

There’s a built-in method called “property” in Python that lets you create a descriptor given the function parameters, so to define the “name” property on the User class in Python you would do:

def get_name(self): return self.__name
def set_name(self, value): self.__name = value
name = property(get_name, set_name)

So what this means is that name is an object, and that object will call back to the get_name on User when it is accessed and to set_name when it’s assigned to. It gets the job done, but not as elegantly as in Ruby or GScript in my opinion; it feels like the bolts are showing a little bit.

The use of the property, though, looks like what you’d expect:

user.name = "Bob"
print("Hello my name is %s" % user.name)

Language Comparison – List Mapping

Continuing with the language comparison theme, I’ll look at another common list processing operation: mapping a list to another list. Say I’ve got a list of user objects and I want to turn that into a list of user names instead. Here’s how it looks in each of the languages.

Java

Using the standard for-each approach:

List<String> userNames = new ArrayList<String>();
for (User user : users) {
  userNames.add(user.getName());
}

Or, using a functional-style library:

List<String> userNames = ListUtils.map(users, new Mapper<User, String>() {
  public String map(User user) {
    return user.getName();
  }
});

Note that in this case, using the functional library actually ends up using an extra line, depending on how you format your curly braces. Either way, it’s not concise and it’s not pretty.

GScript:

var userNames = users.map(\user -> user.Name)

Ruby:

userNames = users.map{|user| user.name}

In Ruby, there’s always more than one way to do things, so “map” and “collect” are synonyms for the exact same function. In addition, there are destructive versions named map! and collect! that modify the array in place instead of creating a copy.

Python:

userNames = [user.getName() for user in users]

This is an example of a list comprehension, which is basically map + filter + iteration built in to the language (though my guess is that it’s not the most efficient way to iterate a list, since it creates a new list . . . I could be wrong there, though). Python does also have a built-in map() function that takes a function and an iterable, but the idiomatic Python way is to use a list comprehension. My Python knowledge is also letting me down here; I haven’t seen any language-level support for properties in Python like there is in Ruby or GScript, so I’m assuming that you’ll have to actually perform a method call here to get the name out.
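
For what it’s worth, Python does have a built-in property mechanism, so the method call isn’t strictly required. Here is a sketch assuming the User class exposes its name through the built-in property decorator; the class itself is made up for the example.

```python
class User:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        # Read-only property: accessed as user.name, with no parentheses.
        return self._name


users = [User("Bob"), User("Alice")]

# The comprehension reads the property like a plain attribute.
user_names = [user.name for user in users]
```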