Why There Have Been No Posts Lately…

I apologize for the extended absence of posts here on the dev blog, but I’ve been busy with a new Open Source project:

  http://jschema.org

The JSchema Project is an attempt to define a very simple schema system for JSON documents. JSON, for those who are not familiar, is a data format based on a subset of JavaScript syntax that has proven to be a useful and less verbose alternative to XML for data interchange:

 {
    "group" : "Developers",
    "members" : [
      { "id" : 1, "first_name" : "Joe", "last_name" : "Smith" },
      { "id" : 2, "first_name" : "Jennifer", "last_name" : "Mitchum" }
    ]
 }

It is easy to parse and produce, and it integrates well with newer front-end technologies written in JavaScript.

The motivation for this project, it will not surprise you, was Gosu. An excellent engineer at Amica, JP Camara, had been playing around with a JSON-based Type Loader for Open Source Gosu. I contacted him about adopting a schema-based approach, and began looking at the industry standard, JSON Schema. It quickly became apparent that JSON Schema was too complicated, and that a small extension to JP’s simple and elegant template approach would yield an expressive schema language, with a Gosu Type Loader to boot.

To give you a taste of how simple JSchema is, here is the schema for the JSON example above:

 {
    "group" : "string",
    "members" : [
      { "id" : "integer", "first_name" : "string", "last_name" : "string" }
    ]
 }

I hope that schema makes intuitive sense to most people: JSchema was designed such that the schema corresponds closely to the documents it describes, and it should be easy to take an example JSON document and transform it into a JSchema schema.
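To make the "example document in, schema out" idea concrete, here is a minimal sketch in Python. It is not the JSchema reference implementation or the Gosu Type Loader: the function names `infer_schema` and `conforms` are made up for illustration, the "boolean" and "decimal" type names are assumptions, and so is the rule that a document may omit fields but may not add unknown ones. Null values are not handled at all.

```python
import json

EXAMPLE = """
{
  "group" : "Developers",
  "members" : [
    { "id" : 1, "first_name" : "Joe", "last_name" : "Smith" },
    { "id" : 2, "first_name" : "Jennifer", "last_name" : "Mitchum" }
  ]
}
"""


def infer_schema(value):
    """Replace every leaf value of an example document with a type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"  # assumed type name
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "decimal"  # assumed type name
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        if not value:
            raise ValueError("cannot infer the element type of an empty array")
        # The first element serves as the template for the whole array.
        return [infer_schema(value[0])]
    if isinstance(value, dict):
        return {key: infer_schema(item) for key, item in value.items()}
    raise ValueError("unsupported value: %r" % (value,))


# Checks for the leaf type names used in this sketch.
LEAF_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "decimal": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
}


def conforms(value, schema):
    """Return True if value matches the template-style schema."""
    if isinstance(schema, str):
        check = LEAF_CHECKS.get(schema)
        return check is not None and check(value)
    if isinstance(schema, list):
        # A schema array holds exactly one element describing every item.
        return (
            len(schema) == 1
            and isinstance(value, list)
            and all(conforms(item, schema[0]) for item in value)
        )
    if isinstance(schema, dict):
        # Assumption: missing fields are fine, unknown fields are not.
        return isinstance(value, dict) and all(
            key in schema and conforms(item, schema[key])
            for key, item in value.items()
        )
    return False


document = json.loads(EXAMPLE)
schema = infer_schema(document)
print(json.dumps(schema, indent=2))
print(conforms(document, schema))
```

Run against the example document above, `infer_schema` produces the same structure as the schema shown earlier, which is the point: the schema is the document with its values swapped for type names.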

The JSchema specification is still very young, but I think it is reasonably complete. We may add some more core data types (a ‘bytes’ datatype for raw byte data seems like it might be useful, for example), but the core ideas are working out well in our Gosu implementation (I’ll blog more about that in a later post).

If you are interested in the specification and/or participating in its design (or making the website prettier), you can fork it here. Participation is very welcome!


16 Comments on “Why There Have Been No Posts Lately…”

  1. Fred says:

    Why are all the number types "integer" | "decimal" | "biginteger" | "bigdecimal" there?
    I thought JavaScript had only the Number type? Thanks

  2. Carson Gross says:

    That’s true of JavaScript, but it seems like other languages (and in particular, Gosu) make enough of the difference between 64-bit vs. unbounded precision and floating point vs. integer values to warrant the four different data types.

    If a JavaScript implementation treated them all the same, that wouldn’t be technically correct but, hey, you are using JavaScript. 😉

    JSON places no bounds on the size or precision of numbers, and I thought it would be right for people to offer very precise values (e.g. BigDecimals in Java) while not paying the performance overhead for, say, integer values.

    • I’d prefer to have only “number”, or perhaps “decimal” and “integer”. XmlBeans for instance always maps to BigInteger or BigDecimal. Groovy always instantiates decimal literals as BigDecimals.

      Considering that most web applications are IO bound, performance overhead shouldn’t be taken into account IMO. After all, we can’t solve everyone’s problems 🙂

      • Carson Gross says:

        Thanks Ricardo, I’ve come around on that, and I updated the spec a while ago to only include ‘int’ and ‘number’, taken directly from the JSON grammar.

  3. I’ve written a somewhat similar system for work, for a system we’re using to produce, describe, consume, validate and present JSON documents with as little configuration as possible. I ended up specifying the types of fields as arrays, in order of less specific to more specific, with the stipulation that a value must be able to be described at its most basic level as either a blob, text, a number, a list or a hash (these are what I’ve called “Category I” types, named blob, text, numeric, list and hash respectively). There can be an arbitrary number of types attached to a field, but the least specific of them must be a Category I type.

    This is easier with an example: https://gist.github.com/6e7a26106e06bf540999

    In that example, the document (blog-post.json) is described by the schema file (schema.json). Applications can define their own types without breaking the ability for applications that don’t understand them to present documents described with them. Applications that do understand them, however, can present more specific user interfaces and perform more accurate validation on documents.

    (Note: I’m at a McDonalds right now, so this may be a little disjointed; I’ll come back and clarify some points in a few hours…)

  4. Joby Taffey says:

    Have you considered any way to specify signed vs. unsigned and fixed point numbers?

    Eg. the fractional component of a fixed point integer might only be allowed to step in chunks of 1/256.

    • Carson Gross says:

      I figure that the ‘bigdecimal’ type can be used to represent exact fixed point numbers. That’s what we use for things like currency amounts internally, for example.

  5. Florin says:

    Do you have a schema validator already?

  6. ldfskjsdfkl says:

    Why does the schema have to be valid JSON itself?
    Not saying your example looks bad; it’s actually pretty OK compared to many other proposals I’ve seen. I just want to encourage some thinking outside the box if that would create a better solution.

    • Carson Gross says:

      The idea is to have the schema document look very close to the described documents. This, I think, makes it intuitive and obvious what the described documents will look like. It also makes it very easy to take an example document and convert it into a schema. Finally, I’ve always wanted to use the word “homoiconic” to describe a project I’m involved with, even if I’m using it incorrectly.

      Since I don’t expect rapid adoption of the proposal, I figured it would be nice if it was easy to take the current “schema” standard (which is to provide a sample document) and convert it yourself. In fact, we’ll be providing a tool in the Gosu library that automates this to some extent; JP already does this internally in the Gosu Type Loader.

    • Kissaki says:

      Well, the schema will have to be parsed as well, so what reason is there to use a non-JSON format?
      After all, JSON is a good, readable, parseable and simple format.

  7. Kissaki says:

    Did you decide or think about how to handle optional fields?

    Sometimes you may want to implement robust applications that will handle any subset of the schema-defined data, even if it is correct error output,
    but other times you may very well want to explicitly define which fields are optional, so you can be sure some are available. Especially when you’re not just providing a callable API but other people generate data with your schema as well.

    • Carson Gross says:

      I was unable to come up with a way to specify that a field was optional or required that didn’t involve either encoding additional information in the type names (which is really a dodge around the grammar restriction I put on myself) or sacrificing the homoiconicity of the schemas (beyond what I had to do for enums).

      It’s a hard feature to omit, but I take solace in the fact that languages have been productively used for many years now with nullable values (despite all the problems) and that comments can be used to communicate additional constraints to clients.

  8. Ben says:

    Hi, I just wanted to know, out of curiosity, whether it’s possible to do a left outer join or right outer join in Gosu. Please excuse me, since I was not able to find a better place to ask this.

