Why I don’t consider Programs to be Models

I attended Zef Hemel’s PhD defense in Delft earlier this year. Congratulations, Zef! Zef defended his thesis on Methods and Techniques for the Design and Implementation of Domain-Specific Languages. It was my first experience with a traditional European-style thesis defense, complete with robes, a large silver staff, and a formalized interrogation of exactly one hour. I had a great time. You can’t see me in his photo of the event, because I am directly behind Zef and Eelco. Unfortunately, when it came time to interrogate Zef, my first question was “What is a model?” It seemed like fair game, given that the word appears in the thesis title. On second thought, I realize that anyone would struggle to define the word.

Zef also has an active blog, “I am dr. Zef”. I seem to remember that it used to be called “I am Zef”, but I’m not sure. One of his posts is “Programs are Models”. This idea is consistent with the views of many people in the model-driven community. The mantra is that “everything is a model”, but there are high-level models that describe the system clearly and low-level models (aka code) that can be executed. You get from high-level models to low-level code models by applying a transformation. Transformations are also models. Since everything is a model (including Java source code and the transformations between models), it all works great.

Zef points out that high-level to low-level transformations have typically been called “compilers”. He also argues that internal/embedded DSLs are favored in industry and are a better way to go. One small point is that embedded DSLs are quite popular in academia too, especially among those working with Haskell.

The debate about the merits of internal versus external DSLs is far from over. Both have strong advantages and significant disadvantages. For example, internal DSLs tend to have very poor error messages and debugging abstractions. They can also be difficult to analyze, because they are mixed with general-purpose code. External DSLs require a lot more tooling, as he points out.

My main point here, however, is that I prefer to not think of programming languages as modeling languages. The reason is that, for me, a modeling language must be about what behavior is desired, not how to implement that behavior. This is the difference between a regular expression that concisely describes a pattern and code that implements the steps to recognize the pattern. As I have said before, models are descriptions written in an executable specification language. Programming languages do not operate at the level of “specification” so they cannot be modeling languages.
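
To make the regular expression contrast concrete, here is a minimal sketch in Python. The pattern (a simple decimal number) and the function names `matches_spec` and `matches_by_hand` are illustrative assumptions, not taken from any of the systems discussed here. The first function states what a match looks like; the second spells out how to find one.

```python
import re

# The "what": one or more digits, optionally followed by a dot and more digits.
# (Illustrative pattern, chosen only for this example.)
DECIMAL = re.compile(r"[0-9]+(\.[0-9]+)?")


def matches_spec(text: str) -> bool:
    """Check the text against the declarative description."""
    return DECIMAL.fullmatch(text) is not None


def matches_by_hand(text: str) -> bool:
    """The "how": the same language, recognized step by step."""
    digits = "0123456789"
    i, n = 0, len(text)

    # Consume the integer part; at least one digit is required.
    start = i
    while i < n and text[i] in digits:
        i += 1
    if i == start:
        return False
    if i == n:
        return True

    # Anything after the integer part must be a dot plus at least one digit.
    if text[i] != ".":
        return False
    i += 1
    start = i
    while i < n and text[i] in digits:
        i += 1
    return i > start and i == n
```

Both functions accept the same strings, but only the first can be read as a description of the pattern. Changing the pattern means editing one line in the first case and rethinking the control flow in the second.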

One consequence of this decision is that a model-to-programming-language transformation is not really a model-to-model transformation, because programs are not models. I believe that transformations between high-level modeling languages are a fine idea, but using transformations to generate code is a bad idea. The Enso team is investigating a view of model-driven development that is based entirely on interpretation, with no explicit code generation at all. This is a good discussion of some of the issues, but I don’t think it touched on some of the more fundamental questions. For example, my working hypothesis is that, when combining multiple languages, it is easier to compose, modify, and extend interpreters than it is to compose, modify, and extend compilers.

Enso is built on the following principles and strategies:

  • External DSLs, not internal/embedded
  • Transformations are essential, but not for generating code
    • Grammars are models that define bi-directional transformations between models and text
    • GUIs are models that define bi-directional transformations between models and presentations
  • Interpretation, not code generation
    • Interpreters are written using code. Code is good!
    • The interpreter language must be able to access/modify models easily, as if they were the native data of the interpretation language
  • It is never the case that “Everything is an X”
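
The “interpretation, not code generation” principle can be illustrated with a small sketch in Python. This is not Enso code: the turnstile model, its dictionary layout, and the functions `interpret` and `generate` are all hypothetical, chosen only to show the two approaches side by side on a model that is ordinary data.

```python
# A hypothetical model: a turnstile state machine described as plain data.
TURNSTILE = {
    "start": "locked",
    "transitions": {
        ("locked", "coin"): "unlocked",
        ("unlocked", "push"): "locked",
    },
}


def interpret(model, events):
    """Interpretation: walk the model directly at run time."""
    state = model["start"]
    for event in events:
        # An event with no matching transition leaves the state unchanged.
        state = model["transitions"].get((state, event), state)
    return state


def generate(model):
    """Code generation: emit Python source with the same behavior.

    Assumes the model has at least one transition, so the loop body
    in the generated source is never empty.
    """
    lines = [
        "def run(events):",
        f"    state = {model['start']!r}",
        "    for event in events:",
    ]
    keyword = "if"
    for (src, event), dst in model["transitions"].items():
        lines.append(f"        {keyword} state == {src!r} and event == {event!r}:")
        lines.append(f"            state = {dst!r}")
        keyword = "elif"
    lines.append("    return state")
    return "\n".join(lines)
```

With the interpreter, the model remains the only artifact: code can inspect or modify `TURNSTILE` as native data and the behavior changes immediately. With the generator, there is a second artifact (the emitted source) that has to be regenerated and kept in sync whenever the model changes.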

In conclusion, I have to say that I agree with Zef that the most interesting work on modeling is being done in projects like Ruby on Rails, Play, jQuery, etc. Note that even these systems use a blend of internal and external DSLs. One other thing they have in common is that most of them don’t spend a lot of time generating code, but interpret models directly. Industry people are making great progress using the tools that are at hand (especially dynamic languages), but that doesn’t mean there can’t be a better way to do things.


Comments (2)

  1. dmbarbour wrote:

    I hold a very precise middle-ground between “interpretation” and “code generation”. I strongly favor staged programming. The “stages” in question may be described by objects rather than program text, but they still offer opportunity to prove properties of the code and optimize prior to execution.

    Your views on transforms are interesting, but I think you assume a closed system.

    It is never the case that “Everything is an X”

    But that’s untrue. Everything is a thing. (Attack the utility, here, not the validity! 🙂

    Monday, September 17, 2012 at 9:30 pm
  2. w7cook wrote:

    Staging is an interesting intermediate point. I tend to include it on the code generation side, because it involves explicit manipulation of representations of code.

    As for “Everything is an X”, good point! I should say “It is never useful to assert that everything is an X”. Is there a more pithy way of saying it?

    Tuesday, December 25, 2012 at 2:52 pm