Gérard Berry
We Speak Your Language S01 E04
Intro: Hello, welcome to ‘We Speak Your Language’ The podcast for computer language geeks and nerds. This fourth episode is hosted by Darius Blasband and is brought to you by Raincode Labs. Enjoy!
Darius: I am Darius, and I will be your host today, here at “We Speak Your Language”, the podcast for programming language nerds.
There are many angles to choose from to introduce today’s guest Gérard Berry, but rather than go through the entire list of his amazing achievements, I just can’t resist the opportunity to start this with a tangent of sorts.
The Shadoks (spelled with one d and one k) are birds, or according to Wikipedia, bird-like creatures. They are the subject of an animated cartoon that was broadcast on French national TV between 1968 and 1974. The Shadoks lived in a two-dimensional universe, they were ruthless and pretty stupid, but more than anything else, they were sensational, in the most original sense of the word. They created a sensation.
Nothing like that had been shown on TV before and even though they haven’t been around for over 45 years now, they have impregnated the French psyche. Their odd, made-up vocabulary have entered dictionaries as a tribute to their status of icons of French pop culture.
And so what? I hear you whine. What do Shadoks have to do with anything?
Please bear with me. We’re getting there.
Gérard Berry was presented to me as a rock star, as being super smart, and as having a fantastic sense of humor. And the token, the absolute symbol of his sense of humor was his willingness to give lectures, but not just any lecture, one of these formal, official, all dressed-up events that are so serious and so French, to which he titled this lecture, “Science and conscience of the Shadoks”.
Gérard, thank you so much for joining us today. The downside of such an introduction is that it doesn’t say much about you. So, could you please present yourself?
Gérard: I started in 1968 at the Ecole Polytechnique in Paris as an amateur with a very strange computer and I loved it right away. I instantly became interested in programming and languages. Then in 1970 I joined the School of Mines in the Informatics Lab, and we developed a language called TIF, “Traitement et interrogation de fichiers” which means in French how to handle and compute on files. It was a kind of Excel, but on taped files. It was fun, but it didn’t go anywhere. I then started doing mathematics, essentially mathematics of programming, since we had no computer for research in France because of a US embargo, and so we had to do theory instead. I started in the semantics of formal languages, which ended up being my main subject over my entire career.
This started when listening to Dana Scott in 1970 who invented the first denotational semantics, and then I made quite a lot of contributions to a completely new semantics of sequential programming. This resulted in the 1980’s in a language called CDS for Concrete Data Structures, which I used to call the language of the 90’s (not specifying the century).
This language was quite nice. It was a high order language, inspired by ML (In 1972, I was happy to have Robin Milner as a private teacher for two weeks at Stanford, on functional programming and formal verification), but much richer semantically. We built a compiler but didn’t have much success. It was novel and good in theory, but it didn’t interest programmers that much.
I then changed my life by working with control theory people, the people who drive cars or airplanes, etc. I realised that classical programming language like C, which was well-known at the time, looks adequate at first sight but are totally unable to program these kinds of things, for many reasons that I will explain later. Parallel languages were already around, with CSP by Tony Hoare, as well as theoretical process calculi by Robin Milner like CCS (Concurrent Communicating Systems). This was all nice mathematics, but I was shocked by the fact that these languages chose to ignore time. They are asynchronous: if you have two programs running in parallel, there is no time relationship, nothing concerning temporal relations between these two programs. This is natural for distant machines that communicate through networks, but not at all for process control.
But somehow, these languages were always disappointing to me. They are complicated, there are too many of them, none really dominate. There is some more recent progress with languages such as Scala, but there are still lots of intrinsic difficulties with these languages. My goal was to program cars and airplanes where you absolutely need timing precision. You need real-time languages.
We realized that the people working in the real-time field were completely pedestrian, they had no theory, nothing. The only thing they cared about is the interference between the time of computation and the time of the process. They used asynchronous programming, for instance in ADA, which was called a real time language although there was nothing real time about it. When we say “wait two seconds” in ADA, if you look at the semantics, the only thing it definitely does not do is wait exactly two seconds! I was shocked by that. It was not what we wanted. I knew mathematics very well, which is my primary field. In control theory, time is very simple. Time is the same everywhere conceptually, and things that talk to each other do so in controlled time. So, I thought we had to do the same in real-time computing.
Thus, we invented a language called Esterel – It is a joke in French since it sounds like “temps réel” which is ‘real time’ in French, and because Esterel is a beautiful red mountain range near Sophia Antipolis where I worked. It’s very important to give a good name to a language so that people remember it, even if they don’t know what it is.
Darius: That was one of my questions, where the name Esterel comes from, but you volunteered your answer so that question is off, thank you.
Gérard: We have to simplify the problems. People tend to make small problems very complicated, which is particularly true in programming. I was at Bell Labs for quite some time and I was a user of C++. C++ looks simple when you start, but when you develop with it, it gets very complicated, it changes all the time. I decided instead that I wanted to start from mathematics and make a language with mathematical foundations, mathematical compiling, and mathematical proofs of correctness – both for the compiler and for the user. The latter was the biggest challenge, because we were working at the time for Dassault Aviation, who were building the Rafale airplane. They had a lot of real time processes, which were critical since they had only one actual plane – the demo airplane, which you should not crash. It was very sensitive to bugs, and I was very frightened by the fact that most computer scientists and programmers really are insensitive to bugs. They often find that a bug is a normal, acceptable thing. It is true, bugs are a normal thing we easily do in software, but when you fly on a plane, a bug is not a normal thing. It is necessary to be bug free as much as possible.
We had to change the basic hypothesis, and I decided it had to be very simple. Looking at the interference between process time and computation time makes it really complicated. Let’s assume instead that the program computes infinitely fast. Let’s assume the computers are perfect, which is what you do in control theory: when you make claims about a value z at time t, for instance, zt=xt+yt, you don’t count the time it takes to compute ‘+’. So, if we generalise that and assume that we have an infinitely fast processor, how can we program it? Things get simpler: programs run by a sequence of zero-time reactions, where concurrent processes discuss in zero time.
Typically, an airplane computer system computes one reaction every hundredth of a second. That makes for one hundred reactions per second. At each reaction, it reads new inputs from sensors and pilot commands and react in function of these inputs and of its current state. It then computes the new output, does this entire cycle repetitively. It is just a state machine.
People tried to do that with state machines, but large state machines are inferno for humans.
If you have 25 states, it is already a disaster, but we had to deal with state machines made of tens of thousands or millions of states. So we discovered that our synchrony hypothesis was able to scale to the programming of huge concurrent networks of automata. We built our Esterel v2 and v3 compilers by computing the global deterministic automaton generated by the concurrent network, which did work well in many cases but could lead to generated code explosion. Esterel v3 was made industrial and use by some companies, mainly Data Aviation.
A little later than the birth of Esterel, Paul Capis’s and Nicolas Halbwach’s team in Grenoble developed another synchronous language called Lustre. While Esterel was control-flow oriented, Lustre was data-flow oriented, which made the languages quite different although rooted on the same principles. It also had a mathematically defined semantics, but a simpler compiling process (a team in Rennes developed a similar language called Signal, but with less success). Lustre was also developed industrially and used by Airbus. Airbus used two languages for developing airplane control system. One was SAO, a language of their own, the other was Lustre. SAO was complicated with many primitives or kind of C-written subprograms, but Lustre was awfully simple, with basically five stream-based primitives, and it also led to a simpler implementation. It was mathematically well defined, and you could build complex programs out of Lustre subprograms that you directly understand – no C needed.
And we discovered right away that the hard-to-detect run-time deadlocks of asynchronous completely vanish in the synchronous world, since all of them can be detected at compile time, a huge practical advantage. Basically, the compiler has to be quite smart. It is a direct derivation of the underlying math.
Then, when working with people in formal verification, I discovered that I was actually doing hardware and not only software : since synchronous concurrency is the rule in clocked circuits, Esterel was much easier to translate into circuits than in software ! This was done in 1991 with Digital Equipment and then with Intel and Synopsys, which are big circuit and CAD companies, as well as with Cadence Design Systems for system-level circuit design. To everybody’s surprise, we showed that there is no difference between designing a digital circuit and programming an aeroplane, it’s basically the same thing. Of course in an aeroplane there are floating point numbers that you don’t have in circuit outside their specific ALUs, but the control logic is the same.
After having developed a new Esterel v7 compiler based on this new formalization, we started a company named Esterel technologies. We sold the compiler to both software and hardware companies. But we had to adapt the terminology because these two audiences don’t use the same words! That was fun, and our designs performed invariably better than human beings’, for a good reason: humans are incapable of computing big Boolean formulas. We used new techniques such as BDDs (binary decision diagrams) that could do that efficiently. This that made our compiler very efficient but quite complicated because circuits and software control logics are very demanding. Furthermore, BDDs also allowed us to perform large correctness proofs about Esterel programs, which was used extensively for both software and hardware.
This compiler was killed by the 2008 crisis, because most the start-ups specializing in hardware disappeared almost overnight. But this effort continued for software with a mixture of Lustre and Esterel. Now it’s a new language called Scade 6 with a new compiler, also mathematically based. Which is now the standard in certified high security software. It is much easier to certify an application because SCADE 6 is mathematically defined and because its compiler is itself certified as the highest avionics level, which makes it possible for customers to certify only their source SCADE code instead of the generated assembler as usual. The industrial compiler is programmed in Caml, extremely rigorously, and so people who want certification love it.
Darius: Could you put dates on when you started the design of Esterel?
Gérard: The first idea came in 1982, first publication in ‘83, first v0 prototype of the compiler in ‘84, then the first v2 real compiler in ’86, first industrialization v3 in the early 90s, hardware+software by v4 and then v5 in ’91-93.
The circuit view was a big surprise, in fact it was invented when I was in India. I go to India very often for photography and work. There I understood it had to do with the interplay of Brahma, Vishnu and Shiva, the Indian god trilogy! By discussing with my Indian colleagues I realised that that there are three kinds of wires in the compiled circuits. There are wires that start each subprocess (Brahma), wires that allow them to continue (Vishnu), but if you put 0 in the latter, the currently active controlled subprocesses are stopped right away (Shiva). Then you also have wire that kill other concurrent subprocesses (Shiva again). This is exactly what’s happening in the circuit as well as in its software version.
Darius: But the way you describe the history, you started with common programming languages and then you went to control theory and seem to have invested a lot of time in it. In that sense, and we’re talking terminology, so it doesn’t have to be very rigorous, but would you qualify Esterel as being a domain specific language? Or is it wider than that in your view?
It is domain specific language with a lot of flexibility, because it can be used in different areas. For example, Manuel Serrano has developed, a multi-tier web language called Hop. Hop is very nice because code distribution on distant machines is automatic from a single artefact, but it is still completely asynchronous. But there are things you should do synchronously because it is so much simpler, like GUIs. There is already a lot of GUI programming, but all either sequential or asynchronous. But, as a user, you want that to see no time in the GUI, you want it to be synchronous. And Dassault had already implement very complex airplane GUis in Esterel.
So, (from 2017 to 2019) we developed HipHop, which is essentially Esterel in Javascript.example. In 2020, Manuel and I published a PLDI paper on the Hop/HipHop programming of medical devices such as pill boxes adapted to complex medical protocols, which ensure that you take your pill in the correct time window and raises an alarm if you don’t. We looked at the problem, and said ‘yes, it looks exactly like an airplane GUI’. HipHop has a syntax similar to JavaScript and Hop, because people think they understand a language as soon as it has such a syntax, even if that’s not quite true! HipHop simplifies programming of complex state machines and interfaces a lot, for GUIs, complex network events control etc, where event JavaScript handlers rapidly become messy, it is translated into pure JavaScript by an Esterel v4-like compiler written in Hop. Compared to JavaScript or Hop, you cannot program what you would usually program with them, but you can quite easily program things that you really, really cannot program easily without it. For example, when you write a big automaton by any standard mean, whenever you make a change, it is a nightmare, while it is most often much simpler in HipHop.
After my research career, I was a professor at Collège de France, which is the main institute in France with around 45 professors, each having a very strong duty to never teach the same thing twice. The idea is for each yearly course to be entirely new, which is very nice for the students, but as the professor, it’s only nice in the beginning! There, I decided together with my postdoc Lionel Rieg to make a formal proof of the Esterel-to-circuits compiling algorithms. Using Coq, which is a hard but marvellous system for mathematical proofs, he proved correct the compiling algorithms, not quite up to circuits, but very close.
Darius: I guess this is connected to work I’ve seen published; I think was Xavier Leroy who validated a core C compiler using Coq as well for computation algorithms?
Gérard: Yes, Xavier Leroy is an extraordinary guy. He became a professor at Collège de France the year before I retired. Xavier made a change in full compiler verification, a change in dimension. But before Esterel was developed, most of its semantics were developed with one of my PhD students named Georges Gonthier, who won a gold medal in the international Olympics of math. His contributions to Esterel compiling were also absolutely major. Then, he was the first in the world to do huge mathematical proofs, using Coq and developing many libraries and tools for it. When I say huge, I mean theorems that have taken 30 years to prove in the past, such as the 1963 Feit-Thompson theorem on the classification of odd groups, whose proof takes 250 pages in its compact version ! With Coq, and with his team, they developed the proof sketch as a functional program, which was long and hard. Now, you hit return for the automatic proof verifier to just say “OK” !
For CompCert, Xavier Leroy used the technology and the mental change brought by Gonthier, and we use the mental change of Xavier Leroy as well for Esterel. There is now a big project in America called DeepSpec which tries to generalise this to networking, to operating systems, etc. To me, this new view is really a game changer around the world.
Darius: How about acceptance? We language people know that designing the language side and implementing it is challenging, but it is the fun part, and then we bump into people that we have to convince to use the language. How did that go with Esterel?
Gérard: It was hard to convince the classical informatics science society. They don’t care because they want languages that look just a little better than the classical ones. Languages are very hard when they must integrate many different ideas. It is hard to make them work together.
It shows in object-oriented programming. I remember discussions about simple vs. multiple inheritance in C ++. This looks easy, but it’s actually very hard. Object-oriented programming is still not well understood, while functional and synchronous programming fare much better, as we understood the math before proving the language. In summary, with the software people, not much interest in general
But engineers were very interested : now, there are more than 300 companies or doing cranes or trains or planes, submarines, rockets, even car components with SCADE 6. They realize it is important to make good software, which is not really the default case now. Most computer scientists don’t even know anything about control theory, it is an unfamiliar landscape for them, but it is familiar for people who drive complex physical systems. Control theory is a mathematical science in its own right.
Darius: This acceptance by people from the field more than by computer scientists would suggest that this is indeed a DSL. It’s something that is designed for a domain and people in the domain recognise themselves in the concepts that are being exposed and people that are more on the pure computer science side of things do not necessarily see the interest. Is that correct?
Gérard: Yes, but is is more than a usual DSL that tackles a part of a problem, like Yacc for parsing. Yacc changed the world in the definition and use of grammars and EBNF, and of course, the same goes for Lex for regular expressions. We build DSLs to improve on common tasks that you often find in programming but are not easy with classical languages.
Esterel it is a DSL for tasks which are much larger and not common for computer scientists but are common for the engineering community. In hardware, people realised that we could perform efficient circuit synthesis from Esterel using very fancy automatic optimization techniques, with results invariably better than what they could do by hand, and much easier to check because they could perform formal verification with tools integrated in the language’s IDE. And users directly used formal verification a lot, even for software. For instance, there was a large critical program on the Rafale airplane that drives all the maintenance and test systems of the aeroplane, both on the ground and in flight. On the ground, you should move the ailerons to maximal extension several times to ensure they work, but you should never do that in flight, otherwise you’ll crash instantly. They wanted to prove mathematically and automatically that you cannot do that in flight. For this, they had to remove their operating system and original asynchronous task-based program and replace them all by a single big Esterel deterministic test management code. (I forgot to mention that Esterel is concurrent but deterministic, and the same goes for Lustre). In 93, they proved that it was impossible to trigger ground tests when in flight. The converse is not too dangerous, but this was critical, because you could lose the airplane in the process.
Darius: When preparing for this interview, the easy part when talking to someone like you is that Esterel is used by many people. I went through my network, and I spoke with people I know and asked for their opinion about Esterel. You’ll be pleased to know that they pretty much concurred with what you say, that there is no way that those critical systems could have been as safe and as reliable as they were if not for Esterel. And then I asked, for the sake of conversation, what is your biggest complaint? If there is something I should talk about with Gérard regarding Esterel, what should that be? And it is really interesting because they came with two things, and I would like your opinion on those. One is data typing, I really want to understand what is your take on Esterel’s typing capabilities.
Gérard: Typing was crude up to Esterel v7, were we had a complex type system, with automatic bit-level sizing of the variables based on a static algorithm. For example, in hardware, wires and gates are expensive, you don’t use 32 bits integers systematically, but compute the minimum bit size needed by the application. Neverthelless, data typing was still sort of a secondary question for us compared to control specification. But Louis Mandel developed the Reactive ML language, which embeds part pf Esterel info Caml.
Darius: The second point sounds like a non-issue, I know we live in a very imperfect world and apparently minor issues can make peoples life very miserable. Switching to a graphical representation makes version control really complicated. It has always been a fascination of mine that we are in an industry where there is that constant aim at replacing text by pictures and I appreciate the appeal in some cases.
Still, given your experience, how comfortable do you feel about Esterel’s switch (I wouldn’t know for a fact that it is a switch, but more of a tendency) to move towards a more graphical representation and your artefacts then become more and more obscure, which means that version control becomes an issue and then incrementalism and managing differences gets more and more complicated over time, etc.
Gérard: There are several points here. Pictures are important. In parallel to our work, a graphical language called Statecharts was developed independently by David Harel, which was very successful. It included very clever ideas but it was not initially developed as a programming language, but more as a discussion language between pilots (again in airplanes) and engineers, a way for these people to be able to talk around a common specification notation.
Piloting an aeroplane is very, very complicated. Harel created these hierarchical drawings which are really nice, but I think the semantics were not so good. It is not synchronous, it’s kind of semi synchronous with a lot of particular rules. There are in fact about 32 published independent possible semantics instead of one, and the comparison between the versions made people happy to write many academic papers. But the basic idea of the graphics were really cool and changed the domain.
A guy in Nice called Charles André adapted that with Esterel by simplifying the graphics, because there are Statecharts things that can be very complicated and he didn’t want his SyncCharts to be very complicated, he wanted to stay on the safe side. So we also adopted SyncCharts in the industrial tools, which were able to mix graphics and text.
There is no miracle, neither with text nor graphics, but usually graphics are easier to communicate between engineers and especially with specifiers. They can be misleading as well and I think the mixture is good, depending on the people and their tradition. We did something different for version control, based on yet another form of mathematical proof. When you have a complex system and you want to modify part of its behaviour, you may want to show that the other behaviours are not changed. You could do that in hardware design, but also in software components with automatic verification. That was based on the work of all the people who have worked on logic engines such BDDs (Binary Decision Diagrams), SAT-solvers, (for pure Boolean satisfiability), and SMT-based- solvers (satisfiability w.r.t. theories, which can handle data).
Darius: What would you do if you weren’t involved in computing at all?
Gérard: Oh, I don’t know. All sorts of sciences. I’ve been a scientist since I was a baby, I guess. I used to be a chemist, I’d hope to have a very big chemistry or physics lab, but then I fell in love with computing.
Darius: How old were you when you wrote code for the first time?
Gérard: I must have been eighteen, maybe nineteen. That was on a SETI PB250, a very strange computer of the 60s with 22 bits words and magnetostrictive delay-line memory. It was strange but fun.
Darius: At that time code was essentially cable or assembler. Did you have a high level anything to do to work with?
Gérard: There was a high-level language called MAGE, but hardly usable since the machine was too small, and I did program in assembler. But I also decoded the operating system, which was 256 words in machine code!
Darius: Among all the software projects around that have passed, is there one project that you would have loved to work on?
Gérard: Oh yeah, the iPhone touch-GUI and synchronization designs, because they were done by my friend Jean-Marie Hullot, who was my master in UI.
Darius: What do you consider the most important quality for someone in our trade?
Gérard: To be rigorous and consistent, and never to believe that what he or she does is right. Bugs are everywhere and making bugs is no good. Not making bugs is another profession.
Darius: What do you consider your most important professional, technical, scientific quality? What makes you good at what you do?
Gérard: The relationship between math, logic and computing and the way we can apply it from novel formal math to languages and computing.
Darius: And then the clear opposite: what do you consider your most important professional, technical scientific flaw? What makes you in some areas, perhaps not as good as you want to be in what you do?
Gérard: Understanding asynchronous concurrent programming. I was never able to invent in this field. Understanding was fine, but inventing, no, it was not for me. It was not the right mindset for me to do what Leslie Lamport and others did. Instead, I worked on the two other kinds of concurrency, synchronous concurrency and timing-controlled concurrency.
Darius: if there was one programming language you wished never existed. One language you could erase from the planet, which would that be?
Gérard: I think they disappear by themselves. I really disliked the foundations of Fortran. FP by Backus was one of the first languages to disappear fast. I didn’t even have to wish for it to disappear, since it was based on wrong ideas about functional programming and it had no chance of survival. APL was good, in some sense, I did use APL in some early projects, but it was a language with no future, it’s just too hard to use.
Darius: APL has pretty much disappeared by now.
Gérard: Other languages have now disappeared from the mainstream, like LISP, but these were true creators.
Darius: The question is not whether they disappeared because there’s less the matter of observation. Which are the ones you wish would disappear.
Gérard: LISP survived and made a major error with dynamic binding at the beginning, but that was hard to realize it back then. The rest of LISP is everywhere. Functional programming was mostly born with LISP. Not entirely, but mostly.
Darius: If you were to direct our audience to read one book, one article, or one author computing related or not, what would that be?
Gérard: Not computer related, it would be Cannery Row by John Steinbeck. Computer related, but I would go for Surreal Numbers by John Conway. It is a crazy but fantastic construction of numbers, which is really in the spirit of computing. You get the real numbers first and then the integers. It is extraordinary. It’s very much logic for computing.
Darius: Again about books. What is the book you wished you could force yourself to forget so you can read it over and over again?
Gérard: Read it over and over again? Now it’s very hard because the delicate points of computing and programming are not written in books, it is still kind of a know-how. It is like trying to be an acrobat in a modern circus, some people succeed, and some people don’t. It’s not a question of reading or writing, you can read whatever you like and still fail. No, I don’t think I have one.
Darius: My final question, what would you like to be remembered for?
Gérard: There are two things, one which is known in a small circle, which is stability and sequentiality which was the first real theory of what it means to be sequential for a computer program.
Gérard: The second one is breaking the dogma on parallelism, which used to be synonym with asynchrony. There are three kinds of parallelism, as I already said. We broke this dogma with my French colleagues, but that was not accepted in the US, except for Ravi Sethi at Bell Labs and the hardware people to whom it was normal. It was not easily accepted in the mainstream computer science community, because people like dogmas and don’t spend time questioning them and studying their limits. Synchrony is absolutely fundamental, but for other non-mainstream applications such as real-time programming, digital circuit design, and complex GUIS. These are not small subjects!
Maybe also the CHAM (Chemical Abstract Machine), which makes asynchrony formalization much simpler than Milner-like process calculi. This is really my overall goal: simple, but not simpler as Einstein said. And also working from A to Z. If I want something to be remembered for, it is working from A to Z, that is considering that stopping at Y is not an achievement.
Darius: Gérard Berry, thank you so much for being our guest today. You’ve been the most gracious of all guests. Thank you.