Doug Lea

We Speak Your Language S01 E02

Intro: Welcome to ‘We Speak Your Language’, the podcast for computer language geeks and nerds. This third episode is hosted by Jan Vitek and is brought to you by Raincode Labs. Enjoy.

Jan: Hello, I’m Jan Vitek, one of the hosts of this podcast, ‘We Speak Your Language’, and today I have the pleasure of talking with Doug Lea.

Doug Lea is someone I’ve known for many years. He has worked on concurrency, memory management, and the design of Java libraries. You may know him for his work on the memory allocator dlmalloc. You may also know him for his work on the Java concurrency libraries. If you’ve used Java, you’ve certainly used his work. He’s also one of the nicest guys I know in the field, so it’s a real pleasure to have Doug.

So, Doug. Let’s start with a simple question. How did you get into the field?  

Doug: I was one of those people who went to graduate school to figure out what I didn’t want to do for the rest of my life, which initially was what you would call cognitive modeling. As a graduate assistant, I put together the real-time experimental labs in the biology and psychology departments and discovered that what was, in those days, real-time programming was more interesting than the things I was allegedly studying. That led me to work on infrastructure, which I have done ever since, for a long time now.

Jan: How did you get involved with Java?  

Doug: I was working mainly with C++ and collaborating with people at Sun Labs on some remote applications, some very early versions of digital currency, consensus algorithms, and related topics. I was doing all this in a very odd combination of C++ and Tcl and getting really dissatisfied with it. The people I was working with told me I should talk to the folks down the hall: “They’re working on this language which we don’t think very highly of, but you’re the academic collaborator and you do the crazy parts, so maybe it will work for you.” So I then became probably the first non-Sun Labs concurrent programmer in Oak, as it was called in those days, and then Java. If you’re the only one, then you are also the one with the most expertise, because there’s just you. And that’s how I got involved. I thought Java was unexciting but perfectly adequate for what I was going to do. It had language support for concurrency, which was vastly better than anything else I had ever worked with, and I was pretty dissatisfied with all the alternatives at that point.

Jan: Originally Java had only green threads, wasn’t that a downer?  

Doug: Well, you have to remember that originally there were hardly any multicore CPUs, so green threads were perfectly fine if all you had was one core anyway. The evolution away from green threads completely matched the rise of multiprocessors. As soon as you have multiprocessors you demand something better for thread scheduling, and there were debates about whether to keep green threads or whether to use an M-to-N model. All the people working on the lowest-level stuff really wanted to just let the OS do the thread scheduling. Now, 25 years later, there are things like Project Loom that say we actually did want green threads at some point, and so that’s definitely a thing that’s coming back.

Jan: It is funny that it went one way and comes back after 25 years. Do you feel like people could have guessed or was the evolution path reasonable? 

Doug: The most recent spur for rethinking things such as green threads is the rise of microservices. At first, you don’t think the two are connected, but what your average microservice does is open up connections, gather information, do something trivial with it, and then tell you about it. To launch a whole bunch of IO-driven threads when you do not have any green threads, you have to make reactive designs, which I love, but most people don’t initially think they do. My current stance is that everybody eventually does something where they’re going to need to use some sort of reactive callback, completion, or continuation design anyway, and so it’s nice to have the alternative of optional green threads in the language, but I think it’s sort of a way station for most people. Go has prospered in the age of microservices because it’s really convenient to do exactly that. It’s basically a simple fork-join design, but it’s all IO bound, which means that you pay a lot of overhead costs by having the OS do it, when you’d rather just have something lighter weight.
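
A minimal Java sketch of the reactive, completion-based style described above, using `CompletableFuture` from java.util.concurrent. The class name `ReactiveGather`, the `fetch` helper, and the source names are hypothetical stand-ins for illustration; a real microservice would perform network IO where `fetch` builds a string.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class ReactiveGather {
    // Hypothetical stand-in for a remote call; assumed to complete asynchronously.
    public static CompletableFuture<String> fetch(String source) {
        return CompletableFuture.supplyAsync(() -> "data-from-" + source);
    }

    // Start every request up front, then combine the results in a completion
    // callback rather than parking one blocked thread per request.
    public static CompletableFuture<String> gather(List<String> sources) {
        List<CompletableFuture<String>> pending = sources.stream()
                .map(ReactiveGather::fetch)
                .collect(Collectors.toList());
        return CompletableFuture
                .allOf(pending.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> pending.stream()
                        // Every future is already complete here, so join() returns immediately.
                        .map(CompletableFuture::join)
                        .collect(Collectors.joining(",")));
    }

    public static void main(String[] args) {
        // The final join() blocks only so the demo can print the combined result.
        System.out.println(gather(Arrays.asList("a", "b", "c")).join());
    }
}
```

With green (virtual) threads, the same gather step could instead be written as plain blocking calls, one per lightweight thread; the sketch shows only the callback alternative.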

Jan: Coming back to Java over the years, how is the involvement of the open-source community compared to the various companies that have owned the language? How do you see the two collaborating?  

Doug: It used to be a big fight. I would spend a large amount of my time making sure that there were open-source avenues for collaboration and for source code. Way too many politically unsettling things happened whilst I was at the JCP. Corporations are not particularly nice to each other, what a surprise. Making sure that there was a chance to work on open-source projects was a big chunk of my life for a while, and we just led by example. At one point we decided we were going to put together java.util.concurrent. We decided it would all be open source, and when all the lawsuits came, thankfully we were specifically exempted. Nobody could make a claim against java.util.concurrent, which at that point was so enmeshed that they could not do anything about getting rid of it. It was definitely not the property of Oracle.

Jan: And did you see a big difference between Sun and Oracle in terms of their approach to the language?  

Doug: Sun was always very open to academic collaboration. They were really fun and pleasant to work with. They’d ask, ‘Are you interested in working with people from our labs groups?’ And if you said yes, you didn’t hear from them very much, you just got to work with people in the lab groups. That was awesome. It was an experience that doesn’t have many analogs these days.

Jan: In terms of the work on the Java Library, is that still taking a big chunk of your time?  

Doug: It’s pretty stable right now. One of the few things we did right in java.util.concurrent was to make really careful interface/implementation splits so that we can rip things out and replace them as better algorithms and techniques come into play. Every once in a while, I’ll pick something up. The most widely used java.util.concurrent component is ConcurrentHashMap. The HashMap is a little weird because it’s a little bit like what the VM can do for monitor caching, in a way we can’t do as efficiently in Java. I’m sure there’s probably a better way to do that, and so I have tried to throw it away and redo it several times over the past few years, and I never have. It’s basically just an evolved version of itself. I overhauled the basic locking primitives using AbstractQueuedSynchronizer a couple of years ago. That’s the kind of thing I’m constantly doing. But I don’t have any ambitious plans to do something really vastly different, in part because the people working on Loom need a chance to do at least that use case well, and there are actually a lot of things that we need in java.util.concurrent just to make things better for them. But it does mean that if you’re doing something that is structured parallelism, fork-join, create trees, bags, and it’s computationally oriented, you would use our software. And if you have basically the same design and it is IO oriented, you use Loom. That’s not a very pretty state of affairs, but that’s what it will be like for a while.
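
For readers unfamiliar with the computationally oriented fork-join style mentioned above, here is a minimal sketch using `RecursiveTask` and `ForkJoinPool` from java.util.concurrent. The class name `ForkJoinSum` and the `THRESHOLD` cutoff are assumptions chosen for illustration, not tuned values.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSum extends RecursiveTask<Long> {
    // Assumed cutoff below which a range is summed sequentially.
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int lo;
    private final int hi;

    public ForkJoinSum(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        ForkJoinSum left = new ForkJoinSum(data, lo, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, hi);
        left.fork();                        // schedule the left half asynchronously
        long rightSum = right.compute();    // compute the right half in this thread
        return left.join() + rightSum;      // wait for the left half and combine
    }

    // Convenience entry point that runs the task tree on the common pool.
    public static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // prints 50005000
    }
}
```

The task splits its range in half until the pieces are small, which produces the tree of subtasks that a work-stealing pool can balance across cores.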

Jan: I remember for a while there was a push towards transactional memory both at the software and hardware level, is that something that you think has died or will it come back again?  

Doug: I’m going to stay agnostic about it. When too many people get excited about something I usually decide to do something different, and that has always served me well. Even though I do work a lot with other people, I tend to produce code by myself. Working with large groups of people, researching, and getting 10 publications a year in PL journals is not something I’ve ever wanted to do, and so I don’t. I don’t really have any strong thoughts about transactional memory. I think something transactional-ish is here to stay, even if it is only wide-CAS. It is a poor man’s transactional tool, and it’s surprising how much you can do if you just have twice as many bits that you can update in one atomic operation. In theory, it would be better if you had n cache lines.
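
A rough illustration of the "twice as many bits" idea: the sketch below packs two 32-bit values into one 64-bit word and updates both with a single compare-and-set on an `AtomicLong`. This is only an analogue of wide-CAS, chosen because the standard Java atomics used here operate on at most 64 bits; the class `PackedPair` and its methods are hypothetical names for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PackedPair {
    // Both 32-bit fields live in one 64-bit word so a single CAS covers them.
    private final AtomicLong word;

    public PackedPair(int first, int second) {
        this.word = new AtomicLong(pack(first, second));
    }

    public static long pack(int first, int second) {
        return ((long) first << 32) | (second & 0xFFFFFFFFL);
    }

    public static int unpackFirst(long w) {
        return (int) (w >>> 32);
    }

    public static int unpackSecond(long w) {
        return (int) w;
    }

    // Moves `amount` from the first field to the second. Both fields change
    // in one atomic step, so no reader can observe a half-applied transfer.
    public void transfer(int amount) {
        long current;
        long next;
        do {
            current = word.get();
            next = pack(unpackFirst(current) - amount, unpackSecond(current) + amount);
        } while (!word.compareAndSet(current, next)); // retry if another thread raced us
    }

    public int first() {
        return unpackFirst(word.get());
    }

    public int second() {
        return unpackSecond(word.get());
    }

    public static void main(String[] args) {
        PackedPair pair = new PackedPair(10, 0);
        pair.transfer(3);
        System.out.println(pair.first() + " " + pair.second()); // prints 7 3
    }
}
```

The retry loop is what makes this a small transaction: the update is computed from a snapshot and committed only if the word has not changed in the meantime.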

Jan: One thing you just touched on was publishing, and you have a somewhat unusual position because you’re the most practical academic that I know, but you’re still an academic. How did that come about?

Doug: I’m way too old to say this, but I still don’t know what I want to do when I grow up! I think I want to be in a place where, in principle, I can just do anything I want. However, in practice, especially in the past few years, I can do anything I want so long as I still deal with the crazy crisis of chairing a department during the pandemic. I traded freedom for cash and I don’t regret it.

Jan: I often talk to students who say they don’t want to go into academia because they don’t like the pressures of publishing. These students often say that they feel industry or industrial research is the right place for them. Do you have advice for those students?

Doug: I wish I had more compelling advice, because I often tell people that if you are able to find a position like mine you will have a great life. But then they don’t try. I think a part of it is that it’s perhaps a little bit off the radar of PhD advisors. It doesn’t fit well with how PL and systems PhDs tend to go, where students are really focused either on working on their thesis and getting lots of internships, or on trying to put together five top-tier publications whilst they’re still grad students. There really are people who do that! And I’m completely impressed, but these types of people do not want my job.

Jan: There is definitely a lack of advising. I think some of my students went to positions that are somewhat similar and they were really happy, but you don’t tend to hear from them as much as the ones who continue to attend conferences and continue publishing.  

Doug: I agree, and it’s incredibly rough for the first few years when you come to a place like SUNY Oswego if you come from a place like Northeastern University, for example, because you will have insufficient training in how to teach core CS courses, which is pretty hard to do. Once you get past that, things become much more open, but it is a little traumatic. We hire people and they really need a lot of encouragement and happy thoughts during their first year or two because they are just thrown to the wolves, and nobody mentioned to them what life would be like teaching a CS 2 course, so it is hard.

Jan:  In terms of new developments in programming languages or general topics in our field, are there things that still excite you? 

Doug: That’s a good question. I confess to being increasingly in the “all those languages are the same” camp. I don’t know if that’s just the current state of the world or perhaps just maturity, but these days I think of C++, Java, .NET, and Rust as all being in the same ballpark, with different emphases on how you should do stuff. I think if I were working in something that is way more targeted, such as machine learning or probabilistic programming, I would be excited about the language per se. If you’re working with probabilistic programming, it’s only the language that matters; everything else is just implementing lots of Bayesian loops and things like that.

Jan: How do you see the future of education? In the sense that now that we have tasted the Zoom revolution, do you think that all of our teaching is going to end up there?  

Doug: Hopefully not mine. I don’t think I am a very good Zoom teacher. I don’t think I’m terrible either, but I don’t know how to do the equivalent of watching for the tense nod of students who look like they have a question but don’t even know what it is, or of stopping class 10 minutes early because the students are clearly saturated, which I do a lot. For example, in the concurrency course, even though the class is an hour and 20 minutes, an hour is all you can handle about memory models at any one time. We do have students here who are as good as anywhere when coming out of these kinds of courses, and I don’t know how to replicate that experience. At a State University College, our student distribution is different than at, let’s say, Northeastern University. We have some good students, but the distribution is completely rectangular, so we have people coming in with all sorts of likelihoods of success. It’s a fun challenge. I don’t hit that as much as I used to because I mostly teach 3rd and 4th year courses, and by that point most students are not likely to just disappear for half a semester, but sometimes they do.

Jan: What would you do if you weren’t involved in languages and software?

Doug: I think I would be building something; that’s my straight answer. I say that because of my ancestry: there were people who worked as cabinet makers, train engineers, etc. Statistically, I think I would be doing something that’s on the side of engineering.

Jan: How old were you when you wrote code for the first time?  

Doug: Does writing proofs in a logic course count as programming? If yes, I was 18.  

Jan: What software project would you have loved to be a part of? 

Doug: Pass! Too many of them to answer. I could make a case for three or four or five, and as I keep thinking I can pick up more. I don’t know, too many of them!  

Jan: What is the most important quality for someone in our field?  

Doug: Self-knowledge. The notion that you are going to try very hard, and you are still going to make mistakes. Basically, amenability to QA at a very informal, personal level.

Jan: What do you consider your most important quality?

Doug: Openness I think.  

Jan: What would be the most important flaw? 

Doug: That goes hand in hand, the inability to say no. 

Jan: If there was one programming language you wished had never existed, one you could erase from the earth, which one would it be?

Doug: It would have to be C. I don’t think C is the worst thing ever, but if C had never existed and something else had risen in the 70s and 80s that did not have as many opportunities for address arithmetic and things like that, we might live in a better world today.

Jan: If you were to recommend one book or article to read, what would it be?  

Doug: Well, the easy answer is Knuth. If you’re going to read a book, you might as well read that one. I’ll leave it at that; I don’t really have a good answer.

Jan: I must confess I’ve never read it. It was in my bookshelf just here and two days ago I emptied a third of my bookshelf and that was one of the books I got rid of. 

Doug: I do confess to reading it the wrong way. I tried reading it sequentially at first and that’s just not what you do. If you have a problem, you head for that part and then you start reading all the surrounding information, because it becomes interesting, and so that’s how I personally read it. There are probably 40- or 50-page segments I’ve never read, even though I think of myself as having read all of it.

Jan:  What would you like to be remembered for? 

Doug: I think the only answer someone like me can give to this question is to make the world a little better. We build, we write, we teach other people to build. Hopefully the things we build aren’t making the world any worse; sometimes I don’t know. If you were talking to me in the 90s you would be talking to someone who was equally interested in distributed and concurrent systems, and I’m still interested but less active in distributed systems. In that sense, I’m not sure anything I’ve done in distributed systems has made the world better or worse. But on the other hand, nobody stopped anybody from, you know, wasting countries’ worth of electricity on consensus algorithms.

Jan: Concurrency is hard already, but when you sort of add partial failures…

Doug: I’ve never been of that opinion. In fact, when I teach, I say the difference is different administrative policies. In a concurrent system, things fail and it’s a big deal too; you just don’t have the different administrative policies and other external forces that make your life so hard. A lot of it is not so much that, it’s just trust. Over the years I’ve realized that ignoring the basic premises of secure systems was such a mistake 30-plus years ago, and I’m to blame, me and everybody else.

Jan: Well, I guess that’s all the time we have for today. Thank you again, it was great to talk to you.

Doug: Same here, thank you for the invite.  

Outro: We hope you enjoyed this episode of ‘We Speak Your Language’. Stay tuned for next month’s episode with Manuel Serrano, who will talk about his work on ahead-of-time JavaScript compilation. Have a great day.