S01E13: Brian Ford & Evan Phoenix

We interview Brian & Evan and talk to them about what the future of Rubinius looks like.

Danish: So today on Cloud Out Loud we have Evan Phoenix and Brian Ford with us, the guys on the Rubinius team. So can you guys introduce yourselves please?

Evan: I’m Evan Phoenix. This is the sound of my voice. I’ve been working on Rubinius for the last 4ish years I guess, which sort of started as a hobby project and then Engine Yard brought me in to keep working on it.

Brian: Hi this is Brian Ford. I’ve also been working on Rubinius for a few years and--with Engine Yard and one of the notable projects that we spun out of Rubinius is RubySpec.

Danish: Very cool. So, it seems to me recently that a lot of Ruby programmers are synonymous with Ruby is a really happy programming language and makes them happy compared to one that they talked about previously so I’m curious what you guys think about that and whether you kind of believe in that philosophy or not?

Evan: We definitely do. I think we’ve--we go--really early on in the project we wanted to make it more user-friendly. So we--it’s funny, one of the simplest things that we did that was really just a lark one day was, I was debugging some code and it was very difficult to figure out what was going on. And so I added this thing where the backtraces would be all pretty formatted and have all this other information and that was about a 15 minute thing and for years that was actually the thing that Rubinius was known for. Like, “Oh, the backtraces are amazing” so it was this very simple thing but I guess it’s those kind of things that we try to find that people really like because those are sort of neglected parts but they interact with them all the time and when they have to interact with them, they’re always sort of sad. Like backtraces--if you see a backtrace it’s one of those like, “Oh crap, now I have to go figure it out” so if it’s nice, it’s pretty, and it gives you a lot of information, that makes them--that seems to make programmers a lot happier. It definitely makes me happier when I’m working on the code.

Brian: So interesting, related to the backtrace. So at the very first Rubinius hack-a-thon, in Denver, Wilson Bilkovich basically added color codes for backtraces--so we had colorized backtraces, which was all in Ruby. Did it in about 30 minutes, I think, maybe, in the middle of--we were writing a ton of specs and doing all sorts of other stuff. But that was pretty interesting; it was one of those like, very early victories. Here we were, writing in Ruby, implementing Rubinius, adding something that made it really nice to deal with all the errors we were seeing as we were slowly implementing. We were--the thing about Ruby itself that makes me happy typically is, I’m happy when I’m accomplishing my goals and there is a very low level of frustration. And my experiences before with Python, with C or Tcl even, was that when I would go to do something the distance measured in time between when I would start trying to do something and when I would succeed was longer with more frustration. And when I started using Ruby, I found that I was writing code that expressed my intentions, that worked, and I had very little frustration and I think a lot of that has to do with 1) the syntax of the language 2) the facilities that are available--Array has a very nice API and you can do a lot of stuff with it. And then just, I think in big measure it was the Pickaxe book--Programming Ruby from the Pragmatic Programmers--tongue twister. Their presentation in the first tutorial section of that book was immensely helpful.

Evan: We do a lot of--or I like to do a lot of, going up to normal Ruby programmers and trying to see what their workflow is and ask them like, ok what about your workflow frustrates you? What are the things that you don’t really like? Because that’s how the backtrace thing came about--was me being frustrated with workflow problems. So I like to talk to people who don’t even--aren’t really using Rubinius, maybe don’t even know what it is but are using Ruby just to find out what could we do to make that workflow better. So that’s something that I like to do often.

Danish: Ok. So, I’m curious then, what brought about the Rubinius project? It seems like you guys were really big on Ruby, Ruby was really useful for you guys. Why go out and decide hey, I want to create this new, you know kind of thing on top of Ruby that’s supposed to kind of rival it and you know, I want people to start using it even though I really love Ruby.

Evan: Sure yea. A little history. So, I--the--ok. Back in 2004, I hadn’t been to any RubyConfs yet, I was just sort of--I’d been doing Ruby for maybe a year or something like that but I had done C and C++ and sort of Perl in my background. And people had said, “Oh, you can’t add native threads to 1.8”...that’s actually--that was bandied around and I wanted to know why that was. And so I started this long--this is a project that I was working on at home, that thing to keep sane from work because work was getting just really crappy at the time and I needed something that was sort of mentally engaging. So, it’s not going to sound mentally engaging, but I can assure you it was at the time. I decided I would go through and I would clean up the source for 1.8 so I actually went through almost every line in there and basically made it so that it didn’t use a whole bunch of globals everywhere. The idea being, ok if I want to have threads, they can’t be using globals because then the execution won’t work, they’ll just stomp on each other. So I went through and kind of cleaned it all up. At the end I had something like a 10,000 line diff and realized very quickly that--so at the end of this project it stopped because I realized that there was a whole bunch of fundamental things that would not allow me to add native threads any further. It wasn’t just a clean up thing, it was a semantic thing. So I got done with this 10,000 line patch and realized that it didn’t add any value, in fact, it just made stuff slower. It wasn’t really all that useful. So I sort of just deleted it--that was a project called Sydney at the time. So I just sort of left that, I didn’t really go back to it and then a few months later I kind of--I still liked this idea so I kind of got the idea, “what if we started over from scratch?” So, I’m not going to design a language, I’m not going to worry about the semantics or even the syntax.
I just want to take the ground underneath that syntax and semantics--build that back up on firmer ground if you will. So that’s kind of how it started. It started just as a hobby. I ended up getting some old Smalltalk books and seeing how the original Smalltalk VMs were implemented and just sort of copied that. And the first version of Rubinius was all written in Ruby. It ran under 1.8, which was 1.8.3 or 1.8.4 at the time, and it used RubyInline to access raw memory so it had--there was a garbage collector that was written in Ruby that would use raw memory to actually write out the object space--it was so slow but it was one of those things that was just really fun because it showed that those things could be accomplished. If you want to go look, those--that code is still in the Rubinius code base. It’s back, all the way at the beginning if you go look at the commits. One of the first few commits is all of that Ruby code to build--that is a Ruby VM in Ruby. So that’s kind of how it started. We could just--I had sort of the hubris to think that I could do it better but it was also just sort of fun. I had always been interested in languages and more specifically in how languages run, the runtime of the languages, so this was a really good sort of fun project for me. I worked on it for a while and then, I think at RubyConf 2005 in Denver?

Brian: 2006.

Evan: 2006? So I’m off a year at this point. So, 2006, I decided--I got accepted to give a presentation about it and so I was real--that’s sort of when things got really real at that point. Like, ok people are going to want to see this thing--it can’t just be the thing that lives on my laptop, I have to actually show it to people so I started and I converted all the Ruby parts, simple Ruby parts, into C so that it was at least an order of magnitude slower instead of like 5 orders of magnitude slower. And that’s kind of how the people got interested and it kind of took on life from there because one of the big emphases has always been write as much of it in Ruby as we can. Like, bootstrap it--the emphasis has always been to say if you can write it in Ruby, you should be writing it in Ruby. And it’s only in those very specific cases where you decide, no actually I need this in C, that you actually go out and extract the one piece and you do that one piece in C. So, that was very intriguing to Brian and Wilson and a lot of other people that were there and they kind of went from there with it. So that’s kind of how it started. And, it’s just kind of had a life of its own and picked up steam and people thought it was cool and Thom thought it was cool so that’s why I’ve had 4 years to work on it.

Danish: Nice. So, I don’t know if you guys heard about this but I know when Matz came to San Francisco I didn’t go to the talk but I heard afterwards that he kind of briefly touched upon it, saying that he might use some parts of Rubinius for Ruby 2 or that he wasn’t opposed to it. So I’m curious if you guys did hear about that and if you might elaborate on what he was actually talking about. I know when I heard that I was pretty surprised.

Evan: I think I did hear about that, I just wasn’t there. I just had some friends that were there--it’s hard. I don’t know exactly what to make of it. I’ve certainly--I mean Matz and I are friends so I welcome him to use as much as he would like. Part of the project has always been, you know it’s BSD licensed so I’ve always sort of been of the mind that I love working on it, I’m going to try to make it better but I’m fine with people taking whatever they want from it and doing whatever they need. So, if he wants to do that, I’d be more than happy to figure out where we go with it. But, I don’t know if he has any specifics other than maybe he just thought about it, so I haven’t heard any specifics.

Brian: Yea, I was at one of the talks. I think he did a couple and one of the tweets that I saw actually came out from his visit to VMware so I don’t know what happened there. But, at one of the talks he did here, the question was basically about the architecture of what he’s calling RITE or Ruby 2.0 and it sounds like the architecture with the new virtual machine, native threads, and also the possibility of using the restricted set of the normal Ruby libraries so you can make a smaller implementation that’s more suitable for an embedded environment, something like a TV or a set-top box, something like that. So, there’s--sounds like there’s, in the architecture Matz is imagining for 2.0, there’s a lot of similarities to some of the approaches that we’ve taken with Rubinius. But I think that would be a good question to explore. I mean, so far I don’t think we’ve seen any code for RITE or Ruby 2.0. What would be interesting is to sort of try to understand what Matz’s goals are and see if we couldn’t demonstrate the viability of some of those goals by taking what we’ve got in Rubinius and packaging it up or showing how it could do what Matz is interested in doing.

Evan: You know but also, Matz aside, we’ve had Charlie on the JRuby team and the MagLev guys--Charlie has played around with basically using part of Rubinius inside of JRuby and I don’t know where he is with that but we’ve talked about that many times about like, oh it’d be nice because--the nice thing about having all that stuff in Ruby is that it’s sort of agnostic. You know, it’s not coded in some lower level language so it’s easy to pull it into another thing. You just sort of have to establish what the sort of primitive operations are, right? So, I know that the MagLev guys, we talked about, a few times, when they were earlier on in the project they were basically just using a lot of the Ruby code we had and a lot of it got pulled in, I think probably they’ve customized it and done it up in the way that they want but the genesis started with us discussing like, “yea just take whatever you need” and you know, go from there.

Brian: Yea, the value to the Ruby community of having an established set of Ruby libraries is huge, right? I mean Java is not known for having all these great libraries that are written in C++, right? So it’s sort of ridiculous and the situation we have right now in Ruby is if you have a library or if you have something you want to make--Nokogiri is a good example. There’s a C extension for Nokogiri, there’s a native Java port, I believe, for Nokogiri, and is there some Ruby code in there as well?

Evan: Well yea, let’s not go that far. There’s--because when that started, Aaron wanted to use libxml2 to be doing parsing. So there’s a lot of Ruby code in there too. There’s interface code to libxml2 to do the heavy lifting.

Brian: So I think that the next sort of frontier for Ruby in many ways is to create these really high quality libraries that exist in Ruby, that are coded in Ruby and that are usable in any implementation as opposed to having to deal with interfacing with some other library. And there may--you know, XML may not be a good domain for that but there may be other ones. I think the maturity of the language will be in part measured by how many libraries are written in Ruby itself and how infrequently we have to go out and interface with other code. And I think that there’s an interesting objection to that which is that if the code already exists in this other library, why would you do it differently, why would you try to re-implement, why not use it? And one of the responses that I would suggest to that--there’s an interesting guy who blogs at Alarming Developments, that's the name of his blog, he recently posted about making a Darwinian license. Basically, that he’s going to license his future software under this license that makes it so that you can’t use it after a certain amount of time. His assertion is that because significant software is not re-written often enough, we’re burdened over time with the mistakes that were made previously. So, I think that's a very interesting perspective--that we learn every day and we wouldn’t necessarily write a program today the way we wrote it 3 or 5 years ago. So unless we’re actually trying in Ruby to re-write some of these libraries and do it with modern ideas and in a way that takes advantage of Ruby, I think we’re always sort of tying ourselves too much to the past. There’s definitely a balance but I think the idea that just because a library exists means that you need to use it is not the whole story.

Evan: I guess the last thing I’ll say on one of the upsides of having Ruby--us using Ruby as much as we do and implementing a lot of the core parts in Ruby, I’ve had people come to me and tell me--this is before, I had never even mentioned this idea, people were doing this on their own. “Oh, well when I want to figure out how a particular method works in Ruby, I don’t bother to look at the RDoc anymore, I just pull the source up in Rubinius that’s in Ruby and I just figure out, ‘oh ok if I--this method takes 2 arguments and an optional 3rd one and if the 2nd one is a string then it does this behavior’” and they can just read it in our Ruby and figure that out and that’s a huge upside because now that code is the documentation that a normal Ruby programmer could read and that has not been true in the past. Before, they had to drop down to C and sort of extract out what information they could get from that C code; now it’s in this language that they understand and like and they can get a lot further along with that.
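To make that point concrete, here is a minimal sketch of what a core-style method can look like when it is written in plain Ruby. It is not taken from the Rubinius source; `rjust_sketch` is a hypothetical standalone function that mimics the behavior of `String#rjust`, and the point is only that the optional third argument and the type check are readable straight from the code.

```ruby
# Hypothetical sketch, NOT Rubinius's actual source: a pure-Ruby take on
# String#rjust, written as a standalone function for illustration.
def rjust_sketch(string, width, padding = " ")
  # The optional third argument must be a non-empty String.
  raise TypeError, "padding must be a String" unless padding.is_a?(String)
  raise ArgumentError, "zero width padding" if padding.empty?

  missing = width - string.length
  return string.dup if missing <= 0 # already wide enough: return a copy

  # Repeat the padding more than enough times, then trim to the exact size.
  fill = (padding * ((missing / padding.length) + 1))[0, missing]
  fill + string
end
```

A reader who wants to know what happens when the padding is longer than one character can see it in the last two lines rather than digging through C.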

Danish: So I know that Rubinius 1.2 came out recently so I’m definitely curious, you know, what are you guys working on now for the next release? Do you know what’s kind of coming in the future?

Evan: So, we’ve--the next big release, we actually had 3 things that we--very specifically--that we wanted to get out and originally we weren’t sure if they were all going to be in one release and it looks like now they are gonna sort of be in one big, big release. We don’t know what number that’s going to be. It might be 2.0, it might be 1.5, I’m not really sure yet. But those are Windows support, so we sort of haven’t had Windows support because Brian and I and the rest of the community really didn’t have enough Windows knowledge to bring it up to date up til now.

Danish: So is this because of Dr. Nic kind of?

Evan: That didn’t hurt. That’s for sure, that didn’t hurt. So, Windows support, and you know, Brian’s doing a great job on that, basically slogging through and figuring out where we have a lot of Unix-isms and figuring out how to make the code agnostic so that it can be used on Windows as well. Then 1.9 support: encodings and syntax and the whole kit and caboodle for 1.9. The third one is real true concurrency. So, previous to now, Rubinius has had a global lock, sort of like 1.9 does, so it used native threads but had a global lock that prevented code from actually running simultaneously. So if you had 2 threads that were, say, doing some math operation, only one of them would actually work at a time, in lockstep, because they’d actually have to be holding the lock in order to do the work. So we never really liked that--we always wanted to get away from the idea so we’ve really finally just bit the bullet and started on the process of tearing that out and replacing it with the ability to run those things concurrently and that means basically protecting the rest of the stuff with locks and going through and making the whole thing concurrent. So, those are sort of the 3 big ticket items for the next big release.
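The effect of a global lock can be sketched in a few lines of Ruby. This is only an illustration of the idea, not how Rubinius or 1.9 implement it: the `global_lock` argument is a stand-in for an interpreter-wide lock, and `max_threads_in_work` is a hypothetical helper that reports the largest number of threads ever seen inside the work section at once.

```ruby
# Illustration only: a Mutex stands in for an interpreter-wide global lock.
# Returns the highest number of threads observed inside the work section.
def max_threads_in_work(global_lock: nil, threads: 2, rounds: 200)
  guard = Mutex.new # protects the bookkeeping counters only
  running = 0
  peak = 0

  work = lambda do
    guard.synchronize { running += 1; peak = running if running > peak }
    (1..500).reduce(:+) # the "math operation"
    guard.synchronize { running -= 1 }
  end

  workers = Array.new(threads) do
    Thread.new do
      rounds.times do
        # With a global lock, a thread must hold it to do any work at all.
        global_lock ? global_lock.synchronize(&work) : work.call
      end
    end
  end
  workers.each(&:join)
  peak
end

puts max_threads_in_work(global_lock: Mutex.new)
```

With the lock in place the reported peak is always 1, however many threads are started. Removing the lock lets threads overlap, which is why the rest of the runtime then has to be protected with finer-grained locks.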

Brian: Yea, I think this is one of the areas with Rubinius that is interesting. So, for a couple years now, we’ve been working on essentially, infrastructure. Making sure that the architecture of Rubinius is really solid. And I think one of the interesting things for me is Evan’s been doing all the concurrency work but I mean, when you initially mentioned that you were doing it, it seems like it was about 3 weeks of work--like heads down work and then most of the specs were running. And then over the last month or so, intermittently fixing some concurrency issues but now we have, like all the specs that run on master and are also running on a concurrency branch Hydra--we’re not seeing any significant thread issues after the recent work so the full concurrency in Rubinius is something that is going to be a great advantage to developers and the amount of work necessary to actually implement the feature was not tremendous given how much work went into the infrastructure.

Evan: And maybe I shouldn’t say this in a public forum like this but I kind of did it--I have a bad predilection for doing things on a dare. So, the concurrency stuff actually kind of came as--we were on the IRC channel one day and it kind of came as a dare, like “I bet you can’t add concurrency.” It wasn’t a direct dare like that but it was basically like--it got me, we were talking about it and it got me thinking and I was like, “Well, I’ll just sort of do it as a spike and see how--how far can I get in the shortest amount of time” and if I can get a significant distance in say, a few days, then I know that it’s worth doing now and that’s sort of what happened. The architecture has always been organized around--because we wanted to make it able--we don’t have this yet but we wanted to make it possible to embed Rubinius very cleanly in another application. That’s a feature we haven’t gotten to yet but I laid the foundation for that at the very beginning so because of that, that exact thing spilled over into the idea that it doesn’t--has almost no global data. It uses everything that’s thread local, it passes almost the entire state down the call stack as it’s doing things so all those things we had been planning to use all along sort of came to fruition when we actually wanted to make it concurrent.
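The design Evan describes, almost no global data and state handed down the call stack, can be sketched roughly as follows. All names here (`VMState`, `allocate`, `run_program`, `run_isolated`) are hypothetical and are not Rubinius internals; the sketch only shows why code written this way needs no locking to run on several threads.

```ruby
# Hypothetical names throughout; this is not Rubinius's internal API.
# Each thread owns one state object and passes it to everything it calls.
VMState = Struct.new(:name, :allocations)

# No globals: the function only touches the state it was handed.
def allocate(state, size)
  state.allocations << size
  size
end

def run_program(state, sizes)
  sizes.sum { |size| allocate(state, size) }
end

# Runs each job on its own thread with its own state; returns the states.
def run_isolated(jobs)
  jobs.map do |name, sizes|
    Thread.new do
      state = VMState.new(name, [])
      run_program(state, sizes)
      state
    end
  end.map(&:value)
end

states = run_isolated("a" => [1, 2], "b" => [10])
states.each { |s| puts "#{s.name}: #{s.allocations.inspect}" }
```

Because no two threads share a `VMState`, they cannot stomp on each other's data, which is the same property that makes embedding several independent instances in one process plausible.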

Brian: The interesting thing about that--on a dare, the JIT almost came along very much in the same way, right? It was like Rubinius uses LLVM, the Low Level Virtual Machine project, to generate native machine code and the advantage of that is we can get 2 or 4 X, so 200 or 400% better performance and there’s a lot of work left to be done on it but it’s very, very valuable. It’s the thing that will make Ruby very fast. And a couple years ago before the current implementation, we had talked about in IRC and stuff like inlining like, “oh man, how do you inline methods?” so one day Evan just is like, “I’m going to go ahead and try inlining” and by the end of the day it was essentially working. Of course there were bugs to work out but it was like, it was sort of on a dare and it’s an extremely valuable feature for Ruby, extremely important feature and it basically landed in a couple of days because Evan just thought, “You know what, I’m going to try this and see if it works” and it worked.

Danish: Great. Well thank you guys. I really appreciate you guys taking the time to do this podcast for us.

Evan: Thank you.

Brian: Thank you very much.