S01E09: Tom Preston-Werner

05 Feb 2011

We interview Tom Preston-Werner of GitHub and talk to him about entrepreneurship and their new conference, CodeConf.

Randall: As it stands, for this edition of Cloud Out Loud, we actually happen to be sitting here in the lovely GitHub offices and we have roped Tom Preston-Werner into discussing all that’s new with GitHub for us. So, thanks for showing up and I hope that the glass of whiskey sitting in front of me didn’t have anything to do with it.

Tom: It didn’t have nothing to do with it. I’m not gonna say how much it had to do with it. But I do appreciate you bringing the whiskey by. It’s always a pleasant situation when that happens.

Randall: Bribery always works. So, tell us what’s new at GitHub. Explain to us what’s going on. What do you guys have on deck? What are the things that are exciting for you at GitHub right now?

Tom: We have a bunch of stuff in the works. I am working on redoing all of the search functionality throughout the whole site. So I’ve been spending a lot of time learning Solr and the intricacies thereof so that I can go in there, figure out how the indexes need to look, what kinds of things we need to be indexing, get the indexing strategy a little more streamlined so that we’re not missing big chunks of the index and really just rethinking the whole strategy of search.

Randall: What targeted--what was the specific thing that made you decide that you needed to rework the strategy of search?

Tom: It’s just--it’s been not so great for a long time. I think the catalyst really was over Christmas vacation I was home in Iowa and I was reading through some of the Tweets and someone was like, “Boy, GitHub search sure is bad” and I was like, “You know what, GitHub search is bad” and when that happens then someone on the team takes it upon themselves to do what’s necessary to remedy that situation. In this case, we had been talking about it for a long time and we said, “Well, Search is one of those things we want to get better in 2011” and I said, “Well I’m not doing anything right now at home so I’m gonna pick up the Solr book that came out not too long ago” and it’s recent--for the most recent version. And I just sat down and I read it ’cause I thought, “well, this is kind of interesting”--I mean search--I used to work at Powerset, a search company, so I’m familiar with search stuff so maybe it should be me that does it, right, in that I know what some of the terms of search mean and how that stuff works and what not. So I sat down and read the book and I just started working on it and so far so good.

Randall: So why did you guys pick Solr? I mean considering ok, about to start the flame war--this is almost like Emacs versus Vim and everything else or TextMate, but ok why did you pick Solr as an inverted index in a search solution as opposed to any of the other various, equally valid search options out there?

Tom: It wasn’t actually me that chose it, I think Scott initially chose it. We were--I think the main choice was between Solr and Sphinx at the time. Those were the two--to us--the most prominent choices for open source search solutions. And this was 2 years ago, I guess, that we initially put search out there, very early in the company’s history and we don’t have a lot of money to throw behind FAST search or Google search boxes which are like, insanely expensive, etc. so we wanted to use an open source one--we figured we could make that work. And, at the time, Sphinx did not do incremental index updates. You had to roll a new index every single time, which for very large document sets like we had, it was just not practical. Now, I’m pretty sure Sphinx has incremental index updating now, but at the time it didn’t--if I am informed correctly on what the decision behind that was but I think that’s why we made that decision. Solr was--a lot of people used it--lots of businesses, lots of enterprises use it. It’s like, well if it works for these huge companies with billions of documents, we should be able to make it work for our data set which is significantly smaller than that, although not for long.

Randall: Nice. Well actually, that brings up one other question. So, are you actually using the raw Java interfaces for Solr or are you using something else like JRuby or some other scripting interface?

Tom: We use--we run Solr, just the regular way through Jetty and then we use the solr-ruby bindings for Ruby and then that just--you assemble the search query and send it over the wire using solr-ruby’s interface and then it comes back as just a set of hashes and we take those hashes and we format them in the proper way for output.
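Editor's note: the round trip Tom describes (assemble a query, send it over the wire, get back a set of hashes, format them) can be sketched roughly as below. This is a Python illustration, not GitHub's Ruby code. The endpoint URL and the `repo` and `path` field names are assumptions made up for the example; `q`, `rows`, and `wt` are standard Solr query parameters. No request is actually sent here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the interview does not say where GitHub's Solr lives.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(terms, rows=10):
    """Assemble the URL for a Solr select request."""
    params = {
        "q": terms,    # the user's search terms
        "rows": rows,  # how many documents to return
        "wt": "json",  # ask Solr to answer in JSON
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


def format_results(response):
    """Turn the result hashes (dicts in Python) into display lines.

    `response` is a parsed Solr JSON response; the "repo" and "path"
    field names are hypothetical.
    """
    docs = response.get("response", {}).get("docs", [])
    return ["{}/{}".format(d.get("repo", "?"), d.get("path", "?")) for d in docs]
```

In a real client the URL would be fetched and the JSON body parsed before being handed to `format_results`; that step is left out so the sketch stays free of network calls.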

Randall: Got it. So do you have any problems with things like memory inset--like essentially the representation of that data set in memory. Can you stream it from the Solr server?

Tom: It’s not streamed but the document sizes that are coming back are usually small enough that it doesn’t really matter cuz if you’re searching for the code search, for instance, you’re indexing individual files and we have some cut off which I believe is 50 KB in size that right now code search just won’t index at all. It just says, “This document is over 50K, don’t even bother.” Now, in the future I’d like it to just truncate it at something like 50K or maybe go up to 100K but as long as you don’t have a lot of those documents coming back in the search results then it’s not really a problem. In fact, we probably don’t even need to send back the full bodies of the files themselves because really all you need is the snippet of what matched--matched highlighting and I just need to figure out how to turn off sending back that whole field on return because it returns that field by default because it’ll return any field that you want it to highlight by default and so I just need to figure that out. So really, it shouldn’t be a problem cuz all you’re getting back is some metadata about the thing that you’re searching on and then the highlighted snippets that it found so the weight of that chunk of data coming over the wire is really very small.
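Editor's note: a minimal sketch of the two ideas in this answer, under stated assumptions. The first function models the size policy (skip files over roughly 50 KB today, possibly truncate them in the future). The second builds Solr parameters that ask for metadata and highlighted snippets while leaving the full body out of the returned fields. The field names (`repo`, `path`, `language`, `body`) are hypothetical; `fl`, `hl`, `hl.fl`, and `hl.snippets` are standard Solr parameters. Whether this exact combination resolves the behavior Tom describes in his setup is not something the interview confirms.

```python
MAX_INDEXED_BYTES = 50 * 1024  # the roughly 50 KB cutoff mentioned above


def body_to_index(content: bytes, truncate: bool = False):
    """Return the bytes to index, or None to skip the file entirely.

    truncate=False models the current behavior ("don't even bother");
    truncate=True models the future behavior Tom says he would like.
    """
    if len(content) <= MAX_INDEXED_BYTES:
        return content
    if truncate:
        return content[:MAX_INDEXED_BYTES]
    return None


def highlight_params(terms):
    """Solr parameters for metadata plus snippets, without the full body."""
    return {
        "q": terms,
        "fl": "repo,path,language",  # stored fields to return; "body" is left out on purpose
        "hl": "true",                # turn highlighting on
        "hl.fl": "body",             # look for highlights in the body field
        "hl.snippets": 3,            # at most three snippets per document
    }
```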

Randall: So if you--when you start looking at the other problems with search, specifically what problems in search with code and GitHub like scale, specific types of indexing, different types of files, parsing, lexing, that sort of thing. Have you run into any challenges that definitely sort of fall outside of the regular range?

Tom: Yea, well indexing code is a special kind of thing. It’s not like indexing prose because you’ve got all these special symbols and you have to decide well, are we going to index periods, are we going to index curly braces, are we gonna index hash symbols, etc. Normally in English you’re not gonna do those things; you just throw them out--you don’t tokenize them at all. So we have to decide what is the appropriate way to tokenize those things to make the searching work the way that you expect it to work. Matching on more literal sets of things because if you’re searching for--I don’t know what a good example is but if you want to search for “something dot something” then traditional tokenizing is just gonna take that period and throw it away but it is semantically meaningful in the sense that you want that chunk of those two things together with a period between them.
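Editor's note: the “something dot something” problem can be shown with two toy tokenizers. This is only an illustration of the trade-off, assuming simple regular expressions; it is not how Solr's analyzers work internally, and the function names are made up for the example.

```python
import re


def prose_tokens(text):
    """Prose-style tokenizer: keep word characters, throw punctuation away."""
    return re.findall(r"\w+", text.lower())


def code_tokens(text):
    """Code-aware tokenizer: keep identifiers and keep each symbol as its own token."""
    return re.findall(r"\w+|[^\w\s]", text)


def contains_phrase(tokens, phrase):
    """True if the token list `phrase` appears as a contiguous run in `tokens`."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))
```

With the prose tokenizer, `File.open` and `File open` produce the same tokens, so a search cannot tell them apart. With the code-aware tokenizer the period survives as a token, so a phrase match on `File.open` only hits text where the period is really there.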

Randall: It could be the difference between a class and a class method.

Tom: Right, exactly. And you want that to be a relevant thing. You want it to notice when that happens as opposed to if you throw out that information then it’s going to match them because they’re close together but it’s not gonna care that there was actually a period between them. But that is semantically useful information so now it’s--those are choices that I haven’t really made yet that I’m still going through. Trying to figure out what the best tokenization strategy is for code. And there’s other things like Porter stemming which is a common type of stemming so that you match--if you wanna match--if you’re searching English and you type in “run” you would also like to potentially match “running”, “ran”, those types of things. Porter stemming’s not going to match “ran” but it would match “running” and if you type in “running” it’ll match “run.” It basically reduces each word to the most basic component. It would take “running” and it would index the word “run” instead and as long as you do that on the query and what you’re indexing then those two things will match. So another question is, do we do Porter stemming on code? And that’s not something that most people think about but in some sense, that makes sense. Like, if you’re going to match something that’s in a comment, then that’s English.

Randall: Right.

Tom: If you’re going to match something that’s in code probably you don’t want to do stemming but at the same time, maybe you typed it a little off but that it would stem the same and maybe you do want to match that. I’m leaning towards not doing Porter stemming. This is probably not something that is that interesting to most people but it is interesting to me right now and going through those kinds of decisions as to how do you create and craft an index in the right line.
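Editor's note: the point about stemming both the query and the indexed text can be sketched as follows. The stemmer here is a deliberately tiny suffix stripper written for this example. It is NOT the real Porter algorithm, which has many more rules; it only reproduces the “running” to “run” case from the conversation and, like Porter, leaves “ran” alone.

```python
VOWELS = set("aeiou")


def toy_stem(word):
    """Very rough "-ing" stripping; a stand-in for Porter stemming, not the real thing."""
    word = word.lower()
    if word.endswith("ing"):
        stem = word[:-3]
        # Only strip when what is left still contains a vowel ("string" stays "string").
        if any(ch in VOWELS for ch in stem):
            # Undouble a trailing doubled consonant: "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in VOWELS:
                stem = stem[:-1]
            return stem
    return word


def build_index(docs):
    """Map each stemmed word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(toy_stem(word), set()).add(doc_id)
    return index


def search(index, query):
    """Stem the query the same way the index was stemmed, then look it up."""
    return index.get(toy_stem(query), set())
```

Because the same function runs at index time and at query time, “run” and “running” land on the same index entry, while “ran” stays separate. The open question in the interview is whether that kind of folding helps or hurts when the text is code rather than English.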

Randall: Well no, but that’s kind of important right because now--if you think about it if I’m searching for comments in C code versus C++ code, multi-line comments versus single line comments versus Erlang shell scripts--cuz, actually right now do you have any idea what the stats are for like language distributions on GitHub? I know for a while you guys were keeping track of which languages were most accurately represented.

Tom: Yea. So JavaScript just recently became the most popular language, followed now by Ruby. Now part of that might come from counting JavaScript code more than it should be because people almost always include the JavaScript libraries in their code bases so you’ll see jQuery included in a lot of things and we do take measures to not count popular known JavaScript libraries in the total code distribution of projects when we count it but that’s not perfect. So JavaScript is generally going to be a little skewed up--but I can look it up and tell you the exact breakdown--JavaScript 18%, Ruby 18%, Python 9%, Perl 9%, C 7%, PHP 7% and then a handful of other ones. That’s the basic breakdown as far as we’ve automatically detected.
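Editor's note: a rough sketch of how a language breakdown could be computed while skipping well-known bundled libraries, as Tom describes. Everything specific here is an assumption for illustration: the extension map, the list of library name prefixes, and the choice to weight by file size. The interview does not say how GitHub's detection actually works.

```python
import posixpath
from collections import Counter

# Hypothetical lookup tables, for illustration only.
LANGUAGE_BY_EXTENSION = {
    ".js": "JavaScript", ".rb": "Ruby", ".py": "Python",
    ".pl": "Perl", ".c": "C", ".php": "PHP",
}
KNOWN_LIBRARY_PREFIXES = ("jquery", "prototype")


def is_known_library(path):
    """Guess whether a file is a bundled third-party library from its file name."""
    name = posixpath.basename(path).lower()
    return name.startswith(KNOWN_LIBRARY_PREFIXES)


def language_shares(files):
    """files: iterable of (path, size_in_bytes). Returns {language: percent of counted bytes}."""
    totals = Counter()
    for path, size in files:
        if is_known_library(path):
            continue  # do not let bundled libraries inflate the language's share
        language = LANGUAGE_BY_EXTENSION.get(posixpath.splitext(path)[1].lower())
        if language:
            totals[language] += size
    counted = sum(totals.values())
    if not counted:
        return {}
    return {lang: 100.0 * size / counted for lang, size in totals.items()}
```

A name-based filter like this is easy to fool (a renamed or concatenated library slips through), which matches Tom's remark that the measures are “not perfect.”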

Randall: Got it. What’s fastest growing? JavaScript?

Tom: I’d say JavaScript is probably fastest growing. Ruby has been number 1 since we started and JavaScript overtook Ruby so I’d have to say that probably JavaScript is fastest although Python is growing very rapidly as well.

Randall: Oh interesting. Is that because you guys have been doing a lot more outreach to the Python community? I think--Is Chris just at PyCon or--?

Tom: No, although he’s been going to the Python conferences and Scott talks at a lot of the Python conferences as well. We’ve been doing outreach to Python and JavaScript pretty heavily. We go to those conferences and sponsor those conferences as much as we can. I think it’s just the general trend of dynamic languages are coming to get--they’re discovering as people go along, a lot of those projects are on GitHub already and they see them and they say, “Oh this seems really cool”--jQuery’s using GitHub. If it’s good enough for them, it’s good enough for me. It’s really great to get those big projects. Using GitHub drives a lot of those customers to us.

Randall: So you know, one of the things you guys actually sort of promote as the tenet of GitHub is social coding. So given that you’re actually outside of the Ruby community and I think Ruby was one of the first places where the idea of socially getting together to actually communally write code, sharing code, drink ups, meetups, all that sort of stuff, sort of really took off as an actual value of the community, what would you say are some of the differences or similarities between say, the Python social coding community or JavaScript social coding community compared to what you guys have seen in Ruby or Erlang, or other--?

Tom: The Ruby community and the way the Ruby community interacts is always kind of my model for how it should be done. It’s my favorite by far of all the conferences and all the events that I’ve been to. The Ruby community to me has always been so open and so interested in bringing the language forward and bringing forth new ideas and embracing new people in a really nice way. I can remember very distinctly when I was learning Erlang, it was not that way at all. I would go into the IRC room sometimes--I just wanted to know how I could write--how I could get the PID of a running process--just get the PID so I could write it out. And I needed to do that because I wanted to monitor it with god and I needed to know the PID file so I could have it under god’s control. And I went into the chatroom and I said, “Hey, how do I do this” and the initial first response was, “Why would you want to do that?” and I’m like, I could sit here and explain it to you but I really shouldn’t have to. I mean, it’s just like, people need to do things like that--I don’t know it seems like a pretty reasonable thing to me. But that was always kind of like, “Why would you want to do that” and I can sort of understand that in that they want to maybe guide you and understand why you would want to do something specifically which is the question--what are your motivations, although it always came off as aggressive. And some of the people in there were just not nice people.

Randall: No no no, actually I remember learning C++ the exact same way. You drop into a C++ room and people would basically post your code knowing that it’s bad but instead of sitting there telling you why it was bad, mostly you’d get peals of laughter followed by essentially abuse heaped upon your shoulders.

Tom: Yea and I think that’s terrible. And Erlang honestly has gotten way better. I go to the conferences and stuff and their community is really nice now. I think it is because they have so many new people coming in and they understand what it’s like to not know that language and so they generally treat people well whereas things like Erlang and C and these things that require a little more understanding of complex ideas moreso than things like PHP which were kind of built to be understandable and don’t do a lot of the really intense craziness like concurrency and hard core memory management that Erlang and C do respectively. If you don’t have new people constantly coming in all the time in droves, then you can fall into that easy place of being--

Randall: It can become a Haskell?

Tom: Yea, exactly. It’s just a natural tendency for those groups to become insulated from new people just because they like the way they are and they like being in that niche and having more knowledge than other people. Having new people come in all the time changes that and that’s what Ruby has basically always had since Rails came about and even before that--everyone was a beginner in Ruby and Matz made Ruby--the favorite saying I always had and people used to use it and they don’t use it so much anymore but there was an acronym MINASWAN which stands for Matz Is Nice And So We Are Nice and that was one of the big things that people would always throw around if someone was being a jackass in a chat room. They would say, “That’s not how we work here because that’s not how Matz works” and that really defined the Ruby community and I always loved it for that.

Randall: Nice. So actually, that’s a good point. I mean I think a lot of people have gotten together socially I mean you guys have actually taken the idea of social coding--if anybody has ever been to a GitHub drink up, if you haven’t and you’re listening to this I highly recommend even flying into one if need be--but you guys constantly actually have these social interactions around your code so how does that actually vise what GitHub sort of sees as its philosophy or its place in getting people to be nicer and to interact.

Tom: Going forward, that’s what code is going to be all about right--writing code on your own is becoming less and less of a thing because what hasn’t been written yet is becoming more and more increasingly complex and so it’s hard to do on our own. And so getting people to work well together in the coming years and decades is going to be where everything is about. It’s not going to be a solo hacker in his room writing Make or something right. It’s just not going to happen that much anymore because those things are already written and they already work really well and it’s going to be more and more collaboration.

Randall: Actually you know what, if I remember correctly, I think last night there was a GitHub drink up.

Tom: Indeed there was.

Randall: And Josh Susser said that you and the other founders of GitHub actually met at, like, an IcanhazRuby ICHR meeting so you guys actually met at a social Ruby gathering.

Tom: Yea. That’s actually correct. I mean I knew Chris and PJ through the Ruby community and through the Ruby meetups and the IcanhazRuby was a special, elitist gathering of--

Randall: Oh I remember, if this is your first time at ICHR you must show code.

Tom: Yea. It was, I mean--when I say elitist it was really the VCs weren’t invited was the main thing right. It’s not like it was hard to go to one, it was just that we wanted it a little bit more private because the VCs had sort of encroached upon the San Francisco Ruby Meetup and they were no longer tolerable to attend. We were kinda doing a grass roots new one that we could kind of control a little bit better that was much more based on code specifically than it was about VC pitches cuz it’s really hard to uninvite a VC once he’s there. And it’s like, VCs are great--I love VCs and all but they shouldn’t be going to those kinds of meetings, that’s not what they’re about.

Randall: Interesting.

Tom: But, yea that’s how we met. We met through social interactions, not through code specifically. I mean, through code incidentally but ya know—I need a little refresher on that.

Randall: A little more whiskey here.

Tom: A little more whiskey here. Please. Thank you.

Randall: You know what they say—by the time we get to this bottle we get to the bottom of the truth.

Tom: But I encourage--in my talks and things--I encourage people to go to meetups and if there isn’t one, to create one because that’s where you’re going to meet the people that you’re compatible with intellectually and those are the people you can code with really well because it’s hard to know just meeting them online, what they’re like and if you’re going to mesh really well. And I think meeting someone in person and having those outside of the code interactions is really worthwhile for writing the code because eventually you’re going to want to get together in person and if you don’t mesh at all in the real world then that’s kind of an unfortunate thing, especially if you’re looking to start a business and meetups are a great place to do that--to find people who have similar ideas to you. That’s what you’re really looking for--people that have similar visions of the future.

Randall: Ok so it’s interesting right, maybe infamous is possibly not the right word but you certainly have a little bit of notoriety for writing a blog post--I think it was called “Why I Turned Down a $300,000 Payout from PowerSet.” So you just mentioned the idea of finding co-founders and I think that a lot of people, especially in the community right now--they see themselves as where you, Chris, PJ and Scott were say a couple of years ago right. So given what you’ve learned over the past intervening years, where would you tell them to start, what would you advise them to do and what would you advise them not to do? What’s the sort of gem that you’ve learned?

Tom: There’s a few simple things that I identified in thinking about how we were successful early on and this is kind of specific to a bootstrap and I think bootstrapping is really great for a lot of reasons mainly because you don’t have to spend a lot of time raising money to do it. You can just start right away and as long as you have that in mind then you can accomplish really good things but you have to pick the right sorts of things to bootstrap and so if you are in that position and you’re looking to bootstrap something then what I suggest is find an idea that has the elements of virality and community--those things specifically meaning virality being the ability for something to propagate itself without a lot of effort on your part. So, for example, in the case of GitHub, GitHub is viral in that if you put your open source project on GitHub and you send the link out to all the news feeds and whatever and put it on your blog and Tweet it and stuff, then you’re driving people to GitHub to look at your code. Now when those people get there and look at your code they are also discovering GitHub the service. So in helping you do what you do better, you’re pointing people at our company and in that way it’s viral. It’s viral because people want you to go to our site because it makes them more awesome by them seeing your code. So the viral component is essential because you don’t have a lot of time when you’re bootstrapping or a lot of money to do advertising and to be talking to people constantly. You want your users to do that for you, and that’s virality. And then community is, once you have those people, how do you keep them sticking around. I mean it’s easy to be viral and short-lived.

Randall: You’re a meme.

Tom: Exactly, exactly. So YouTube has a lot of virality but not a lot of community and while they’re still immensely popular--I think it’s because people like funny videos, not because it’s like an amazing service otherwise.

Randall: Yea, but actually you know, you’ve got a point because ironically though, it’s like Funny or Die--that’s kind of how they get their take off because they actually try to go with both virality and community right--in college.

Tom: Yea they’re trying to build communities of people that have commonalities and talk to each other and that’s really where the collaboration part of GitHub comes in--it’s getting people to know each other and work together and follow each other and be interested in what people are doing so that they’ll come back and see what those people are doing tomorrow--that’s community. So virality is getting people there and community is keeping them there and with those 2 things, you can build a huge user base with a very small amount of effort. So I would say think about those things if you’re thinking about building a company or if you’re in a bootstrapping situation or thinking about doing it--think about those things because if you’re selling an internal management solution to businesses, you have to be on the street knocking on the proverbial company door every day forever--there is no virality in that, there is no community in that. It’s not going to sell itself and once the people have it they’re not going to talk to other people and come back right. It’s just a completely different business. And the beauty of the Internet is that if you think about things in the right way and think about how you can build in those elements of virality and community--and you can do it in a lot of situations, you just have to understand that they need to be there. Once you start looking for ways to include virality and community in an idea, a lot of ideas are amenable to them you just have to add them in.

Randall: Ok. Then, what was the largest mistake you made after you bootstrapped? Like what was the big thing like looking back now--you’ve got that moment, you smack yourself in the forehead you’re like, “Oh my God, I can’t believe I did that”?

Tom: I think probably the biggest thing is some of the financial stuff regarding how stocks work--are awarded to new employees and what not. Not coming from a really heavy business background and not having done a startup previously, you don’t really understand how options work versus actual stock grants and stuff like that and if you don’t get those things right in the first place, then it takes a lot of legal manipulation to get them correct again.

Randall: Legal manipulation? Ok, that sounds like something where like we’re going to edit that out of the podcast.

Tom: Well no, it’s not like that--it’s not like we’re doing anything illegal. It’s just that, you then have to go back and, on paper, fix those--you know, we’re not screwing anyone over and we’re not getting screwed over it’s just that if we would’ve done things a little bit differently to begin with as far as how things were--

Randall: How you structured them?

Tom: How things were structured then it would’ve been easier. Like, everyone still ends up at the end of the day with what they should have had, it just cost more money to get there. So I’d say look at the business things and especially know how stocks and stuff--how options and stocks and things work.

Randall: So that’s interesting. Ok, so I know there are from time to time if you look on Hacker News, you see these things in search of the technical founder, and it seems there’s always a tech guy looking for a business guy and a business guy looking for a tech guy but they never seem to find each other. It’s like they don’t--you know the running blog post right. The technical co-founder doesn’t exist and then the counterpoint, the business co-founder doesn’t exist. So given that this is the case and given that you’re sitting here saying, “hey, I wish I had more business acumen about how to set things up” why do you think there seems to be this cross connector? This--why are these two ships passing in the night, I suppose?

Tom: I think, I think it’s because technical people and business people are generally very different in how they perceive the world and each other so the common thing is that the business guy thinks that his idea’s amazing and all he needs is some code jock to--code jock? Why did I say--code jock? Is that a thing? Some Code Jock. I wish there was such a thing back in high school, right.

Randall: Code jock? You could letter in code? Technically if you lettered in code what would you get? A C?

Tom: I suppose so? But, the business person perceives what the code person does as like, simplistic and the code person sees what the business person does as simplistic and so neither of them ever sees the other one--

Randall: I was going to say vindictive, but simplistic works.

Tom: It’s just--you don’t understand what goes into it and the knowledge that you have to be an expert at that situation. The business guy’s just like, “well aren’t you just opening up FrontPage and typing in PHP” and they don’t even know what those things are, right. But to them it’s so simple. And to the coder, the business stuff is just like, “don’t you just file a bunch of paperwork?”

Randall: Right.

Tom: But it’s not really like that. And until those 2 people understand and appreciate the complexities of each other, then they’ll always complain that there’s never a good fit for who they’re looking for. I guess--it’s the way I see it.

Randall: Yea well, that’s what I asked, contextually we’re curious. So what’s next for GitHub? I mean you guys actually have done a great job at engaging with multiple communities. You have the--I think you guys passed SourceForge like a while back right?

Tom: In some metrics. We passed them on number of repositories, like a year ago, at least.

Randall: Right. And so, you guys are growing, you’re doing really well. So what’s next? What are the problems you want to tackle? What are the things that you think are going to confront social coding as a concept that you’re going to have to deal with?

Tom: In addition to just refining the website itself and the way that people work, making pull requests better and more powerful, making issues better and more powerful, making search better--and you know, there’s all these different things. But for me, what I’ve been thinking about is, we focus a lot on Git users and Git the technology and code writing specifically, and in doing that, are we missing out on a huge chunk of people that could use the site, even if they don’t do some of those things? So an example from when I was at Startup School giving a talk there, one of the founders of Airbnb gave a great talk about kind of his journey from creating the concept to where they are now which is very successful. And they started out being very, very niche in that it was all about literally--air mattresses and breakfast in the morning. That was their thing--that was their idea and after a little while, they realized that they were pigeon-holing themselves too much and what people really wanted was the ability to list properties of any type on their own without having to go through a listing agency. So instead of becoming, literally, air mattresses and breakfast in the morning, it was, “hey I have this rental property I’m not using right now and I can make money off of listing it” or I do have an extra room in my house but it’s a total room or maybe it’s even a separate part of like a guest house, whatever, and it has a real bed and no I’m not going to serve you breakfast in the morning--that’s stupid. Cook your own damn breakfast. And once they realized that the market was bigger than what they had originally pigeon-holed themselves into, they exploded in growth and popularity. And I’m wondering if maybe we are the same somehow in that maybe we are restricting ourselves too much artificially by focusing on people writing code and sharing code. Are there other things that technical people would like to do that we could facilitate? And that’s kind of purposefully vague because I don’t know the answer myself.

Randall: Nothing but questions, nothing but questions. So I think one of the things I heard being bandied about is that you guys actually have a conference coming up. Are these rumors true? Could this actually be?

Tom: The rumor is indeed true--yes. We are organizing a conference and it is going to be the weekend of April 8th and it’s going to be a 2 day conference. It’s going to be called CodeConf. It’s going to be all about code. We have a great set of speakers that have agreed to come there and I’d like to rattle off a few of their names. We have Wrench--is his handle but he’s the one who wrote ClickToFlash which we--

Randall: Wait, I’m sorry. Did you just say wrench was his handle?

Tom: Yea, wrench is his handle.

Randall: Without irony, no pun intended. Not even like a--

Tom: Don’t worry about it Randall. I think on a different plane than you so to me it was obvious.

Randall: Oh yes, of course. You think on a different plane--inclined plane, is that it?

Tom: I’m very inclined. But Wrench--yea ClickToFlash was one of the first really super popular projects that was a native OS X software--before that there wasn’t a lot of Cocoa on the site at all and he moved that over from Google Code and people just started watching and forking it like crazy--adding features and made it way better. That was a crystallizing moment for me in really seeing that what we were doing as far as the collaboration stuff like to really see it viscerally happen at a rate you could perceive. Like you go there and the next day and there would be like 10 forks with 10 new features that Wrench had pulled in and now you have a new version now. And it was all because he made it easy for people to contribute by moving his code to GitHub. So that’s why we invited him and he’s awesome. We have Dr. Nic, who I believe you know.

Randall: Yea, I was going to say I’ve never met the man. An angry Australian if I remember correctly.

Tom: Indeed. Angry about what though?

Randall: I don’t know, maybe the fact that he’s truly Tasmanian?

Tom: Is he really?

Randall: Yea, actually I think he’s from Tasmania

Tom: Oh, awesome.

Randall: Yea cuz he got me some whiskey from--Tasmanian whiskey.

Tom: I did not know that they made whiskey there.

Randall: Trust me, until he brought it to me, neither did I.

Tom: Wait, I think we had this discussion actually--about importing Tasmanian liquor.

Randall: Yes indeed we did, yes. As I say, not to say that my memory is affected by my drinking at all.

Tom: Andy Lester, the author of ack, only the greatest code searching tool of all time.

Randall: So, keep rattling off the names. It’s a long but illustrious list.

Tom: Mojodna is his handle.

Randall: Jesus--

Tom: Is that really his name? Mojodna?

Randall: For all you know, Osama Bin Laden could be appearing at your conference.

Tom: You know, half the people I know, I don’t know their real names. They’ll come up to me at drink ups and be like, “hey I’m John Smith--”

Randall: John Jacob Jingleheimer? I know him.

Tom: Yea--

Randall: The funny part is at one point in time I really thought that was my name too but as it turns out--no, no it isn’t. It was just a dream.

Tom: We all have dreams. But he’s going to talk about geo-location. Jacob Kaplan-Moss, one of the main people at Django, Jeremy Ashkenas from CoffeeScript.

Randall: Oh yea.

Tom: Which is amazing. Valerie Aurora is going to talk about the Linux kernel, highly involved there. Coda Hale, one of my favorite people, is going to talk about something, security I hope. I don’t know--as long as he is himself, that’s all I care about. Ariel Waldman who is an astronomer/technologist working on things like putting satellites into space, like small satellites. Doing really cool stuff with like space and what not so I look forward to seeing what she has to say. Amanda Wixted who works at Zynga, to talk about iPhone code. Nicole Sullivan who does OOCSS--object-oriented CSS which is an awesome CSS framework that a lot of people use. Ryan Dahl, Node.js--been doing a lot of Node.js recently.

Randall: You have?

Tom: Yea but a lot of it’s internal. We have an internal thing called Hubot which sits in our Campfire room and does things for us.

Randall: I heard but apparently it doesn’t talk to Arduino yet.

Tom: It doesn’t but it will.

Randall: Or will it?

Tom: Give it--I have it ordered, Arduino.

Randall: I was going to say did you get the Prag Prog book?

Tom: I haven’t bought the book yet, no.

Randall: There’s an Arduino book that just came out from Prag.

Tom: Oh really?

Randall: Yea. I haven’t read it yet but Prag--Pragmatic, it’s gotta be good. OK, cool, right, so Dahl’s there?

Tom: Yup, Dahl. And Gina Trapani from Lifehacker and Mike Krieger from Instagram, one of my new most favoritist things.

Randall: Yea, actually I know. It keeps coming up--trust me. I see your life in like Polaroid style very flashy, strangely attractive pictures.

Tom: Yea, I love it maybe too much. But that’s the lineup right now, subject to change, obviously, but we hope all those people can make it.

Randall: Cool. So what size is the conference going to be? Are you going for like small and intimate, huge and gigantic?

Tom: We want to target--we’ve been--the idea is to target more of kind of the business crew so that we can get companies who are going to try to foot the bill for people to come, which makes it so that a wider audience of people can make it there, and in light of that, we’re going to target--I think that the number that I’ve heard is 300, which is not tiny but not huge.

Randall: Yea that will sell out in an hour. Do you have a venue in mind?

Tom: Yes. It’s going to be at the Hyatt, right near the Ferry Building, got it. That will be the hotel where the conference is and where we will have discounted rates. Like, we really want to get people together in the same space.

Randall: Have you called Maker’s Mark and asked them whether or not they can fly in enough whiskey for PJ?

Tom: No, but that’s a good idea.

Randall: You should get them to sponsor the conference.

Tom: Sponsored by Maker’s.

Randall: Just saying.

Tom: Not a terrible idea.

Randall: So yea--actually there is one other question I had, curious as to how you deal with it. A lot of people--there are those who actually, forgive the pun, “Git it” and they use Git, but then a lot of people the first time they come to Git, especially if they were on CVS or Subversion or something else or God forbid Perforce or something else like that, they seem to have a cognitive block and say that Git’s too difficult to use. So what are you guys seeing in terms of people--because you have so many new people coming into the community, what do you guys see in terms of people who are adopting Git--first time users. Do you get that a lot? Do you have people complaining about user use or burden?

Tom: It happens. It happens less, I think, because there are a lot of pretty good resources online now and people realize--they see Git everywhere, like well why am I not using this yet? Guess I better go check it out. They search for Git and they end up at git-scm.com which is a site that Scott Chacon has put together which is now the main Git site. And from there you can find the book he’s written and download it for free--you can download the PDF for free of Pro Git by Scott. And I think that alone, him writing and releasing that book with the technical editor being Shawn Pearce--probably, almost inarguably the number 2 person in the Git core project itself is the technical maintainer--so everything has been vetted by someone who knows everything. And that book alone and being available for free, and Scott lobbied really hard to make it available for free because he wanted it to get to a broader audience and he knew it was important for that to happen, so it’s under a Creative Commons license, and that event alone I think changed the perceptions of a lot of people. Here’s a book that’s written in the way a book should be written, that’s approachable, and I can go through and I can actually understand this stuff instead of trying to wade through the man pages or the documents that come with Git which are written by the Git developers which I love but are not, you know--their specialty is not writing help documentation and so being approachable in a book format, I think is huge. And so, education is a lot better now and people have that problem less and as GitHub gets better then people have even less complaints because they can understand how things work like pull requests and we give them hints. We say, “here’s a pull request and if you want to merge this, here are the commands to run” and so reducing that friction between getting something and acting upon it, we try to do as much as possible and in doing that, reduce the stress that people have using Git because they don’t have to worry about memorizing 17 options to some Git command. But we are constantly working on making that better. Education is gonna be another focus of this year and really getting together some truly excellent resources and putting together all the resources that we have right now that are kind of spread all over the place. Like Scott has like 5 different websites that have videos and tutorials and stuff because Scott likes to produce stuff. Scott is not as good at curating and putting stuff all together. He’s an idea man and an execution man but like getting them coherently together is something that we want to move towards.

Randall: So what you’re telling me is that you’re having problems merging Scott’s--

Tom: No, it’s just that Scott is just a fountain of awesomeness and we’re just trying to build a system of pipes capable of routing--

Randall: Ok this strained eating scenario fails right here because otherwise--

Tom: You know what I’m getting at.

Randall: Yea. So what else? You’ve got a great audience, you’ve got this platform here. People are interested in what you’re doing so what else do you want people to know? What should I have asked you but didn’t--I failed too miserably because I was too focused on that bottle of whiskey.

Tom: Just staring at it. We covered a lot of the stuff I like to impart upon people: some of the business stuff, some of the company stuff. GitHub FI--we’re going to be doing a big push on FI, which is the firewall install, the locally installable version.

Randall: Yea actually you know what, that actually brings up a very good question which is what do you guys see as the adoption rate of Git inside of enterprises because FI was designed to be used by people who just can’t host their assets outside of their own environment so are you guys seeing a lot of adoption, a little adoption? I mean are the Perforce guys running scared yet?

Tom: I don’t know that they’re running scared although, I think the landscape is starting to change a little bit in that we are seeing a lot of interest in FI from larger companies. I mean we have AT&T Interactive using FI, Zappos is using FI, a lot of companies are starting to come to us and say, “hey we’ve heard about this. A lot of our developers think it would be great to have internally” and some of them you wouldn’t expect and some of them I can’t even say their names because they’re so enterprisey that they don’t even let you do that. I mean there’s a lot of enterprises and we’ve only seen a few of them but in that we’ve seen any of them, I think is really promising and we’re going to see more and more--it’s inevitable.

Randall: There we have it. Git is inevitable. Yea, thanks so much for the time and I will be happy to come by any time and bribe you with another bottle of whiskey so we can chat.

Tom: I think that’s a pretty good arrangement.

Randall: Works for me. Thanks for tuning into Cloud Out Loud podcast. As always, send any questions to us at Engine Yard or the crew over at GitHub. You know how to find us if you’re actually listening to this. Thanks.