Last week I put a git repository for Nu on github. Over the weekend, I started pushing some of my smaller open source projects there, too. I think that github is a great, great tool, but that its value comes from something bigger than a site design, git, or any of the projects that it hosts. Let’s talk about that.
Get with DVCS
Distributed version control is a revolutionary change to software project management. If you haven’t seen Linus Torvalds’ git talk at Google, be sure to watch it and think about the political problem that Linus solved with git. With traditional version control systems like CVS or Subversion, software is maintained in a central repository that’s controlled by one or more committers. As a project grows, more and more committers are added, and both the burden on each committer and the trust required of each one grow as well. That’s because even though they share responsibility, each committer has the ability to break a project either directly, by submitting bad code, or indirectly, by making changes that take the project off course.
If you ask Fred Brooks, one of the biggest threats to a software project is the loss of conceptual integrity, and here he describes how to get (and keep) it:
To make a user-friendly system, the system must have conceptual integrity, which can only be achieved by separating architecture from implementation. A single chief architect (or a small number of architects), acting on the user’s behalf, decides what goes in the system and what stays out. A “super cool” idea by someone may not be included if it does not fit seamlessly with the overall system design. In fact, to ensure a user-friendly system, a system may deliberately provide fewer features than it is capable of. The point is that if a system is too complicated to use, then many of its features will go unused because no one has the time to learn how to use them.
“But,” it’s fair to ask, “how can this scale?” That’s where DVCS comes in. Each participant in a project has a complete copy of the repository that is functionally at the same level as the project leader’s. In Thomas Friedman’s words, with DVCS, “the world is flat” for programmers. All participants have the freedom to make whatever innovations they want and the opportunity to promote their changes to the public. Because all repositories are equally functional, the project leader has the ability to act as Brooks’ chief architect, pulling and filtering changes back into his or her personally-branded version. In the case of Linux, Linus has organized kernel development with loose layers of deputies that filter and tune submissions that he can eventually personally pull into the official Linux kernel. That’s a very robust organization, and it’s one that allows both projects and participants to compete and evolve.
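The pull-and-filter flow described above can be sketched with two throwaway local repositories. This is only an illustration under stated assumptions: the directory names `leader` and `contributor`, the file names, and the commit messages are all hypothetical, and a real project would fetch from a contributor's published repository rather than from a sibling directory.

```shell
#!/bin/sh
set -e

# Scratch directory so the sketch touches nothing real.
work=$(mktemp -d)
cd "$work"

# The leader's repository, with one initial commit (hypothetical content).
git init --quiet leader
cd leader
git config user.name "Leader"
git config user.email "leader@example.invalid"
echo "core" > core.txt
git add core.txt
git commit --quiet -m "initial design"
branch=$(git rev-parse --abbrev-ref HEAD)
cd ..

# A contributor clones the complete history and commits independently.
git clone --quiet leader contributor
cd contributor
git config user.name "Contributor"
git config user.email "contributor@example.invalid"
echo "feature" > feature.txt
git add feature.txt
git commit --quiet -m "add feature"
cd ..

# The leader fetches the contributor's work, reviews it, and only then merges.
cd leader
git remote add contributor "$work/contributor"
git fetch --quiet contributor
git log --oneline "$branch..contributor/$branch"   # commits offered for review
git merge --quiet --ff-only "contributor/$branch"
```

Nothing reaches the leader's branch until the leader runs the merge, which is the filtering step. The contributor never needs commit access to the leader's repository.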
What about Mercurial?
I know that many people are wavering between git and Mercurial, a similar distributed VCS written in Python. Bluntly, one reason that I prefer git over Mercurial is that git is written in C, and that means that I have more opportunities to integrate git in C-based systems that I might write in the future with Nu. But a bigger reason to go with git is community, and there I see github leading the way. Git on github makes it easy to fork and share repositories within a community of users, and as more people use git, its tools will only get better.
I’m putting my open source projects on github and inviting all interested programmers to fork them and start making improvements. In time, I expect that some or all of these projects will move to new leaders as others get involved who can better manage and drive them. For those that stay under my direction, I hope to work with a flexible and self-organizing community that builds software on a network of merit and trust.
But there are also lots of open source projects that I use and care about that I didn’t help write, but at times want to change or enhance. I’ll bet some of them are yours.
Are you ready to join us?
A couple of tips for moving git repositories to github:
- Make sure that you have your ssh key properly configured. Fortunately, there are great instructions online, but I tripped up by not following them exactly.
- If you are pushing an existing repository up to github, the site instructions tell you to set the origin of your local repository to your new github repository. Rather than change my origin, I told my local repository about another “remote” repository that I called “github” and pushed it to github like this:
git remote add github email@example.com:timburks/nu.git
git push github
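To try the second-remote approach without touching a real account, here is a minimal offline sketch. The two bare repositories `origin.git` and `github.git` are hypothetical local stand-ins for an existing origin and a new github repository, and the file and commit message are invented for the example. The branch name is passed to `git push` explicitly so the sketch does not depend on how a particular git version picks a default.

```shell
#!/bin/sh
set -e

# Scratch directory so the sketch touches nothing real.
work=$(mktemp -d)
cd "$work"

# Bare repositories standing in for the existing origin and the new github repository.
git init --quiet --bare origin.git
git init --quiet --bare github.git

# A local working repository whose origin is already set.
git init --quiet local
cd local
git config user.name "Example"
git config user.email "example@example.invalid"
git remote add origin "$work/origin.git"
echo "hello" > README
git add README
git commit --quiet -m "first commit"

# Add the second remote under the name "github"; origin is left untouched.
git remote add github "$work/github.git"

# Push the current branch to the new remote by name.
branch=$(git rev-parse --abbrev-ref HEAD)
git push --quiet github "$branch"

# List the configured remotes with their URLs.
git remote -v
```

Afterwards both remotes are configured, and fetches and pushes that name `origin` keep working exactly as before.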
It occurred to me that there is some common law that we need as a community. A lot of this is already established, but I wanted to put two of my own opinions in writing:
- The first regards project ownership. To me, it’s mainly a question of naming. Any properly-licensed open source project can be forked, but out of respect for the project’s creator and users, a fork should always get a new name. Here I mean a “big-F” fork, where a new person steps in and claims leadership of a project or when a project decides to split. github-style “little-f” forks shouldn’t involve name changes, but it should always be clear who’s got the definitive source.
- The second issue is licensing. I would prefer to work in a community that makes the fewest impositions on the future users of its intellectual products. That makes me lean toward the new BSD License, but there is one other important issue looming: software patents. To me, the best response to this is the Apache License. As I understand it, beyond guaranteeing basic freedoms, the Apache License creates a community of users and developers who agree that within that community and for the covered software, software patents will not be an issue. I think this is very important, especially for individuals who will be collaborating through sites like github. So for new projects, I prefer the Apache License and will be working to relicense Nu under the Apache License in the near future.
That’s it. Email me if you need a github invite.