[linux-elitists] My first look at BitKeeper. (fwd)

Eugen Leitl eugen@leitl.org
Thu Mar 13 23:21:43 PST 2003

---------- Forwarded message ----------
Date: Thu, 13 Mar 2003 16:20:53 -0800
From: Adam Rifkin <Adam@KnowNow.com>
To: Rohit Khare <rohit@ics.uci.edu>
Cc: Ben Sittler <BSittler@KnowNow.com>, Tommy Hui <thui@KnowNow.com>,
     kragen@pobox.com, gregburd@mac.com, wsanchez@wsanchez.net
Subject: My first look at BitKeeper.
Resent-Date: Thu, 13 Mar 2003 16:33:24 -0800
Resent-From: Rohit Khare <rohit@ics.uci.edu>
Resent-To: Fork@xent.com

[Adam found these bits... RK]

BitKeeper itself seems like a really nice versioning system for 
distributed development of big projects...


CVS has a single repository model. Each work area is clear text only 
which means no revision control in the work area during development.
BitKeeper provides staging areas. You can mimic CVS by having one 
master repository and several work areas. You can also extend that to 
have one master and several staging areas with several work areas below 
each staging area. This allows people working on related projects to 
merge amongst themselves before merging into the master. Anyone who has 
lived through a change that broke the build can see the value of 
staging areas.
Merging in CVS is primitive at best.
Branch management in CVS is a nightmare.
CVS has no change sets, i.e., no atomic commits of changes which span 
multiple files.
CVS has no rename support.
CVS was based on RCS and still has RCS' limitations.
On the plus side, CVS is free, works well enough for some development 
projects, and CVS repositories are easily converted to BitKeeper.
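
The master / staging area / work area arrangement described above, together 
with the idea of a change set that spans several files, can be sketched 
roughly as follows. This is only an illustrative model, not BitKeeper's 
implementation or command set: the names Repo, ChangeSet, commit, push and 
tree are all hypothetical, and real merging and conflict handling are left 
out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChangeSet:
    """One atomic change that may span several files (hypothetical model)."""
    ident: str
    files: Dict[str, str]  # path -> new contents


@dataclass
class Repo:
    """A repository with its own history; parent is the next level up."""
    name: str
    parent: Optional["Repo"] = None
    history: List[ChangeSet] = field(default_factory=list)

    def commit(self, cset: ChangeSet) -> None:
        # Recorded locally, so even a work area keeps revision history.
        self.history.append(cset)

    def push(self) -> List[str]:
        """Send change sets the parent lacks; return the ids that were sent."""
        if self.parent is None:
            raise ValueError(f"{self.name} has no parent to push to")
        known = {c.ident for c in self.parent.history}
        missing = [c for c in self.history if c.ident not in known]
        # Each change set is transferred whole, never file by file.
        self.parent.history.extend(missing)
        return [c.ident for c in missing]

    def tree(self) -> Dict[str, str]:
        """Replay the history to obtain the current file contents."""
        files: Dict[str, str] = {}
        for c in self.history:
            files.update(c.files)
        return files


# One master, one staging area, two work areas below the staging area.
master = Repo("master")
staging = Repo("staging", parent=master)
alice = Repo("alice", parent=staging)
bob = Repo("bob", parent=staging)

# Alice changes two files in a single change set; Bob depends on her change.
alice.commit(ChangeSet("a1", {"api.h": "v2", "api.c": "v2"}))
bob.commit(ChangeSet("b1", {"client.c": "uses v2"}))

# Both land in the staging area first; the master is untouched so far.
alice.push()
bob.push()
master_before = len(master.history)

# Only once the staging area is in good shape does it go up to the master.
sent = staging.push()
```

The point of the sketch is the ordering: related work meets in the staging 
area first, so a change that breaks the build can be caught there before 
the master ever sees it.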


Perforce maintains state in a database next to the RCS files. In order 
for this state to be consistent with the RCS files, you must access the 
RCS files only through the Perforce daemon. The database is a single 
point of failure; if it gets corrupted, your source management system 
does not work. The real problem is that when the database gets 
corrupted, there is a high chance that you need Perforce to straighten 
it out.
The Perforce daemon is a bottleneck. Long running operations lock out 
all other users. This isn't a problem with small repositories, only 
with large ones. Scalability is an issue.
Perforce uses the RCS file format with all of the problems that entails.
The database can use a dramatic amount of disk space.
The main issues are scaling and reliability.


Among the projects hosted by BitKeeper are MySQL and Linux.


BitKeeper's dual-license scheme is really clever about how it enforces 
the scheme...


Here's a writeup on that...


Not quite Open Source

Larry McVoy is out to change the way cooperative software development 
is done, and he may just pull it off. But he also seeks to make a 
living from his work, and his way of achieving that goal has put him in 
conflict with the Open Source Definition. His novel way of extracting 
revenue from proprietary software developers may well fund the creation 
of a great new free software tool, but it also has shown that "Open 
Source" is not everything.

Some background is in order. Larry has built up an impressive résumé 
over the years, with stints at places like SCO, Sun, SGI, and Cobalt. 
Much of that time has been spent hacking on one kernel or another, and, 
at Sun, putting together configuration management tools. So when he set 
out to create a new free tool to address some of the problems that have 
come up in the Linux kernel development process, he had a lot of 
experience to bring to the task.

The result, a system called BitKeeper, is now nearing readiness. 
BitKeeper provides all of the features of systems like SCCS or CVS, and 
a lot more. BitKeeper was designed from the beginning to work with 
multiple source repositories, and to facilitate moving patches from one 
repository to another. Included are some nice graphical tools for 
managing and merging patches. To learn more, see the BitKeeper web 
site.

Larry's stated goal is to have every free software project using 
BitKeeper within a few years. He may just get there. The multiple 
repository scheme is designed to work well with large, 
globally-distributed development teams. The patch management allows for 
the handling of changes, and for filtering these changes on their way 
up to the "master" repository. In the Linux kernel case, this means 
that Linus can benefit from much greater peer review of patches before 
he has to see them. With some luck, the result should be a reduction in 
the number of "Linus does not scale" burnouts that have occasionally 
halted kernel work in the past.

As part of Larry's approach to world domination, he intends that 
BitKeeper be freely available for any free software development team 
that wants it. That includes source availability, ability to distribute 
modified versions, etc. But Larry also wants commercial software 
companies to use his system, and he would like for them to pay for the 
privilege. After all, he estimates that about four person-years of 
effort have gone into the development of the system; it would never 
have happened without some expectation of a return on that investment. 
And it's his way of getting them to pay that has put him in conflict 
with the Open Source Initiative.

To understand the problem, it's necessary to understand two features of 
BitKeeper and its license. BitKeeper includes a logging feature. Once 
multiple repositories are in use, BitKeeper will log all changes to a 
central server; these logs will be made available via a web page. Thus 
anybody can go to the web site and see what's happening with any 
development project out there which is using BitKeeper.

BitKeeper's license allows for modifications, but under one 
restriction: all modified versions must pass a regression test. Other 
free systems (e.g., Perl) have regression tests in their licenses, but a 
modified version which is unable to pass the test simply loses the 
right to use the original name. Versions of BitKeeper which fail the 
test may not be used at all. And yes, the regression test checks to be 
sure that the logging feature has not been removed or disabled. If you 
turn off the logging, you violate the license.

The reasoning behind this move is the following: Larry believes that 
free software projects want their work to be in the open anyway, and 
will not be bothered by the logging. Since the logging only kicks in 
when multiple repositories are used, individuals using BitKeeper to 
manage their diaries will not be affected. Proprietary vendors, by 
contrast, are not likely to be happy with having their change log 
messages broadcast to the world. For them, this restriction will 
probably make the system unusable.

At this point Larry shows up with a deal: the commercial version of 
BitKeeper doesn't do public central logging - you can direct the 
logging to an internal server. Pay the price, and you can use the 
system with your privacy intact.

There are a number of other features to the BitKeeper license. 
Subsections of the code - generally library modules that could be 
useful elsewhere - will be available under the GPL. If the logging 
servers go away, or if work on the system stops for two years, the 
whole thing goes GPL.

But that is not good enough for the "Open Source" designation, because 
the regression test requirement breaks the rules. Larry discussed the 
issue at length with the OSI folks, and was not able to get them to 
bend on the issue. He has since given up. BitKeeper is not Open Source.

The interesting thing is that, on a list for kernel hackers who intend 
to use the system, nobody really cares all that much. Even members of 
the OSI board have posted there, saying that the license is a good one, 
and that the lack of the "Open Source" designation should not be a 
problem. BitKeeper is free enough for that crowd, and they tend to be 
pretty fussy on these things.

So we have a situation where a license widely regarded as "free enough" 
does not qualify for [what is supposed to be] the free software 
community's mark of recognition. We may be seeing the future here: more 
"commercially crippled" licenses may well appear as more developers try 
to make a go at making a living from free software. When a lot of "free 
enough" software is no longer "Open Source," what becomes of the 
certification mark? Will people care about it any more?

Maybe the OSI should consider adopting a multi-tier designation. The 
top tier could be reserved for fully free code - perhaps with an even 
more restrictive set of criteria than what they have now. Lower levels 
could then be used to recognize software which is "free enough," but 
which does have some restrictions. Doing so could help the community 
distinguish between the incredible number of software licenses which 
are coming out, and could also help to preserve the relevance of the 
Open Source certification mark.


SCM systems are often a productivity bottleneck. Inexpensive entry 
level systems don't solve the problems you need solved. Traditional 
high end systems are resource and administration intensive. BitKeeper 
is light, fast, and exceptionally simple to use, yet it offers advanced 
features not found in even the most expensive traditional systems. If 
the following list sounds familiar, BitKeeper is right for you.

Merging. Do your engineers spend too much time merging? BitKeeper has 
the best-in-class merge algorithms and merge tools which reduce merge 
time to 1/10th of the time required by other tools.

Renames. Do you want to reorganize your source tree but can't because 
the SCM tool doesn't properly track file names? BitKeeper gets this 
right: files may be renamed at any time, in any work space, and the 
renames are handled correctly in all cases.

Geographically distributed. Do you have teams in more than one 
location? With centralized client/server SCM systems, all the remote 
teams suffer. BitKeeper is a peer-to-peer system based on a replicated 
database. All teams become local and enjoy local performance in a 
replicated system.

Work flow. Are you stuck in your vendor's idea of work flow? Ever 
wished you could modify it to suit your needs rather than their idea of 
your needs? BitKeeper is a peer-to-peer system, so arbitrary work flows 
that match your changing needs are no problem.

Reproducibility. Do you ever have to roll back to fix a bug in an 
earlier release only to find that your SCM system doesn't support that 
or get it right? BitKeeper guarantees 100% accurate rollback of all 
file contents, names, and permissions without requiring any forethought 
on your part. While other systems require that you remember to tag the 
tree, BitKeeper has no such requirement; all changes are potential 
rollback points.

Performance. Do you have to wait because your server gets too busy? Are 
you tired of spending more money on expensive machines to keep up with 
the load? BitKeeper's replicated nature spreads out the load over all 
your machines. A small and cheap PC can easily support thousands of 
developers. It would cost more than a hundred times as much to do the 
same thing with other SCM solutions.

Reliability. Do you have to wait for your SCM vendor to come unscramble 
their database? How about waiting on the overloaded or crashed SCM 
server? BitKeeper is based on a replicated database design which means 
the main integration server can crash without causing a problem. It is 
possible and easy to guarantee 24x7 uptime with BitKeeper.

Data integrity. Have you ever rolled back to fix a bug only to find 
that version of the database is corrupted? Most entry level SCM systems 
are based on the RCS file format and it is commonplace to have 
undetected corruption in those files. You'll find out when a customer 
insists on a bugfix in an old release and you can't get at that data. 
BitKeeper will tell you immediately if you have data corruption and can 
help you fix it.
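
As a rough illustration of how stored revisions can be checked for silent 
corruption, here is a minimal sketch that keeps a checksum next to each 
revision and recomputes it on read. The text above does not say how 
BitKeeper actually detects corruption, so everything here is an assumption 
made for the example: the StoredRevision type, the store and verify 
helpers, and the choice of SHA-256.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredRevision:
    """Revision contents plus the checksum recorded when it was stored."""
    data: bytes
    checksum: str


def store(data: bytes) -> StoredRevision:
    # Record a checksum at write time (SHA-256 is an assumed choice).
    return StoredRevision(data, hashlib.sha256(data).hexdigest())


def verify(rev: StoredRevision) -> bool:
    # Recompute on read; a mismatch means the stored bytes have changed.
    return hashlib.sha256(rev.data).hexdigest() == rev.checksum


rev = store(b"int main(void) { return 0; }\n")
ok_before = verify(rev)

# Simulate undetected on-disk damage to an old revision.
rev.data = rev.data.replace(b"0", b"1")
ok_after = verify(rev)
```

With a check like this the damage shows up the first time the revision is 
read, instead of when a customer asks for a fix to an old release.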

Reviewing and debugging code. Do you ever want to see all changes 
associated with a particular change in a file? Two clicks in BitKeeper 
will let you see that for any change in any file. We depend heavily on 
this feature to provide fast and accurate support to our customers. 
Without this feature, we would have to increase our technical staff by 
a factor of three to maintain the same level of support and 

Time to market. Do you need to get to market quickly, ahead of your 
competitors? BitKeeper will help you by reducing the time engineers 
spend merging, catching integrity problems as they happen, allowing 
work flow which matches your process, revealing quickly how and why 
changes were made, and providing excellent performance as you grow.

Cost. Do you spend as much or more on hardware and support personnel 
than on the SCM system itself? You are not alone; that is common for 
any medium or large installation. The replicated nature of BitKeeper 
means that a PC will work fine and there is no need for full-time 

Support. Do you ever have a problem or a question and spend 30 minutes 
on hold waiting for an answer? Does your SCM vendor relabel support as 
Professional Services and charge you extra? Our support is without 
equal in the industry; we are responsive to your needs and will work 
with you to deploy BitKeeper effectively, at no extra charge. Our 
customers frequently describe our support as the best they've ever 
