Subversion conversion is completed; only small things left to fix.

Wednesday, 4 May 2005 | Zander

Now that I am upgrading from the old cvs to the shiny new svn, I can't help but compare the experience to real changeset-based, disconnected revision management software.

You all know the problem: you are away from the internet and still want to hack on your project. If you are anything like me, you depend on your revision management system for a lot more than simply committing your work at the end of the day. For example, I quite often check what changes I made, or revert to the latest committed version, since that allows me to be quite destructive in my search for the perfect way to do this bugfix or that new feature. Because of this I also commit a LOT more often than just at the end of the day: a typo fix is one commit, a repaint fix is another, and that new feature is definitely a separate one.

I went looking for a revision management system that supports this all the way back when all I knew was cvs. The moment svn hit 1.0, I downloaded a generic book on the software from O'Reilly and read through it. I was highly disappointed. svn is like a cvs v2: all known problems are fixed (and some new ones introduced) and that's it. It feels like upgrading from W95 to W98 (or Office 97 to Office XP); not something I'd do unless I really had to. Some time ago I found darcs (www.darcs.net) as an alternative, and I immediately fell in love with it :) It's incredibly usable, simple to use and much more powerful than cvs or svn!

Its disconnected nature is my number one feature. It's not clear why that's useful until you actually start using it and start adjusting your workflow in ways you never thought were appropriate before. The commit action you know from cvs is divided into two parts in darcs: you record your changes first (giving them a comment and all that) and then you push your changes to a central repository as a separate step. Since these are two commands, you can record more than one item before you push your changes. This is immediately useful when you are not connected to the internet! And you definitely will commit more often if you never have to update before you commit, since committing itself takes seconds when there is zero internet traffic going on.
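To make the record-then-push split concrete, here is a minimal toy model in Python. It is not how darcs is implemented: the `Repo` and `Patch` classes and the file names are made up for illustration, and the model ignores patch dependencies and everything else darcs does with patches. It only shows that recording touches the local repository alone, and that one push can deliver several recorded patches.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Patch:
    """One recorded change: a name (the comment) and the files it touches."""
    name: str
    files: tuple


@dataclass
class Repo:
    """A toy repository (hypothetical): just an ordered list of patches."""
    patches: list = field(default_factory=list)

    def record(self, name, files):
        # Purely local: nothing here needs a network connection.
        patch = Patch(name, tuple(files))
        self.patches.append(patch)
        return patch

    def push(self, remote):
        # Send every patch the remote does not have yet, in recorded order.
        missing = [p for p in self.patches if p not in remote.patches]
        remote.patches.extend(missing)
        return missing


central = Repo()
laptop = Repo()

# Offline: three small, separate records.
laptop.record("fix typo in README", ["README"])
laptop.record("fix repaint bug", ["canvas.cpp"])
laptop.record("add zoom feature", ["canvas.cpp", "zoom.cpp"])
offline_central_count = len(central.patches)  # central has seen nothing yet

# Back online: one push delivers all of them.
pushed = laptop.push(central)
```

Pushing a second time sends nothing, since the central repository already holds every patch.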

Oh, and how often do you create a branch? Branches are one of the reasons KDE is switching source management software. In cvs they suck; in svn they suck just as much, but are at least much more visible (a mixed blessing). After doing branches in darcs you'll agree that this is the real way to do it: just do a new checkout and start committing there, and you have yourself a branch. Since each checkout is a repository in itself (complete with history), the patch flow is limited only by your imagination. For example, I have a project (pvr) that runs on a dedicated machine with a tv-card, which is located in my living room. I develop on my workstation, and whenever a patch is finished I push it to my pvr over the network. After testing it on the real hardware I either amend the patch (re-record, if you will) or I push it directly to my server, which contains the main repository. Any patch not yet pushed to the server still needs to be tested.

At the end of the day, KDE is the largest svn repo out there and darcs is just a new kid on the block, which probably makes it a bad choice for us. Not to mention that there are known scalability issues in darcs that still need work. So darcs is getting there, but not just yet.

What does confuse me is why people say svn is changeset based. There is indeed a global revision number: a commit in kdebase will make koffice get a new revision number as well. But that's it. None of the advantages of being changeset based are present! First, scalability: if there have been 10 commits (changesets) that I still need to download to my laptop, the only thing darcs does is check the list of patches and download those 10 patchsets. svn checks the version of each and every file in the project. My darcs changes list is a project-wide list, which makes an update take the same amount of time whether I am managing 1 file or 1 million files. Second, what is a changeset? When a number of files are changed and committed in one commit, those files together make a changeset. They are combined because downloading only a part of that changeset would be useless. Think of moving a method from one file to another: you need to update both files at the same time or your compile will break. For this reason all updates should be project wide; you should never be allowed to update just one directory, because that can break your compile.
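The scalability argument can be sketched in a few lines of Python. This models the claim as I state it, not the measured behavior or the real implementation of either tool. The function names, patch names and file paths are invented for the example. The point is only that comparing two project-wide patch lists costs work proportional to the number of patches, while a per-file comparison costs work proportional to the number of files.

```python
def patch_list_update(local_patches, remote_patches):
    """Changeset style: compare the project-wide patch lists only."""
    have = set(local_patches)
    return [p for p in remote_patches if p not in have]


def per_file_update(local_versions, remote_versions):
    """Per-file style: compare the version of every file in the tree."""
    checks = 0
    stale = []
    for path, remote_version in remote_versions.items():
        checks += 1
        if local_versions.get(path) != remote_version:
            stale.append(path)
    return stale, checks


# Hypothetical project: 510 patches on the server, 500 already on the laptop.
remote_patches = [f"patch-{i}" for i in range(510)]
local_patches = remote_patches[:500]
missing = patch_list_update(local_patches, remote_patches)

# The same project seen per file: 100,000 files, one of which changed.
remote_versions = {f"src/file{i}.cpp": 1 for i in range(100_000)}
local_versions = dict(remote_versions)
remote_versions["src/file7.cpp"] = 2
stale, checks = per_file_update(local_versions, remote_versions)
```

In this toy setup the patch-list comparison looks at 510 entries regardless of tree size, while the per-file walk performs 100,000 checks to find a single stale file.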

So svn surely is not changeset based (it just happens to manage changesets). Then why exactly is it an advantage to have project-wide revision numbers in svn?

Update: Asking this question on #svn, I got the answer that svn is indeed not changeset based; it just has a backend that kind of looks like it is. The global numbering is just one way to allow (directory) renames; there are not really any advantages besides metadata management.