Hey, remember Windows 2000?
"I am sorry user, I can't let you run that."
The Apache Software Foundation has decreed this week to be the 20th anniversary of the source code management system, Subversion. So, happy birthday SVN! The Subversion project was kicked off by software outfit CollabNet in 2000. The plan was to create an open source version-control system that worked a bit like the Concurrent …
worse - remember SOURCE SAFE and storing its repo on Win2k?
"I'm sorry developer. Your source history and files are all in the bit bucket now"
Source UNsafe was my reason to use either "nothing" or CVS, and then SVN. Git came along after I'd already put everything in SVN. A customer had Perforce set up in 2007-ish, though. They liked the fact that P4 also had Windows servers [which I strongly discouraged]. And then the embedded Linux code for something with a 2.6 kernel had a few files whose names differed only in case, and the Windows-based repos borked it up, requiring "managers" to re-think what I'd said. We moved it to Linux-based of course, but IN A VM at THEIR INSISTENCE, still running on a WINDOWS SERVER [because a manager *FELT* it was "more reliable" that way, go fig...].
My preference, of course, would've been to have done it on Linux with svn (or even CVS) in the FIRST place. [This was about the time I'd set up my own repo in svn, and also when the FreeBSD project migrated to it, as I recall.]
But I'm glad I stuck with svn. It's easier, in my opinion, to have an svn client/server thing running than to host your own gitlab or similar. Then I do nightly backups of the "svnadmin dump" and I'm good.
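A nightly "svnadmin dump" backup like the one described above only takes a few lines of shell. This is a sketch, not anyone's actual script - the repository path, backup directory, and repo name are all assumptions:

```shell
#!/bin/sh
# Nightly SVN backup sketch -- paths and names are hypothetical.
REPO=/var/svn/myrepo
BACKUP_DIR=/backup/svn
STAMP=$(date +%Y-%m-%d)

# Dump the whole repository and compress it; one dated file per night.
svnadmin dump --quiet "$REPO" | gzip > "$BACKUP_DIR/myrepo-$STAMP.svndump.gz"

# Keep only the 30 most recent nightly dumps.
ls -1t "$BACKUP_DIR"/myrepo-*.svndump.gz | tail -n +31 | xargs rm -f --
```

Restoring is the mirror image: `gunzip -c myrepo-DATE.svndump.gz | svnadmin load /path/to/newrepo`.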
another good thing - it doesn't have a BOATLOAD OF FEATURE CREEP in it. Repos made 10 years ago probably work as-is without re-loading them. At least, that's how it seems to ME...
After using Perforce at work, I set up one of my Raspberry Pis as a P4D server for various other Linux and Windows boxes. It worked very well until 2016, when Perforce decided not only to drop support for ARM, but also to erase it from history by removing all traces of previous versions from their web and FTP sites. I accused them of being right gits, and switched to git right away.
It has a better backend, but it is *less* versatile than CVS. Like Git later, it was designed with a single workflow in mind - but not all developers can use that workflow, because not all projects can be managed the same way.
That's often a huge issue with open source projects: they are designed with too strict a vision - after all, you're not paying for them, so why should they care about your needs?
At least SVN under Windows has a compact native installation. Git under Windows is a mess of different pieces of software badly glued together.
git was designed for developing the Linux kernel. That's all it was for. Calling it a 'problem' is a bit odd when that was the whole idea; if you want a tool for a specific job, why would you try and write it to solve everyone else's problems?
It happened to still be better for a lot of other projects' needs than anything else around at the time, and so it got adopted by a lot of projects, and edged over a threshold for a network effect where it's now kind of The RCS For Everything and everyone uses it even if they say they hate it. But it was never designed for that at all, it's just an accident.
I don't really get this "similar to CVS" claim. I have used both, and find SVN works totally differently. For example, you don't have branches, but separate directories that act as the branches. OK, it works, but it is nothing like how CVS does things.
Version numbers in SVN are also entirely different: not 1.2, 1.3 ... but a single number without any structure that simply counts commits since the beginning of the repository.
In all, CVS and SVN have about as much in common as CVS and Git.
Subversion is much more like CVS than it is like git. And, yes, I've used all three extensively. I've also used SCCS, RCS, PVCS, CVS NT,1 CMVC, AccuRev, and other change-management systems on a variety of OSes.
CVS itself started as a wrapper around RCS, and consequently had a very RCS-like architecture. Originally it simply replaced RCS's check-out / edit / check-in cycle, which locked files to prevent simultaneous updates, with merging upon check-in. It maintained other essential RCS elements such as line-level change granularity, file-level history granularity, storing a full head and reverse deltas, and using a flat file for each versioned file's entire history. CVS moved the history files out of the working tree into a repository; that was its second big departure from RCS.
Eventually CVS variants began supporting remote repositories. Of course that was also possible with original CVS using a distributed filesystem.
Subversion extended versioning to include filesystem trees as well as individual files, and replaced the internal branching mechanism with a virtual-filesystem-based CoW one. That's really not a radical departure from CVS, since it mostly replaces numbered branches with string-labeled ones, and CVS could already do that with "sticky tags", which is the technique recommended by the CVS FAQ. Subversion also switched to versioning the entire repository rather than individual files, which was a huge improvement on CVS (which originally had to version individual files because it used RCS as the underlying engine); but again compared to many other change-management systems that's not a huge difference.
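For anyone who never used it, the sticky-tag branching mentioned above looked roughly like this in CVS (the module and branch names are invented; note that CVS branch tags cannot contain dots):

```shell
# Branching via CVS sticky tags -- branch name is hypothetical.
cvs tag -b release-1_0          # create a branch tag from the working copy
cvs update -r release-1_0       # the tag is now "sticky": commits go to the branch
cvs update -A                   # clear sticky tags and return to the trunk
cvs update -j release-1_0       # merge the branch's changes into the trunk copy
```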
Since it was first created Subversion has evolved further away from CVS, just as other descendants of CVS have. But many aspects of the model are the same: the modify-merge-commit cycle, file-level change management and line-level diff'ing, checking parts of the repository out into sandboxes, branching and merging, and most of all the use of a central repository.
git is very different. It's a distributed VCS, so there is no central repository, except by convention. Even if some project uses GitHub or another git "server", there's nothing to stop contributors from cloning the repo and then never using the server again, but simply pushing and pulling changes along the edges of some other graph. git, like some other DVCSes, does not mandate a central node or any other hierarchy among project participants. The "repository" is the set of all copies of the project.
git does not track changes in files; it tracks changes in the entire project. The packfiles mechanism makes the heuristic assumption that files with the same name likely have similar content over time, but doesn't depend on it. git also handles metadata changes in the source tree, such as file renames, heuristically, by attempting to associate files in one revision with files in the preceding one using names, creation and removal, and content as hints. That's a huge difference from either CVS (which is fundamentally file-oriented) or Subversion (which is fundamentally filesystem-oriented).
git's push-pull mechanisms and the workflow they endorse are very different from the CVS or Subversion workflows.
There are other differences; for example, git implements multiple merge strategies.
I'm not a fan of git - it solves problems I don't have, and the command-line tooling remains an ugly mess. (They're still better than any GUI alternative, of course.) And I have no patience for people who complain about the difficulty or expense of Subversion branching or merging, neither of which is difficult or expensive (particularly since Subversion added merge tracking). But I recognize that git does address use cases which centralized VCS systems do not deal with well, and that it (like other DVCSes, but generally to a greater extent) is fundamentally different from the SCCS family, which includes CVS and Subversion.
1CVS NT started as a port of CVS to Windows, but evolved into a distinct system.
Yes, but the counter-argument that Linus Torvalds advanced was: if you're trying to fix CVS, you're doing it wrong.
One thing I've noticed is that I have a Git repository which is 8 years old and has had tens of thousands of commits by various people, and it still takes up less space on disk than an SVN checkout with its pristine folder - and that's the entire history from the first commit. And it works without network access except for push/pull, compared to SVN, where more or less every operation except a straight diff against the pristine copy needs network access.
The one place I'd favour Subversion over Git or another DVCS is for binary document storage. It's way easier for managers to work with SVN than it is for Git and frankly there is little need for branching and merging strategies on documents.
Linus justified the point if you care to look for his comments. But it's completely borne out by what has happened since.
The main advantage of a DVCS is it allows people to create branches that nobody else sees. It allows them to work offline, and do complicated diffs or merges without choking the server. It allows the central server (if there is one) to be free of useless tags and branches. Devs don't even need to communicate with the server to do any of this so it is MUCH faster. It also allows esoteric and non centralised models (e.g. we have one repo where we pull a branch from one remote source, merge it to another branch and push to another remote branch). The point is that a DVCS is flexible, robust, works in isolation and is FAST.
From personal experience I know this. We used to have some 30 CVS repositories containing about 50,000 files of code. About 6 years ago I got so fed up of tagging / merging / synchronizing taking an hour or more every damned day, that I volunteered to upgrade it all to either Subversion or Git and evaluated both. Subversion works like CVS so migration would have been easy. But the long term advantages of Git (outlined above) outweighed all that so I wrote up a bunch of workflows, dealt with the culture shock and moved everyone to that. It was the best decision I've ever made.
for me, merges work best if I use a tool such as 'meld' on the individual files, and try to apply it intelligently. but yeah merge tools are always where the best enhancements for usability can be made.
RapidSVN has/had the ability to invoke a merge tool (like meld). But that application seems to have lost support somewhere, and won't compile with the latest svn libs last I tried it.
Branching and merging worked from the start. Tagging was even replaced by branching (a tag is just a cheap copy).
SVN doesn't track things at as fine a granularity internally, though, which hurts merge performance in more extreme cases. But that is something different. When both sides have changes on the same lines, it sometimes borks.
(but a good third-party three-way diff tool such as Beyond Compare can resolve many of those)
Branching has not been problematic in Subversion since at least 2007 (I don't have any notes older than that handy).
Merging has seen a series of improvements, though frankly I never thought it was particularly difficult if approached with some discipline and sense. Merge tracking (Subversion 1.5, from 2008) greatly simplified merges, though the series of mergeinfo fixes in the 1.6 and 1.7 point releases corrected issues for a number of use cases.
There were important improvements to versioning file additions, deletions, and renames in 1.5 (2008) and again in 1.8 (2013). Merges can still result in tree conflicts, but again they really shouldn't be difficult for an experienced user to resolve.
I do a lot of branching and merging with Subversion, including merging branches into one another and merging branches back to a trunk that has seen significant change since the branch point, and they rarely require any significant effort.
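With merge tracking, the branch-to-trunk round trip described above is only a handful of commands. This is a hedged sketch - the server URL and branch name are hypothetical:

```shell
# Merge a feature branch back to trunk with merge tracking -- paths hypothetical.
svn checkout https://svn.example.com/repo/trunk wc
cd wc
svn merge ^/branches/feature-x       # mergeinfo records which revisions came over
svn commit -m "Merge feature-x back to trunk"
svn propget svn:mergeinfo .          # inspect the recorded merge history
```

The `^/` prefix is repository-root-relative URL syntax; the recorded svn:mergeinfo is what stops the same revisions being merged twice.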
We still use svn at work for some significant projects: they've been around long enough that it was the logical (only?) choice, it still works, the project still works, so no real need to change.
We also use git for other, newer, projects. Speaking personally, I don't like git as much, although that could partly be because I'm still less familiar with it. The three-stage working copy, local repo and then remote repo setup can be a bit confusing when working on group projects, and I never feel entirely comfortable that I'm not going to inadvertently stomp over things (it seems to me to be a lot easier in svn to check what's changed remotely versus locally before trying to make updates either way...?). I guess I'll get the hang of git properly eventually.
SVN, at least in some implementations, had decent ACLs on the repository - something CVS had too.
Git, on the other hand, being totally designed for projects where everybody can see everything, is a pain in the ass in projects where not all code should be seen by everybody.
Also, it is no surprise that most users of Git don't manage it themselves, and rely on frontends built by someone else (in exchange for your code being available on their servers...), because tools without good management facilities are really nasty software - but that's what you get today in the lame world of open source. When you have to manage many different projects, it quickly becomes a mess.
Git, on the other hand, being totally designed for projects where everybody can see everything, is a pain in the ass in projects where not all code should be seen by everybody.
It is an allied annoyance that you cannot check out a small section of a repo in git; you always have to download the whole, multi-gigabyte in many cases, lot.
We have 'switched' from svn to git (Bitbucket) for new projects - but are still following a single central repository model.
At least we no longer have the manager who did not know that the SVN Book and other style guides existed and made up his own standards for usage [like renaming tags to track deployment]. I guess it was some sort of job-security ploy, but it made it difficult to know just what version of software was actually deployed [nothing 'at a glance'; you needed to look through email chains and QA test logs].
He also imposed a unique way of using Ant
The staging / push concept in Git does look a little strange but it serves many purposes:
1. It's one final ass-saving chance to avoid pushing something which is broken, or to modify your commit before you send it, e.g. if you missed a file or have another change to make.
2. You can make your own local branches without polluting some central repository.
3. You can squash all the commits from one local branch to another or do any other surgery you like to your repo, then apply the work when you're ready.
4. You don't need network access except for push and fetch. i.e. you could be committing, diffing, merging or whatever to your own local repo.
5. Your repos don't need to follow some conventional centralised model, e.g. we had a repo with two remote sources - one an open source repository and another which was our own, and we could pull from one and merge to the other.
So it certainly looks a little clunky but it has a lot of advantages.
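Points 2, 3, and 5 above can be sketched with a few git commands. The branch, remote names, and URLs are all assumptions, not anything from a real project:

```shell
# Local branch surgery and a two-remote flow -- names/URLs are hypothetical.
git checkout -b experiment            # a local branch nobody else ever sees
git commit -am "wip"                  # commit offline, as often as you like
git rebase -i main                    # squash/reword the mess before sharing

git remote add upstream https://example.com/open-source.git
git remote add internal https://example.com/our-fork.git
git fetch upstream
git merge upstream/main               # pull from one remote...
git push internal experiment          # ...and publish to another
```

Nothing here requires a central server: the two remotes are peers, and the rebase and merge happen entirely in the local repo.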
The main place git doesn't work is document management. Managers prefer to lock files and commit rather than dealing with branches and merging, which make no sense for binary files anyway. Git also kind of sucks for binary files in general, although things like large-file support somewhat mitigate this issue.
Managers prefer to lock files and commit rather than dealing with branches and merging, which make no sense for binary files anyway
Branching is perfectly reasonable for "binary" (non-text) files. Branches are different incarnations of the project; there's no reason why they couldn't contain different versions of non-text files.
Sensible diff functions can be defined for many types of non-text files, such as the popular compound-file-most-of-which-is-XML-anyway file types (e.g. ODF); or even compound files composed of elements which don't have a sensible diff, such as JARs.1 For many other document file formats, sensible diff functions can be created by using a smaller grain than lines (Postscript, RTF), or by a diff that understands the format (PDF, pre-OOXML Microsoft Office documents). And where there's a sensible diff, there can often be a sensible merge.
1Frankly, if you want to get creative, you can design sensible diff functions even for those Java bytecode files and many other non-text executable formats, by finding common code blocks. And for any byte-stream file you can always do byte-sequence alignment with MED, though interpreting the diffs might be difficult. (That's essentially what VCDIFF, from RFC 3284, does.)
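To make the compound-file case concrete: an ODF document is just a zip archive whose text lives in content.xml, so a usable diff falls out of standard tools. A sketch, assuming `unzip` and `xmllint` are installed and using made-up filenames:

```shell
# Diff two ODF documents by comparing their XML payloads -- filenames hypothetical.
unzip -p report-v1.odt content.xml | xmllint --format - > v1.xml
unzip -p report-v2.odt content.xml | xmllint --format - > v2.xml
diff -u v1.xml v2.xml
```

The `xmllint --format` step re-indents the XML so the line-oriented `diff` sees one element per line rather than one enormous line per document.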
It's not reasonable for multiple reasons.
1. Managers are not developers, so complicating their lives for no reason is a recipe for disaster. Don't make me laugh with the idea of making them comprehend branching, merging, rebasing, staging, push, pull etc.
2. Binary merging SUCKS and always will. How do you merge a spreadsheet?
3. Git bloats with every binary committed to it, and every clone gets that bloat. At least a centralised checkout hides it from users. Git LFS is not an option, see 1).
4. Managers often want to checkout just one folder, not the entire repo.
Git is by far and away better for developers. For managers, you're needlessly complicating things by using it.
Shortly after I started a new job back in 2005 they switched over to using SVN. (I presume they were using CVS before that, but I really can't remember.) About two months later they switched to Perforce. A product I still use at home to this day. I'm forced to use Git (and SmartGit because Git on its own is totally useless) at work these days, and it still doesn't come close to Perforce for performance and ease of use. As for SVN, I've not given it another thought until today.
Without Karl I would be crippled. (But not because of Subversion: because he's one of the very few users of the Maltron keyboard, and back when my RSI started to bite I asked him if it was any good. He said it was. He was quite thoroughly correct.)
One thing Karl has is good taste. The interior of Subversion shows that: it's lovely, enormously extensible, and far more cleanly architected than the interior of Git. However... most of that complexity, in hindsight, is epicyclic: you don't need it if you start from the right place, and in hindsight, Git started from the right place, and Subversion didn't. Of course, Subversion *couldn't* start from the right place, given the design goals, and also one can hardly fault Karl or anyone for not having the insight that led to Git in the first place. You cannot force insights.
Pah. CVS was perfectly usable. So was RCS. Even SCCS wasn't bad. Frankly, I'm a bit dubious about any developer who had problems with CVS.
Of the various VCSes I've used, PVCS was something of a nightmare, but even it could be beaten into submission. This was the old Polytron-Intersolv-MF-Merant PVCS for Windows and UNIX; my understanding is that the current Dimensions CM product, from Serena Software and now from Micro Focus again, is a completely different beast, despite the occasional convergence of product names.
(I haven't used Dimensions, and I have only a glancing acquaintance with our other commercial VCS products, StarTeam and AccuRev. I've heard that Dimensions is particularly good for things like code review, and AccuRev has an interesting take on branching and merging, and I know StarTeam has a relational back end. But I don't have any real experience with them.)
Try doing a merge across 50,000 files in CVS as I frequently had to do in the past. The basic procedure was:
1. Tag every single file in the head prior to your merge. Takes maybe 30-40 minutes because every action is individual to the file.
2. Do the merge from your branch onto the mainline which involves diffing every single tagged file against the file's branch. You're honestly best off to use a visual tool like Eclipse Team Sync so you do not go insane. Takes another 30-40 minutes even without many conflicts to resolve.
3. Commit all the changes.
4. Tag every single file in the head post the merge. Takes another 30-40 minutes just so you have the ability to do a diff of before and after for whatever reason, such as reverting the change.
Total time for one merge could be 2 or 3 hours depending on complications such as somebody else checking in at the same time. Normally I'd email people before and after and hope they stayed away.
And of course creating the branch in the first place was expensive so people had a tendency to avoid doing it at all.
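For the record, the four steps above correspond roughly to this CVS incantation (the tag, branch, and module names are invented). Each command walks every file individually, which is where the 30-40 minutes per step went:

```shell
# The CVS branch-merge dance -- tag/branch/module names are hypothetical.
cvs rtag pre-merge-2005-06-01 mymodule      # 1. tag the head before the merge
cvs update -j mybranch                      # 2. merge the branch into the checkout
cvs commit -m "Merge mybranch to head"      # 3. commit the result
cvs rtag post-merge-2005-06-01 mymodule     # 4. tag the head after the merge
```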
CVS may have been better than some other source control systems doing the rounds in its day (and it was free) but it still sucked. Subversion is hands down better than CVS but it's still way slower and network / server intensive at branching and merging than a DVCS like Git or Mercurial.
My first foray into SCM in the mid-to-late 90s was when we started to use PVCS. Cue lots of devs checking out code and then going on holiday, leaving no one else able to do any work on said code. Subversion was my go-to in the late noughties. I persuaded several clients to migrate from Visual SourceSafe to SVN, with Mantis at the front end for feature/bug tracking, for a reasonably integrated solution, and it worked very well. Git was a game-changer for me though (once I'd got my head round the whole concept) and I don't think I've touched SVN since.
I still have nightmares about PVCS.
- UI doesn't tell you about newly created files. Broken builds were usually due to someone forgetting to add a file to the repository.
- BOFH added a script to the "update" action. Getting other people's work caused 100s of CMD windows to pop up and close immediately.
- Server remembers the file location. If one user decides to move their workspace to their D: drive, it moves for all other developers as well.
Oh Lord, yes. We used it for years, and then we bought the damn thing.1
Under the covers PVCS was basically a distributed VCS with an engine either based on or inspired by RCS; if you went to the servers and hacked the repository files, they were very similar to RCS files. Then it had some additional functionality awkwardly tacked on, and that unfortunate GUI.
Cue lots of devs checking out code and then going on holiday
Yep. The lock-for-revision, check-out/edit/check-in workflow was inherited from RCS, and had the same problem that RCS had: sometimes you needed to break a lock.
1Well, technically Micro Focus and Intersolv merged, and then Intersolv took over, in a manner reminiscent of Invasion of the Body Snatchers or perhaps The Thing.
SVN has been a very good repository tool for us. Most of the people I administer for have at least the concept of folders, and I can assign R/W access per user to keep them from stepping on each other's toes. We do not need real co-development, just occasional posting from different machines. If you look at what most non-tech people are using out there, it is things like Dropbox: two users access a file simultaneously and the last post wins, posts land after a random delay, file locks happen, and files disappear when users run programs. SVN has saved me a lot of trouble by replacing the use of simple file-sharing tools.
The version control is an added benefit that I use as a programmer for hardware and software development. There are a lot of non-text files that must be managed and SVN is very good for saving things.
Finally, the use of HTTP for SVN under Apache2 gives my users a simple way to access the data; most use TortoiseSVN.
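A minimal setup of the kind described, with per-user access control, looks something like this. It is only a sketch of an Apache config fragment - the paths, realm name, and file names are assumptions, and mod_dav, mod_dav_svn, and mod_authz_svn must be loaded:

```apache
# Minimal Subversion-over-HTTP fragment for Apache 2 -- paths are hypothetical.
<Location /svn>
    DAV svn
    SVNParentPath /var/svn                      # each subdirectory is a repository
    AuthType Basic
    AuthName "Subversion Repositories"
    AuthUserFile /etc/apache2/svn.htpasswd
    AuthzSVNAccessFile /etc/apache2/svn.authz   # per-path R/W rules per user
    Require valid-user
</Location>
```

The AuthzSVNAccessFile is what provides the per-user, per-folder R/W assignments mentioned above; TortoiseSVN clients then just point at https://server/svn/reponame.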