Conversion complete

Well, the full conversion for the other team ran last night.

Some stats:

  • The current four Visual Source Safe repositories had a combined size of 11.2 GB
  • The wanted code tree was 490 MB  (mix of text and binary files)
  • The subversion history dump of the above code was 4.1 GB
  • The subversion repository size is 620 MB
  • The copy/dump/load took 12 hours to run

To merge the repositories together, I had to run a few svn commands between svnadmin loads to create/move/delete so the sub-trees were all happy.
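As a rough sketch of that sequence, the snippet below only builds the command lines, one `svn mkdir` plus one `svnadmin load --parent-dir` per dump file. The repository path, URL, dump file names and sub-tree names are all made up for illustration, and it assumes single-level target folders; the moves and deletes I actually needed were specific to our trees and are not shown.

```python
from typing import List, Tuple

# Each entry: (argv to run, dump file to feed on stdin or None)
Command = Tuple[List[str], object]


def build_merge_commands(repo_path: str, repo_url: str,
                         dumps: List[Tuple[str, str]]) -> List[Command]:
    """Plan the commands that load each dump under its own sub-tree.

    dumps is a list of (dump_file, parent_dir) pairs (hypothetical names).
    """
    commands: List[Command] = []
    for dump_file, parent_dir in dumps:
        # The target folder must exist before svnadmin can load into it.
        commands.append((
            ["svn", "mkdir", "-m", f"Create {parent_dir} for import",
             f"{repo_url}/{parent_dir}"],
            None,
        ))
        # --parent-dir re-roots every path in the dump under parent_dir;
        # svnadmin load reads the dump stream from standard input.
        commands.append((
            ["svnadmin", "load", "--parent-dir", parent_dir, repo_path],
            dump_file,
        ))
    return commands
```

Each planned command could then be run with `subprocess.run(argv, stdin=open(dump_file, "rb"), check=True)` for the load steps, and without `stdin` for the `svn mkdir` steps.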

One oddity I noticed was that some files were different between the old and new repositories. This was due to a file being altered by a developer in the US in his time zone, and then, within the time-zone difference, a developer in NZ changing the file as well. So even though the US change was made first, the dump program sees the NZ one as having the earliest time, and thus swapped the order of these edits. (VSS records local time, and it is the local client that alters the repository, so people in different time zones really should not work on the same repository.)
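The effect is easy to reproduce in a few lines of Python. This is only a sketch: the two offsets and the edit times are made up, and it assumes the stored stamp is the client's wall-clock time with no offset attached. With these particular numbers it is the site whose clock is ahead that edits first; the point is just that sorting on local stamps can reverse the real order.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration only (real offsets shift with daylight saving).
AHEAD = timezone(timedelta(hours=13))
BEHIND = timezone(timedelta(hours=-8))

# Two edits to the same file, listed in the order they really happened.
first = datetime(2007, 3, 1, 9, 0, tzinfo=AHEAD)      # 20:00 UTC on 28 Feb
second = datetime(2007, 2, 28, 14, 0, tzinfo=BEHIND)  # 22:00 UTC on 28 Feb

edits = [("edit made first", first), ("edit made second", second)]


def order_by_local_stamp(edits):
    # Dropping the offset mimics a store that keeps only wall-clock time.
    ordered = sorted(edits, key=lambda e: e[1].replace(tzinfo=None))
    return [name for name, _ in ordered]


def order_by_utc(edits):
    # Normalising to UTC before sorting recovers the real order.
    ordered = sorted(edits, key=lambda e: e[1].astimezone(timezone.utc))
    return [name for name, _ in ordered]


print(order_by_local_stamp(edits))  # the second edit sorts first
print(order_by_utc(edits))          # the real order
```

Subversion records each revision's date in UTC, which is why the same mix-up does not arise there.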

But this will not happen now, because subversion itself does not have this problem.

vss_to_svn: Dealing with spaces in VSS paths

I’ve been helping another team move from Visual Source Safe to Subversion. Today I got the list of sub-folders needed/not needed in the new repository. Something that took me a while to solve was how to include or exclude paths containing spaces, as Python would see each space-separated piece as a new argument. The paths are supplied via an argument file.

So, to include the path $/Path/Project Version ExtraWords the following will not work:

-i $/Path/Project Version ExtraWords

So, after a long pause, I remembered the URL %20 trick:

-i $/Path/Project%20Version%20ExtraWords

and it worked. I hope anybody using the scripts, or Python arguments in general, finds this useful.
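For anyone wondering why the plain version fails, here is a minimal sketch. It assumes the argument file is split on whitespace and that each token is then URL-decoded; I have not checked that this is exactly what the conversion script does internally, and `parse_argument_line` is a made-up helper name.

```python
from urllib.parse import quote, unquote


def parse_argument_line(line):
    """Split a line on whitespace, then decode %20-style escapes per token."""
    return [unquote(token) for token in line.split()]


vss_path = "$/Path/Project Version ExtraWords"

# Unescaped: the path falls apart into three separate tokens.
raw_line = "-i " + vss_path
print(parse_argument_line(raw_line))

# Escaped: spaces become %20, so the path survives the split as one token.
escaped_line = "-i " + quote(vss_path, safe="$/")
print(escaped_line)
print(parse_argument_line(escaped_line))
```

`quote(..., safe="$/")` leaves the `$` and `/` characters alone, so only the spaces are rewritten.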

Moving from Visual Source Safe to Subversion

Last year we moved our current development repository from Visual Source Safe to Subversion. At a previous .Net Architecture Chat we talked about this process.

So here is the python script and DOS .bat file used to do our conversion.

This conversion only extracts the history for the current live tree. This is useful when you want to change tools going forward, but don’t need the complexity (and problems) of trying to move your complete history. You can add sub-path exclusions, and it may take a few iterations to get the result you want, i.e. identify the dead code paths and do not include them in your new code tree.

I have been using these scripts for the last few days to convert another team’s repository. Their VSS repository was large (>1 MLOC) and the Python run-time was crashing, so I had to reduce the size of the repository being moved in one go. I decided to separate out the Libraries sub-directory, as this was large and its changes do not really need to be in the same atomic check-ins as the main code base. Running the script over the weekend resulted in success (albeit a 10 ½ hour success).

You can merge many dump files into the same repository using svnadmin, so this can consolidate your repositories if they are fragmented like ours were.

Moving from VSS to SVN

Now that the development phase of our release cycle has ended, and the last-minute tweaks have also stopped, I have spent the last two weeks moving our development source control from Visual SourceunSafe to Subversion.

The first task was to dump our two main repositories into SVN dump files. Mine was not the first team in our group to make this move (which helped in getting approval from my technical lead), so I was handed the other team’s Python conversion script. This was based on the state of the VSS2SVN project a few years ago, but heavily modified to do just the required job. I had some problems running the script on our repositories, so I went to see how the current project had improved.

The current VSS2SVN project has changed from Python to Perl, and has stopped using the VSS COM API in favour of its own archive-reading tool, because that tool can read and recover things the COM API could not handle. Spooky. It’s reading things like that that gave me the energy to keep plodding when things got tough. Our smaller repository converted with only ~20 problems: files reported as missing parents even though all versions could be seen and retrieved via the VSS client. The larger repository (shared with another team) was just unhappy.

After reading ~4 months’ worth of the mailing list, I knew what was not working as I wanted, and why. The long and short of it is that the VSS2SVN project is trying to port the complete history of a project to SVN, which I can see as a good thing. But it is not what we need. I trust VSS for what it currently holds; I just don’t trust it going forward. I want the current live tree, and all the history (changes and notes) for that current tree. I don’t want the file moves or the like.

So, now knowing what I really wanted, I went digging and found a Perl script doing just this. I had trouble getting it working (i.e. it didn’t work in the first few minutes), so I decided to try the Python script again. The Perl was laid out really cleanly, but I have always had this mental picture of Perl being spaghetti.

So I picked up the Python code I was handed at the beginning of that week, and now, with all my insight and understanding, I got it working in ~3 minutes. The first repository dumped as sweet as you could dream. The second had issues, as some really old files didn’t have their first n versions available: you could see them listed, just not retrieve them from the VSS archive. So I altered the code to return “This version could not be found” for those versions, and then it also dumped.
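The shape of that change is roughly as follows. This is a sketch rather than the code from the script: `read_version`, `fake_fetch` and the use of `LookupError` are all stand-ins, since the real retrieval went through the VSS access layer, which is not shown here.

```python
PLACEHOLDER = b"This version could not be found"


def read_version(fetch, path, version):
    """Return one version's content, or a placeholder if it is unreadable."""
    try:
        return fetch(path, version)
    except LookupError:
        # The version is listed in the history but cannot be retrieved,
        # so substitute a marker and let the dump carry on.
        return PLACEHOLDER


def fake_fetch(path, version):
    # Stand-in for the real retrieval call: versions 1 and 2 are unreadable.
    if version < 3:
        raise LookupError(f"{path} version {version} not in archive")
    return f"content of {path} v{version}".encode()


history = [read_version(fake_fetch, "$/Path/old.cpp", v) for v in range(1, 5)]
```

The history for such a file then starts with placeholder revisions, followed by the real content from the first retrievable version onwards.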

I’ve spent this week sorting out the shared files (automatically within the same repository, and manually between repositories). I’m getting to a happier place, and will soon be ready to unleash it on the rest of the team. They’re looking forward to using the new tools.

We are upgrading from VS 6.0 to VS 2005 at the same time. Ah, things are starting to look better all the time. I will also no longer have to maintain two copies of all my work. Good tools are so nice. Now to move from char* to std::string; that may be a longer-term project…