Fixing Automated Import of icu4c from Subversion into Git

Several years ago we moved to a completely Git-based workflow. When we had to make some changes to icu4c, a project that uses Subversion as its VCS, we wanted to integrate that work with our regular workflow, so we used git-svn to import the changes from Subversion into Git. We also set up an automated task that regularly pulls in new changes.

This worked great until last November, when the ICU project reorganized its source tree: the icu4c sources are now under trunk/icu4c instead of icu/trunk. So I worked on fixing our automated import process in a way that keeps the existing history intact. One problem I ran into was a bug in libsvn-perl/libsvn1 that caused git svn fetch to die with signal 11. I eventually had to run the import on Ubuntu Zesty, which ships version 1.9.5 of libsvn-perl/libsvn1, and that version contains a fix.

Here are the steps that I used to fix the automated import and have the new commits appear on top of the previously imported ones. While researching how to do that I came across a GitHub repo where an ICU team member had experimented with importing the source tree into Git. This experimental repo had an authors.txt file mapping the Subversion users to email addresses and full names. I used that file for the import.
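
For readers who have not used an authors file before: git-svn expects one line per Subversion user in the form "svn_user = Full Name <email>". The following is only a sketch of that format with a small lookup helper. The user names, the example.invalid addresses and the lookup_author function are made up for illustration and do not come from the ICU authors.txt.

```shell
#!/bin/sh
# Sample authors file in the format git-svn expects (entries are invented).
sample=$(mktemp)
cat > "$sample" <<'EOF'
jdoe = Jane Doe <jdoe@example.invalid>
asmith = Alex Smith <asmith@example.invalid>
EOF

# lookup_author FILE USER: print the "Name <email>" part for USER,
# or return a non-zero status when USER has no mapping.
lookup_author() {
    awk -F ' = ' -v user="$2" \
        '$1 == user { print $2; found = 1 } END { exit !found }' "$1"
}

lookup_author "$sample" jdoe
```

A helper like this can be handy for spotting unmapped users before starting a long import.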

Prerequisite for a successful import: libsvn-perl and libsvn1 >= 1.9.5
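
One way to check the prerequisite before starting is sketched below. It assumes a Debian or Ubuntu system (dpkg-query) and GNU sort with the -V option. The version_ge helper is a name introduced here, and stripping the Debian revision suffix is a simplification that ignores version epochs.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=1.9.5
for pkg in libsvn1 libsvn-perl; do
    # dpkg-query only exists on Debian-like systems; stop quietly elsewhere.
    command -v dpkg-query >/dev/null 2>&1 || break
    installed=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) || installed=
    # Drop the Debian revision, e.g. 1.9.5-1ubuntu1 becomes 1.9.5.
    upstream=${installed%%-*}
    if [ -n "$upstream" ] && version_ge "$upstream" "$required"; then
        echo "$pkg $installed: ok"
    else
        echo "$pkg ${installed:-not installed}: need >= $required"
    fi
done
```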

The mentioned files and scripts can be downloaded from a gist.

  • create a new clone of the subversion repo:
    git svn clone --authors-file=authors.txt \
        --trunk=trunk/icu4c \
        --branches=branches/maint/*/icu4c --tags=tags/*/icu4c \
        --prefix=svn/ -r 1 \
        icu4c-svn
  • cd into that directory and add a few settings to the git configuration (the program configured as svn.authorsProg gets called by git-svn for any svn user that doesn’t show up in authors.txt; it simply creates a made-up email address):
    cd icu4c-svn
    git config --add svn.authorsProg $(pwd)/../
    git config --add svn-remote.svn.fetch \
    git config --add svn-remote.svn.fetch \
  • Import from svn. Revisions around 39494 deal with the reorganization, so I decided to start the import there. I also decided to leave out some of the newest changes so that later on I can verify that everything works.
    for i in $(seq 39494 40100); do
        echo "Processing $i"
        git svn fetch -r $i 2>&1
    done | tee /tmp/output
  • add existing repo as remote origin and fetch old history
    git remote add origin
    git fetch origin
  • create git tag for every svn/tags
    for b in $(git branch -r | grep svn/tags); do
        tag=${b#svn/tags/}  # assumption: tag name is the ref name without the svn/tags/ prefix
        echo "Creating tag $tag"
        git tag -f "$tag" "$b"
    done
  • rebase svn branches on corresponding branch in origin and call filter-branch to connect old and new history (rebicu2 script)
  • create local branches for all svn branches (done with a script from the gist). This is necessary so that in the next step all branches are included in the new clone.
  • create a second clone from first one:
    git clone --origin svn icu4c-svn icu4c
  • copy config from first repo
    cd icu4c
    cp ../icu4c-svn/.git/config .git/
  • run git svn fetch, limiting the range to older revisions. This recreates the rev_map file with the new commit SHAs. Recreating the mapping file seems to go faster when the revision range is limited.
    git svn fetch -r 39400:39494
    git svn fetch -r 39400:40200
  • to verify that things work as expected, run
    git svn fetch
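
The svn.authorsProg setting in the second step points git-svn at a program that receives the svn user name as its first argument and is expected to print a single "Name <email>" line. The script actually used for this import is not reproduced here. What follows is a minimal sketch of such a fallback, where the make_author function and the example.invalid domain are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical fallback for svn.authorsProg: builds a made-up identity
# for an svn user that has no entry in authors.txt.
make_author() {
    # Reuse the svn user name as both the display name and the mailbox.
    printf '%s <%s@example.invalid>\n' "$1" "$1"
}

# When git-svn runs this file as a program, $1 holds the svn user name.
if [ "$#" -gt 0 ]; then
    make_author "$1"
fi
```

Saved as an executable file, it could be registered with git config --add svn.authorsProg followed by the absolute path to that file.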

The final result can be seen in

