kayahr

Reputation: 22070

How to modify commit messages when converting from SVN to Git?

I'm currently mirroring an existing SVN repository on GitHub. For the initial clone I did this:

git svn --authors-prog=fixsvnauthors clone http://closure-compiler.googlecode.com/svn/

To update the repo every hour I do this:

git svn --authors-prog=fixsvnauthors rebase

The fixsvnauthors program maps the SVN user names (e-mail addresses) to Git user names. My problem is that this SVN repository seems to have a strange policy for commit messages: most of them start with an empty line. GitHub doesn't handle that well. It shows an empty summary for every such commit, which is pretty annoying.

Since there is a nice way to fix the authors during cloning, maybe there is also a way to fix commit messages? I simply want to trim the commit messages so that GitHub reads them correctly. I couldn't find anything like this in the git-svn documentation, but maybe I missed something. How can I do this?

Upvotes: 3

Views: 1594

Answers (3)

Steve Kero

Reputation: 713

You can also replace empty messages after the import has finished:

git filter-branch --msg-filter '<path>\SetDefaultMessage.pl'

where SetDefaultMessage.pl is

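#!/usr/bin/perl
use strict;
use warnings;

# Read the whole commit message from standard input.
my $data = "";
while (<STDIN>) {
    $data .= $_;
}

# Replace an empty or whitespace-only message with a placeholder.
if ($data =~ /^\s*$/) { $data = "-\n"; }
print $data;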
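
The script above only replaces messages that are entirely empty. For messages that merely start with blank lines, as in the question, a message filter could strip the leading blank lines instead. Here is a minimal sketch as a POSIX shell function; the name normalize_message is made up for illustration, and the "-" fallback simply mirrors the Perl script:

```shell
# Hypothetical helper: reads a commit message on stdin, writes the cleaned
# message to stdout.
normalize_message() {
    # Delete every line before the first one that contains a
    # non-whitespace character. Command substitution also drops
    # trailing newlines, so one is added back by printf below.
    trimmed=$(sed -e '/[^[:space:]]/,$!d')
    if [ -z "$trimmed" ]; then
        # Nothing left: fall back to the same placeholder as the Perl script.
        printf '%s\n' '-'
    else
        printf '%s\n' "$trimmed"
    fi
}
```

To use it with --msg-filter, put the function into a script, call normalize_message on the last line, and pass the script path the same way as SetDefaultMessage.pl.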

Upvotes: 0

kayahr

Reputation: 22070

I created a Git patch that implements a --messages-prog parameter in git-svn, which can be used to specify a program that filters the commit messages while pulling changes from SVN. It works great for me. I sent the patch to the Git mailing list but never got any reaction. Maybe the patch is useful for someone, so I'm posting it here:

From: Klaus Reimer <[email protected]>
Date: Sat, 26 May 2012 17:56:42 +0200
Subject: [PATCH] Implement --messages-prog parameter in git-svn

Some SVN repositories have strange policies for commit messages requiring an
empty line at the top of the commit message.  When you clone these
repositories with Git to mirror them on GitHub then no commit message
summaries are displayed at all at GitHub because they use the first line for
it (Which is empty).  You always have to open the commit message details
instead which is pretty annoying.  With the --messages-prog parameter you
can specify a program which can modify the SVN commit message before
committing it into the Git repo.  This works like the --authors-prog
parameter with the only difference that the commit message is piped into the
specified program instead of being passed to it as a command-line argument.

The same could be achieved by a "trim" feature but specifying a program
which can modify any aspect of a commit message is much more flexible.

Signed-off-by: Klaus Reimer <[email protected]>
---
 Documentation/git-svn.txt        |  5 +++++
 git-svn.perl                     | 26 +++++++++++++++++++++++++-
 t/t9147-git-svn-messages-prog.sh | 40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100755 t/t9147-git-svn-messages-prog.sh

diff --git a/Documentation/git-svn.txt b/Documentation/git-svn.txt
index cfe8d2b..7289246 100644
--- a/Documentation/git-svn.txt
+++ b/Documentation/git-svn.txt
@@ -546,6 +546,11 @@ config key: svn.authorsfile
    expected to return a single line of the form "Name <email>",
    which will be treated as if included in the authors file.

+--messages-prog=<filename>::
+   If this option is specified, each SVN commit message is piped
+   through the given program. The output of this program is then
+   used as the new commit message instead.
+
 -q::
 --quiet::
    Make 'git svn' less verbose. Specify a second time to make it
diff --git a/git-svn.perl b/git-svn.perl
index c84842f..514c888 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -6,7 +6,7 @@ use warnings;
 use strict;
 use vars qw/   $AUTHOR $VERSION
        $sha1 $sha1_short $_revision $_repository
-       $_q $_authors $_authors_prog %users/;
+       $_q $_authors $_authors_prog $_messages_prog %users/;
 $AUTHOR = 'Eric Wong <[email protected]>';
 $VERSION = '@@GIT_VERSION@@';

@@ -120,6 +120,7 @@ my %remote_opts = ( 'username=s' => \$Git::SVN::Prompt::_username,
 my %fc_opts = ( 'follow-parent|follow!' => \$Git::SVN::_follow_parent,
        'authors-file|A=s' => \$_authors,
        'authors-prog=s' => \$_authors_prog,
+       'messages-prog=s' => \$_messages_prog,
        'repack:i' => \$Git::SVN::_repack,
        'noMetadata' => \$Git::SVN::_no_metadata,
        'useSvmProps' => \$Git::SVN::_use_svm_props,
@@ -359,6 +360,9 @@ load_authors() if $_authors;
 if (defined $_authors_prog) {
    $_authors_prog = "'" . File::Spec->rel2abs($_authors_prog) . "'";
 }
+if (defined $_messages_prog) {
+   $_messages_prog = "'" . File::Spec->rel2abs($_messages_prog) . "'";
+}

 unless ($cmd =~ /^(?:clone|init|multi-init|commit-diff)$/) {
    Git::SVN::Migration::migration_check();
@@ -2051,6 +2055,7 @@ use vars qw/$default_repo_id $default_ref_id $_no_metadata $_follow_parent
 use Carp qw/croak/;
 use File::Path qw/mkpath/;
 use File::Copy qw/copy/;
+use IPC::Open2;
 use IPC::Open3;
 use Time::Local;
 use Memoize;  # core since 5.8.0, Jul 2002
@@ -3409,6 +3414,22 @@ sub other_gs {
    $gs
 }

+sub call_messages_prog {
+   my ($orig_message) = @_;
+   my ($pid, $in, $out);
+   
+   $pid = open2($in, $out, $::_messages_prog)  
+       or die "$::_messages_prog failed with exit code $?\n";
+   print $out $orig_message;
+   close($out);
+   my ($message) = "";
+   while (<$in>) {
+       $message .= $_;
+   }
+   close($in);
+   return $message;    
+}
+
 sub call_authors_prog {
    my ($orig_author) = @_;
    $orig_author = command_oneline('rev-parse', '--sq-quote', $orig_author);
@@ -3809,6 +3830,9 @@ sub make_log_entry {

    $log_entry{date} = parse_svn_date($log_entry{date});
    $log_entry{log} .= "\n";
+   if (defined $::_messages_prog) {
+       $log_entry{log} = call_messages_prog($log_entry{log});
+   }
    my $author = $log_entry{author} = check_author($log_entry{author});
    my ($name, $email) = defined $::users{$author} ? @{$::users{$author}}
                               : ($author, undef);
diff --git a/t/t9147-git-svn-messages-prog.sh b/t/t9147-git-svn-messages-prog.sh
new file mode 100755
index 0000000..ebb42b0
--- /dev/null
+++ b/t/t9147-git-svn-messages-prog.sh
@@ -0,0 +1,40 @@
+#!/bin/sh
+
+test_description='git svn messages prog tests'
+
+. ./lib-git-svn.sh
+
+cat > svn-messages-prog <<'EOF'
+#!/bin/sh
+sed s/foo/bar/g
+EOF
+chmod +x svn-messages-prog
+
+test_expect_success 'setup svnrepo' '
+   svn mkdir -m "Unchanged message" "$svnrepo"/a
+   svn mkdir -m "Changed message: foo" "$svnrepo"/b
+   '
+
+test_expect_success 'import messages with prog' '
+   git svn clone --messages-prog=./svn-messages-prog \
+       "$svnrepo" x
+   '
+
+test_expect_success 'imported 2 revisions successfully' '
+   (
+       cd x
+       test "`git rev-list refs/remotes/git-svn | wc -l`" -eq 2
+   )
+   '
+
+test_expect_success 'messages-prog ran correctly' '
+   (
+       cd x
+       git rev-list -1 --pretty=raw refs/remotes/git-svn~1 | \
+         grep "^    Unchanged message" &&
+       git rev-list -1 --pretty=raw refs/remotes/git-svn~0 | \
+         grep "^    Changed message: bar"
+   )
+   '
+
+test_done
-- 
1.7.10.2.605.gbefc5ed.dirty

Upvotes: 1

VonC

Reputation: 1329102

I didn't find a solution that lets you preserve the git-svn link after the first import.

For a one-shot import, I believe it would be best to keep that kind of fix as a separate step, done after the git-svn import.
In theory, git grafts could be used to attach subsequent git-svn imports from the original Git repo to your fixed Git repo.
However, a git svn dcommit (exporting back to the SVN repo) isn't likely to work.

To make sure the commits on all branches are fixed, you could use:

git filter-branch -f --msg-filter 'yourScript.sh' --tag-name-filter cat -- --all

As mentioned in "Rewriting git commit message history across multiple branches":

filter-branch is very powerful and has many options.
In this case, I only wanted to rewrite a commit message, so I used the --msg-filter option: This pipes each message to a shell command and replaces it with the output of that command.
This method, unlike rebasing onto a single edited commit, lets you programmatically edit all commit messages.

If you have any tags, you should add --tag-name-filter cat.
This updates the tags to point to the modified commits. It's more complicated if the tags are signed.
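
For the leading-blank-line problem from the question, the message filter can be an inline sed expression instead of a separate script. This is a sketch, assuming it is run inside the converted repository; the wrapper name rewrite_all_messages is hypothetical, and the sed expression deletes every line before the first one that contains a non-whitespace character:

```shell
# Hypothetical wrapper around the filter-branch command shown above.
rewrite_all_messages() {
    # \$ keeps the dollar sign literal inside the double quotes, so the
    # filter that git evaluates is: sed -e '/[^[:space:]]/,$!d'
    git filter-branch -f \
        --msg-filter "sed -e '/[^[:space:]]/,\$!d'" \
        --tag-name-filter cat -- --all
}
```

Because filter-branch rewrites every commit hash, it is safer to try this on a copy of the repository first.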

Upvotes: 0
