Reputation: 20521
I'm currently releasing several projects as open source. Typically, the complete source is provided as a ZIP archive or checked in to an open source repository. This makes analysis by Ohloh difficult.
In case the software has been developed in a non-public repository, the complete history is available. However, I do not want to have the full history released.
I want to use git to achieve one of the following two possibilities:
(i) One commit per author: There should be one commit per author, with the final release date as the commit date. Each commit contains that author's lines of code that made it into the final version.
(ii) Original commits with only the final code lines: In this variant, the number of commits is preserved. Each commit is modified so that only the lines that made it into the final version are kept; all other lines are deleted.
Has anyone implemented one of the variants yet? Variant (i) seems to be doable using git-blame and some scripting.
Upvotes: 0
Views: 410
Reputation: 20521
git-oss-releaser is a solution for option (i).
git-oss-releaser converts a given git repository into a git repository containing only the files of the last commit, with commits resembling the git blame output for each file.
The original history is lost.
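To illustrate the idea of "commits resembling git blame output", here is a minimal sketch of how the surviving lines of one file could be grouped by author. It is not taken from git-oss-releaser itself; the function names lines_by_author and blame_file are made up for this example, and the only git feature it relies on is `git blame --line-porcelain`.

```python
import subprocess
from collections import defaultdict
from typing import Dict, List, Tuple


def lines_by_author(porcelain: str) -> Dict[str, List[Tuple[int, str]]]:
    """Group final-version lines by author.

    `porcelain` is the text printed by `git blame --line-porcelain <file>`.
    Returns {author name: [(final line number, line text), ...]}.
    """
    result = defaultdict(list)
    author = None
    final_lineno = None
    expect_header = True  # the next non-content line starts a new entry
    for raw in porcelain.split("\n"):
        if not raw:
            continue  # only the trailing newline produces an empty string
        if raw.startswith("\t"):
            # Content lines are prefixed with a single TAB.
            result[author].append((final_lineno, raw[1:]))
            expect_header = True
        elif expect_header:
            # Entry header: "<sha> <orig-line> <final-line> [<count>]"
            final_lineno = int(raw.split()[2])
            expect_header = False
        elif raw.startswith("author "):
            author = raw[len("author "):]
    return dict(result)


def blame_file(path: str) -> Dict[str, List[Tuple[int, str]]]:
    """Hypothetical helper: run git blame on `path` in the current repo."""
    out = subprocess.run(
        ["git", "blame", "--line-porcelain", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return lines_by_author(out)
```

Calling blame_file("README") inside a repository would give, per author, the lines that survived into the final version. Writing one commit per author from that mapping is the part a real tool still has to do.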
usage: git-oss-releaser.py [-h] repoDir outDir
Positional arguments:
repoDir: The repository to transform. May also be a subdirectory of a repo.
outDir: The directory where the new repo should be created. Has to be empty.
Optional arguments:
--name NAME: The user.name to use for committing the files. Defaults to git's global user.name.
--email EMAIL: The user.email to use for committing the files. Defaults to git's global user.email.
--date DATE: The date to use for commits. Defaults to the date of the last commit.
Note that git distinguishes between author and committer for each commit. The author is taken from git blame; the committer data is taken from the global user.name and user.email, or from the given --name and --email.
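As a rough illustration of the author/committer distinction, git reads both identities from environment variables when creating a commit. The sketch below builds such an environment. The function name commit_env is hypothetical, and using the same date for author and committer is an assumption, not something confirmed about git-oss-releaser.

```python
import os
from typing import Dict, Mapping, Optional


def commit_env(
    author_name: str,
    author_email: str,
    committer_name: str,
    committer_email: str,
    date: str,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return an environment for `git commit` with separate identities.

    `base` defaults to the current process environment.
    Assumption: author and committer share the same date.
    """
    env = dict(os.environ if base is None else base)
    env.update({
        # Author: who wrote the lines (for example, taken from git blame).
        "GIT_AUTHOR_NAME": author_name,
        "GIT_AUTHOR_EMAIL": author_email,
        "GIT_AUTHOR_DATE": date,
        # Committer: who created the commit (the person doing the release).
        "GIT_COMMITTER_NAME": committer_name,
        "GIT_COMMITTER_EMAIL": committer_email,
        "GIT_COMMITTER_DATE": date,
    })
    return env
```

The result can be passed as env= to subprocess.run(["git", "commit", "-m", "..."], ...). Afterwards, git log --format=fuller shows both identities.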
DEBUG mode can currently only be enabled in the code.
Upvotes: 2
Reputation: 21
(i) One commit per author
I guess that's logically not possible. Suppose you've got a sequence of three commits A, B, C, where A and C are by the same author and B is by a different one.
If commit C depends on anything done in B, you cannot reorder the commits and squash A and C anymore.
(ii) Original commits with only the final code lines
For that you can use 'git filter-branch --tree-filter'. Beware that the following script might eat kittens, because I've only tested it on a simple test repository. You've been warned:
git filter-branch --prune-empty --tree-filter '
# directory which contains the final version of the project
FINAL_VERSION="$DIRECTORY_WITH_REPOSITORY_OF_FINAL_VERSION"
# directory which contains the filtered version of the repository
FILTER_DIR="$(pwd)"
# apply the current commit to the final version in reverse,
# ignoring the rejects (-t lets patch assume a patch is reversed
# when it looks reversed, so no explicit -R is given)
cd "$FINAL_VERSION"
git show "$GIT_COMMIT" > /tmp/original.patch
patch -p1 -t < /tmp/original.patch
git diff > /tmp/filtered.patch
# reset the FINAL_VERSION to the original state.
git reset --hard
git clean -f -d -x
# apply the patch which contains the lines which could be reversed on
# the filtered version
cd "$FILTER_DIR"
# revert the last commit
patch -p1 -t < /tmp/original.patch
# apply the filtered patch
patch -p1 -t < /tmp/filtered.patch
# remove the rejects by the modified patch
find . \( -name "*.orig" -o -name "*.rej" \) -exec rm -rf {} +
' previousRelease..HEAD
(This assumes that you've tagged the branching point with "previousRelease". You also have to adapt the FINAL_VERSION variable.)
Upvotes: 2