Reputation: 23633
I'm splitting off part of a git repo to create a new repo, and am trying to use git filter-branch to maintain the history of the files that are being moved to the new project. I know about --subdirectory-filter, but this is not a good solution because the files I'm pulling out don't map cleanly to one subdirectory. The best option I've found so far is --index-filter, used as follows:
git filter-branch -f --index-filter 'git read-tree --empty && git reset -q "${GIT_COMMIT}" -- <list of files>' --prune-empty -f
This seems to work, except I'd like to be able to programmatically generate the list of files to keep so I can iteratively refine this list. I'm currently trying to get a list of the files I want to keep in another file, and append this to the string representing the command to be executed for each commit as follows:
tmp=$(cat ~/to_keep.txt) && git filter-branch -f --index-filter 'git read-tree --empty && git reset -q "${GIT_COMMIT}" -- '$tmp --prune-empty -f
Unfortunately, this results in
fatal: bad flag '--prune-empty' used after filename
Even just echoing the files seems to cause trouble:
tmp=$(echo a.txt b.txt) && git filter-branch -f --index-filter 'git read-tree --empty && git reset -q "${GIT_COMMIT}" -- '$tmp --prune-empty -f
fatal: ambiguous argument 'b.txt': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
I've also tried concatenating the strings earlier:
tmp1=$(echo a.txt b.txt) && tmp2='git read-tree --empty && git reset -q "${GIT_COMMIT}" -- ' && tmp3=${tmp2}${tmp1} && git filter-branch -f --index-filter $tmp3 --prune-empty -f
fatal: ambiguous argument 'read-tree': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
I assume this is just concatenation not happening as I expect in the shell. Does anyone know how I can make this work? It would be great if you could explain what these errors mean, as well. Thanks.
Upvotes: 2
Views: 745
Reputation: 487883
Each argument to the various ...-filters needs to be a single string. That string is saved as a shell variable:
--index-filter)
filter_index="$OPTARG"
;;
At the appropriate point, the filter-branch script (found in the git-core subdirectory, e.g., /usr/libexec/git-core or /usr/local/libexec/git-core) does this:
eval "$filter_index" < /dev/null ||
die "index filter failed: $filter_index"
(except for the commit-filter, which is run with /bin/sh -c "$filter_commit" ...).
Your assumption is thus correct, and what you need is to make the list of files be part of a single, white-space-separated string.
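To see why the unquoted attempts failed, here is a small sketch that counts how many arguments the shell actually passes. The helper count_args is hypothetical (it is not part of git), and the sketch assumes a POSIX-style shell such as sh or bash, where unquoted expansions are split at white space:

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() {
    echo "$#"
}

tmp='a.txt b.txt'

# Unquoted: the shell splits $tmp at white space, so the filter text ends at
# "a.txt" and "b.txt" becomes a separate argument.
unquoted=$(count_args 'git reset -q "${GIT_COMMIT}" -- '$tmp)

# Quoted: the expansion stays attached to the preceding text as one argument.
quoted=$(count_args 'git reset -q "${GIT_COMMIT}" -- '"$tmp")

echo "unquoted: $unquoted, quoted: $quoted"
```

This most likely explains the errors in the question. The filter-branch script stops parsing its own options at the first word that does not look like an option and hands the rest to git's revision parser. A stray b.txt that does not exist in the working tree is reported as an ambiguous argument. A stray name that does exist is taken as a file name, after which --prune-empty is a flag "used after filename". In the third attempt, the unquoted $tmp3 split so that --index-filter received only the word git, leaving read-tree as the stray argument.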
The easiest way to do this would be to start with your original command:
git filter-branch -f --index-filter \
'git read-tree --empty && git reset -q "${GIT_COMMIT}" -- <list of files>' \
--prune-empty -f
(which works when you have a static list) and modify it to extract the dynamic list from ~/to_keep.txt. I split the original into three lines partly for display purposes, but also because we can now concentrate just on the middle line.
[Edit to fix the newline issue noted in a comment: let's make an alias or shell function, xc, that translates newlines to spaces.]
xc() {
    tr '\n' ' '
}
The middle line then becomes:
"git read-tree --empty && git reset -q \"\${GIT_COMMIT}\" -- $(xc < ~/to_keep.txt)" \
or:
'git read-tree --empty && git reset -q "${GIT_COMMIT}" -- '"$(xc < ~/to_keep.txt)" \
or, as you attempted (but with one change):
'git read-tree --empty && git reset -q "${GIT_COMMIT}" -- '"$tmp" \
(having set tmp=$(xc < ~/to_keep.txt)).
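Putting the pieces together, one possible arrangement is to build the filter string in a small function so the file list can be regenerated on each run. The name build_index_filter is hypothetical, and the sketch assumes the list holds one path per line with no white space inside any path:

```shell
#!/bin/sh
# build_index_filter is a hypothetical helper: it prints the --index-filter
# string for the list of paths stored (one per line) in the file named by $1.
build_index_filter() {
    # Join the lines with spaces, as xc does above.
    files=$(tr '\n' ' ' < "$1")
    # The single-quoted part keeps ${GIT_COMMIT} literal so that filter-branch
    # expands it for each commit; only the file list is expanded now.
    printf '%s' 'git read-tree --empty && git reset -q "${GIT_COMMIT}" -- '"$files"
}

# Usage (assumes ~/to_keep.txt exists and you are inside the repository):
#   git filter-branch -f --prune-empty \
#       --index-filter "$(build_index_filter ~/to_keep.txt)"
```

The double quotes around the command substitution in the usage line are what keep the whole filter a single argument.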
Note that none of this does the correct thing if any of the file names contains white space. For instance, suppose a file is named "a file" (with an embedded blank). The eval will break arguments at spaces, and the git reset command will get the names a and file as two separate arguments.
As long as you don't have any such file names, you need not worry about addressing this.
One other potential problem arises if this list of files gets very long. You may run into kernel limits on the number and total size of the arguments that can be passed to one command. You should be able to use xargs to solve this (and, for that matter, with some work and use of -0, to handle white space in file names).
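As a rough sketch of the xargs idea, the function below converts a newline-separated list into NUL-separated names and lets xargs -0 pass each name as one argument. Here printf stands in for git reset so the effect is visible. The name list_as_args is hypothetical, and the sketch assumes an xargs that supports -0 (GNU and BSD versions do) and file names that contain no newlines:

```shell
#!/bin/sh
# list_as_args is a hypothetical helper: it reads one file name per line from
# the file named by $1 and passes each name to a command as a single argument,
# even if the name contains spaces. printf stands in for git reset here and
# prints each argument it receives on its own line, in angle brackets.
list_as_args() {
    tr '\n' '\0' < "$1" | xargs -0 printf '<%s>\n'
}

# An untested guess at the corresponding index filter; the absolute path
# matters because filter-branch runs the filter in a temporary directory:
#   'git read-tree --empty &&
#    tr "\n" "\0" < "$HOME/to_keep.txt" |
#    xargs -0 git reset -q "${GIT_COMMIT}" --'
```

Because xargs starts additional invocations of the command when the list is too long for one, this approach also sidesteps the argument-size limit.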
Upvotes: 3