airborne

Reputation: 4094

git diff: ignore deletion or insertion of certain regex

I'm trying to use git diff to find differences between two versions of files of a certain type, but only where the following expression has been added or deleted:

(****)

According to the git diff documentation, the -G option is what I'm looking for, so I tried the following:

git diff -G '\(\*\*\*\*\)' -- *.fileEnding

Unfortunately it doesn't work: all the other differences in the files are returned too. I'm not very familiar with regexes, by the way.

EDIT: I think I need to be a little more specific about my issue. Right now I have the following case: one file has changes which match the regex and changes which don't. In my script I'm trying to do something like this:

if [ "$(git diff -G '\(\*\*\*\*\)' -- *.fileEnding)" = "$(git diff -- *.fileEnding)" ]; then echo "Only changes in (****)"; fi

It works fine if some files only have changes in (****) and other files have different changes. But it doesn't work as soon as one file has both kinds of changes.

Upvotes: 0

Views: 1333

Answers (2)

mlncn

Reputation: 3366

The accepted answer is wrong in saying that -G does not work for git diff. It most definitely does:

git diff -G '\(\*\*\*\*\)'

will, out of hundreds of changed files, return only the file with a (****) as one of its changes.
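For what it's worth, a quick way to check this on your own git version is a throwaway repository. The sketch below assumes git is on the PATH; the file names marked.txt and plain.txt are made up for the demo, and --name-only is used only to list the affected files instead of printing the full patch.

```shell
# Throwaway repository; every name below is invented for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
printf 'alpha\n' > marked.txt
printf 'beta\n'  > plain.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m base

# Change both files, but add the (****) marker to only one of them.
printf 'alpha\n(****)\n' > marked.txt
printf 'beta\ngamma\n'   > plain.txt

# Compare the unfiltered file list with the -G filtered one.
all_changed=$(git diff --name-only)
marker_changed=$(git diff --name-only -G '\(\*\*\*\*\)')
echo "all changed files:"
echo "$all_changed"
echo "files selected by -G:"
echo "$marker_changed"
```

With a git that behaves like the 2.20.1 mentioned below, the second listing should contain only marked.txt, while the first lists both files.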

The original poster (I think) acknowledged this, but wanted not to see the other changes in the same file. It's just worth clarifying that git diff does have -G and that it can do the hard part (tested with git 2.20.1). Using -G to select only the changed files that match, and then grep to show only the wanted part within those files, would make for very few false positives:

git diff -G '\(\*\*\*\*\)' -- *yml | grep -n1 '(\*\*\*\*)'

This shows only the diff lines that contain the text (****), plus a bit of context (enough to see what it was changed to).
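To get closer to the check in the question (does a file contain only (****) changes?), one option is to inspect the added and removed lines themselves instead of comparing two diff outputs. This is a minimal sketch, assuming a throwaway repository; only_marker_changes is a hypothetical helper name and the file names are invented.

```shell
# Throwaway repository; every name below is invented for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
printf 'one\ntwo\n' > only_marker.txt
printf 'one\ntwo\n' > mixed.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m base

# One file gains only the marker, the other gains the marker plus another line.
printf 'one\n(****)\ntwo\n'        > only_marker.txt
printf 'one\n(****)\ntwo\nthree\n' > mixed.txt

# Hypothetical helper: succeeds when no added/removed line in the file's
# diff lacks the literal text (****).
only_marker_changes() {
    ! git diff --no-color -U0 -- "$1" |
        grep -E '^[+-]' |                # keep added/removed lines and the file headers
        grep -Ev '^(\+\+\+|---) ' |      # drop the "+++ b/..." and "--- a/..." headers
        grep -Fqv '(****)'               # is there any changed line without the marker?
}

for f in only_marker.txt mixed.txt; do
    if only_marker_changes "$f"; then
        echo "$f: only changes in (****)"
    else
        echo "$f: other changes too"
    fi
done
```

Two limitations of this sketch: a changed content line that itself begins with "++ " or "-- " looks like a file header and is skipped, and a file with no changes at all passes the check trivially.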

See also https://stackoverflow.com/a/53471974/1028376 for use of -G with diff.

Upvotes: 1

torek

Reputation: 487725

TL;DR

Git's git diff just doesn't do that.

Long explanation

The documentation is misleading.

The -G argument does absolutely nothing for git diff. Instead, -G is actually an argument to git log (and its sister command, git rev-list, and any command that invokes these others; but best to just think of it in terms of git log, I think).

The git diff and git log commands share some of their documentation (reasonably enough—they share some of their code, specifically the diff-generating code that git log uses to compare any one commit to its parent(s)).

When git log is selecting commits, you can tell it to select some specific commits (out of the selection already made by revision specifiers). The -G argument is one such selector, as is the very similar -S argument. -S takes a string, rather than a regular expression, by default; but you can add --pickaxe-regex to make -S take a regular expression. The documentation has an example, and the example literally refers directly to git log:

-G<regex>

Look for differences whose patch text contains added/removed lines that match <regex>.

To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

+    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
...
-    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);

While git log -G"regexec\(regexp" will show this commit,
git log -S"regexec\(regexp" --pickaxe-regex will not (because the number of occurrences of that string did not change).
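The documentation's example can be reproduced in a scratch repository. This is only a sketch, assuming git is on the PATH; the file name demo.c, the commit subjects, and the commit helper are invented for the demo.

```shell
# Throwaway repository; every name below is invented for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

# First commit introduces one regexec(regexp call.
printf '%s\n' 'hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);' > demo.c
git add demo.c
commit 'add call'

# Second commit rewrites that line: one occurrence removed, one added.
printf '%s\n' 'return !regexec(regexp, two->ptr, 1, &regmatch, 0);' > demo.c
git add demo.c
commit 'rewrite call'

# -G looks at the text of added/removed lines; -S counts occurrences.
with_G=$(git log --format=%s -G'regexec\(regexp')
with_S=$(git log --format=%s -S'regexec\(regexp' --pickaxe-regex)
echo "-G selects:"
echo "$with_G"
echo "-S --pickaxe-regex selects:"
echo "$with_S"
```

Both commits touch a line matching the regex, so -G should list both. The number of occurrences only changes in the first commit (from zero to one), so -S should list only that one.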

The git diff command, though, is not the git log command.

The way git log works, in general, is that you pass it a starting commit—a hash ID, for instance, or a branch or tag name—and it:

  • shows you that commit, then
  • shows you the parent of that commit, then
  • shows you the parent of the parent ...

and so on, all the way back to the very first commit (or when you get tired of looking and quit out of the pager). In other words, at least for these simple cases, there is a loop:

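commit=$(git rev-parse --verify --quiet HEAD)    # starting commit (HEAD assumed here)
while [ -n "$commit" ]; do
    parent=$(git rev-parse --verify --quiet "$commit^")    # first parent; empty at the root commit
    git show --no-patch "$commit"
    commit=$parent
done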

If you add -p, the "show" step includes the output of git diff $parent $commit.

Note that git diff compares exactly two commits (well, there is a special kind of diff called a combined diff for merge commits, but git log does not show them by default, and normal usage of git diff doesn't either). The two commits for git log are the parent and the child. If you run git diff yourself, you can pick any two arbitrary commits ... but when you do that, git diff completely ignores any -G or -S arguments. (It probably should complain about their presence.)

The point of -G and -S is to affect the normal behavior of git log. Often, when we're looking at commits one at a time, in sequence, the way git log does, we're more interested in a particular change to a particular file (or set of files). We can use -G or -S to tell git log: Generate the diff, but then if it doesn't have the change, don't show the commit at all. That way we see only those commits that have those changes. (Adding file names, e.g., git log stop..start -- path/to/file1.txt, limits the diff to those files as well. Unlike -S and -G, that part does work with git diff.)

What you can do

If you don't know which revisions you want, you can use git log (or its script-oriented sister command git rev-list) to screen candidates. Here, you can use -G. You don't have to get the diff now, though you can if you want. If -G not only gets you the possible candidates, but in fact gets you the right ones, you are done and you can stop here.

If you still have too many candidates and need full diffs to whittle them down further, you can now run git diff on the hash IDs obtained by the first step (git log or git rev-list). For each of these commits, you must choose which commit to compare it to: perhaps some commit from the candidate list, or perhaps a commit just before or just after this particular candidate. Now you no longer have the -G tool: to search the diff, you will need some external searching tool, such as grep. It's up to you to write this part.
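The two steps might look roughly like this. It is a sketch under assumptions, not a finished script: the repository, the file name notes.txt, and the commit subjects are invented, git log --format=%H is used to list the candidate hashes, and each candidate is compared to its first parent (so it would need extra handling for a root commit or a merge).

```shell
# Throwaway repository; every name below is invented for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

printf 'one\n'             > notes.txt; git add notes.txt; commit 'base'
printf 'one\n(****)\n'     > notes.txt; git add notes.txt; commit 'add marker'
printf 'one\n(****)\ntwo\n' > notes.txt; git add notes.txt; commit 'unrelated'

# Step 1: let git log -G screen the history for candidate commits.
candidates=$(git log --format=%H -G '\(\*\*\*\*\)' -- notes.txt)

# Step 2: diff each candidate against its first parent and search the
# patch with an external tool (grep -F matches the literal text).
for c in $candidates; do
    echo "== $(git log -1 --format='%h %s' "$c")"
    matches=$(git diff --no-color "$c^" "$c" -- notes.txt | grep -F '(****)')
    echo "$matches"
done
```

Here only the 'add marker' commit should survive step 1, and step 2 should print the single added line that contains (****).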

Upvotes: 2
