user4427511

Get size of removed lines

Consider this script:

mik() {
  nov=
  while [ $((nov+=1)) -le $1 ]
  do
    echo $RANDOM
  done
}
mik 200 > osc.txt
git add .
git commit -m pap
{
  head -100 osc.txt
  mik 50
} > que.txt
mv que.txt osc.txt

This commits a file of 200 random lines, then removes the last 100 lines and adds 50 new random lines. I would like to get the size in bytes of the removed lines. I tried this command:

$ git diff-index --numstat @
50      100     osc.txt

However, it only shows the number of lines added and removed, not the number of bytes.

Upvotes: 2

Views: 151

Answers (3)

jthill

Reputation: 60393

git diff @ osc.txt | git apply --no-add --cached

This applies only the deletions you've made in your worktree copy, and applies them only to the index, so you can then compare sizes:

git cat-file -s @:osc.txt  # size of committed version
git cat-file -s :osc.txt   # size of indexed version, with only worktree removals applied
wc -c osc.txt              # size of worktree version

The size of the removed lines is the committed size minus the indexed size. Afterwards, run

git reset @ -- osc.txt

to reset the indexed state.
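
The steps above can be wrapped into one helper. This is only a minimal sketch: the function name removed_bytes_via_index is hypothetical (not part of Git), and it assumes you run it from the repository root while the index entry for the file still matches the last commit.

```shell
# Hypothetical helper: print the byte size of the lines removed from a
# tracked file in the worktree, relative to the last commit (@).
# Assumes the current directory is the repository root and that the
# index entry for the file still matches the commit.
removed_bytes_via_index() {
  file=$1

  # Stage only the deletions; --no-add makes git apply ignore additions.
  git diff @ -- "$file" | git apply --no-add --cached || return 1

  committed=$(git cat-file -s "@:$file")  # size of the committed version
  indexed=$(git cat-file -s ":$file")     # size with only removals applied

  # Put the index entry back to the committed state.
  git reset -q @ -- "$file"

  echo $((committed - indexed))
}
```

It would be called as removed_bytes_via_index osc.txt. If the file has no changes, git apply fails on the empty patch and the function returns 1 without printing anything.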

Upvotes: 1

user4427511

sed:

git diff | sed '/^i/N;s/^-//;t;d' | wc -c

awk:

git diff | awk '/^i/{getline;next}/^-/{q+=length}END{print q}'
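
Both commands follow the same steps:

  1. Print the diff
  2. Filter out the "---" file header lines
  3. Keep only the removed lines
  4. Remove the leading "-" (the awk version leaves it in place, where it stands in for the newline that length does not count)
  5. Count the total number of bytes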
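
The same filtering can be written so that it does not rely on the header starting with an "index" line. The following is a minimal sketch under a few assumptions: the input is a unified diff on standard input, and the function name removed_bytes is hypothetical. It counts only lines inside hunks, sets LC_ALL=C so that length counts bytes rather than characters, and subtracts one byte when a removed line had no trailing newline.

```shell
# Hypothetical helper: read a unified diff on stdin and print the total
# byte size (newlines included) of the removed lines.
removed_bytes() {
  LC_ALL=C awk '
    /^diff / { in_hunk = 0; prev = ""; next }  # new file: header lines follow
    /^@@/    { in_hunk = 1; prev = ""; next }  # hunk header: content follows
    !in_hunk { next }                          # skip index/---/+++ lines
    /^-/     { total += length($0); prev = "-"; next }  # "-" stands in for the newline
    /^\\/    { if (prev == "-") total--; prev = ""; next }  # removed line had no newline
             { prev = "" }                     # context or added line
    END      { print total + 0 }
  '
}
```

Usage is the same as for the one-liners, for example git diff | removed_bytes.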

Upvotes: 4

CodeWizard

Reputation: 142342

git diff can show you the number of lines removed or added.
Use awk, sed, or any other Unix command to extract the data from its output.

--shortstat is what you want:

git diff --shortstat commit1 commit2

git cat-file -s will output the size in bytes of an object in git.
git diff-tree can tell you the differences between one tree and another.

Putting this together into a script called git-file-size-diff, we can try something like the following:

#!/bin/sh

args=$(git rev-parse --sq "$@")

# the diff-tree will output line like:
# :040000 040000 4...acd0 fd...94 M main.webapp

# parse the fields from the diff-tree output
eval "git diff-tree -r $args" | {
  total=0

  # read the fields as laid out in this line:
  # :040000 040000 4...acd0 fd...94 M   main.webapp
  while read A B C D M P
  do
    case $M in
      # modified file
      M) bytes=$(( $(git cat-file -s $D) - $(git cat-file -s $C) )) ;;

      # Added file
      A) bytes=$(git cat-file -s $D) ;;

      # deleted file
      D) bytes=-$(git cat-file -s $C) ;;
      *)

      # Error - no file status found
      echo >&2 warning: unhandled mode $M in \"$A $B $C $D $M $P\"
      continue
      ;;

    # close the case statement
    esac

    # sum the total bytes so far
    total=$(( $total + $bytes ))

    # print out the (bytes) & the name of the file ($P)
    printf '%d\t%s\n' $bytes "$P"
  done

  # print out the grand total
  echo total $total
}

In use, this looks like the following:

$ git file-size-diff HEAD~850..HEAD~845
-234   a.txt
112    folder/file.txt
-4     README.md
28     b.txt
total -98

By using git-rev-parse it should accept all the usual ways of specifying commit ranges.

Note:
bash runs the while read loop in a subshell, hence the additional curly braces: they keep the final echo inside the same subshell, so the total is not lost when the subshell exits.
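
To illustrate the subshell behaviour on its own, here is a minimal sketch without Git. The function names sum_lost and sum_kept are hypothetical and exist only for this demonstration.

```shell
# Lost total: in bash the loop on the right of the pipe runs in a
# subshell, so the outer "total" is unchanged when echo runs.
sum_lost() {
  total=0
  printf '%s\n' "$@" | while read -r n; do total=$((total + n)); done
  echo "$total"
}

# Kept total: the braces put the loop and the echo in the same subshell.
sum_kept() {
  printf '%s\n' "$@" | {
    total=0
    while read -r n; do total=$((total + n)); done
    echo "$total"
  }
}
```

Under bash with default settings, sum_lost 1 2 3 prints 0 while sum_kept 1 2 3 prints 6. Shells that run the last pipeline stage in the current shell may print 6 for both.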

Upvotes: 0
