max152

Reputation: 545

git - getting ALL previous version of a specific file/folder

I want to retrieve all previous versions of a specific file in a git repository.

I see it is possible to get one specific version with the checkout command, but I want them all. The git clone command with the depth option doesn't seem to allow me to clone a subfolder ("not valid repository name").

Do you know if it is possible and how?

Thank you

Upvotes: 41

Views: 17295

Answers (7)

Nik

Reputation: 67

#!/bin/sh

set -e

if ! git rev-parse --show-toplevel >/dev/null 2>&1 ; then
    echo "Error: you must run this from within a git working directory" >&2
    exit 1
fi

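# Set these before running (both are left empty here):
# FILE_PATH: path to the file, relative to the current directory
# EXPORT_TO: directory to write the exported versions to
FILE_PATH=""
EXPORT_TO=""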

FILE_NAME="$(basename "$FILE_PATH")"

if [ ! -d "$EXPORT_TO" ]; then
    echo "Creating directory '$EXPORT_TO'"
    mkdir -p "$EXPORT_TO"
fi

echo "Writing files to '$EXPORT_TO'"

# Get all commit hashes for the file
COMMITS=$(git log --pretty=format:%H -- "$FILE_PATH")

# Loop over each commit (hashes contain no whitespace, so word splitting is safe)
for COMMIT in $COMMITS
do
  # Check out the file as of this commit
  git checkout "$COMMIT" -- "$FILE_PATH"

  # Copy the file to a new location with the commit hash in the name
  # (the .yaml suffix is hard-coded; adjust it to your file type)
  cp "$FILE_PATH" "$EXPORT_TO/$FILE_NAME.$COMMIT.yaml"
done

# Reset the file to the latest commit
git checkout HEAD -- "$FILE_PATH"

Give this a go; it is easier to use.

Upvotes: 0

Dmitry Shevkoplyas

Reputation: 6341

The OP wanted to retrieve all versions, but the other answers do not deliver that, especially if the file has hundreds of revisions (all the suggestions are too manual). The only half-working solution was proposed by @Tobias in the comments, but the suggested bash loop builds the files in random order and generates hundreds of empty files when used against our repos. One of the reasons is that "rev-list --all --objects" lists different kinds of objects (trees included, which are useless for our purpose).

I started with Tobias's solution, added counters, cleaned it up a bit, and ended up reinventing the wheel in the form of the bash script listed below.

The script would:

  • extract all file versions to /tmp/all_versions_exported
  • take 1 argument - relative path to the file inside git repo
  • give result filenames numeric prefix (sortable)
  • mention inspected filename in result files (to tell apples apart from oranges:)
  • mention commit date in the result filename (see output example below)
  • not create empty result files

cat /usr/local/bin/git_export_all_file_versions

#!/bin/bash

# we'll write all git versions of the file to this folder:
EXPORT_TO=/tmp/all_versions_exported

# take relative path to the file to inspect
GIT_PATH_TO_FILE=$1

# ---------------- don't edit below this line --------------

USAGE="Please cd to the root of your git project and specify the path to the file you wish to inspect (example: $0 some/path/to/file)"

# check if got argument
if [ "${GIT_PATH_TO_FILE}" == "" ]; then
    echo "error: no arguments given. ${USAGE}" >&2
    exit 1
fi

# check if file exist
if [ ! -f ${GIT_PATH_TO_FILE} ]; then
    echo "error: File '${GIT_PATH_TO_FILE}' does not exist. ${USAGE}" >&2
    exit 1
fi

# extract just a filename from given relative path (will be used in result file names)
GIT_SHORT_FILENAME=$(basename $GIT_PATH_TO_FILE)

# create folder to store all revisions of the file
if [ ! -d ${EXPORT_TO} ]; then
    echo "creating folder: ${EXPORT_TO}"
    mkdir ${EXPORT_TO}
fi

## uncomment next line to clear export folder each time you run script
#rm ${EXPORT_TO}/*

# reset counter
COUNT=0

# iterate all revisions
git rev-list --all --objects -- ${GIT_PATH_TO_FILE} | \
    cut -d ' ' -f1 | \
while read h; do \
     COUNT=$((COUNT + 1)); \
     COUNT_PRETTY=$(printf "%04d" $COUNT); \
     COMMIT_DATE=`git show $h | head -3 | grep 'Date:' | awk '{print $4"-"$3"-"$6}'`; \
     if [ "${COMMIT_DATE}" != "" ]; then \
         git cat-file -p ${h}:${GIT_PATH_TO_FILE} > ${EXPORT_TO}/${COUNT_PRETTY}.${COMMIT_DATE}.${h}.${GIT_SHORT_FILENAME};\
     fi;\
done    

# return success code
echo "result stored to ${EXPORT_TO}"
exit 0

Usage example:
cd /home/myname/my-git-repo

git_export_all_file_versions docs/howto/readme.txt
    result stored to /tmp/all_versions_exported

ls /tmp/all_versions_exported
    0001.17-Oct-2016.ee0a1880ab815fd8f67bc4299780fc0b34f27b30.readme.txt
    0002.3-Oct-2016.d305158b94bedabb758ff1bb5e1ad74ed7ccd2c3.readme.txt
    0003.29-Sep-2016.7414a3de62529bfdd3cb1dd20ebc1a977793102f.readme.txt
    0004.28-Sep-2016.604cc0a34ec689606f7d3b2b5bbced1eece7483d.readme.txt
    0005.28-Sep-2016.198043c219c81d776c6d8a20e4f36bd6d8a57825.readme.txt
    0006.9-Sep-2016.5aea5191d4b86aec416b031cb84c2b78603a8b0f.readme.txt
    <and so on and on . . .>

Note #1: if you see errors like this:

fatal: Not a valid object name
3e93eba38b31b8b81905ceaa95eb47bbaed46494:readme.txt

it means you've started the script not from the root folder of your git project.

Note #2: if you want to get all versions of a file that was deleted a few commits ago, you first have to switch to any old commit where that file was still present (not yet deleted):

git checkout OLD_HASH_WHERE_FILE_EXISTED
git_export_all_file_versions path/to/existing/file.ext

Otherwise the script will error out with "file does not exist". You don't have to switch to the very last commit where the deleted file was seen; any old commit where the file was present will do, and "git_export_all_file_versions" will then extract all versions (even from "future" commits relative to the old commit you switched to).
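
If you don't know which old commit to switch to for Note #2, something like the following might help. It is only a sketch: the function name last_commit_with_file is made up for illustration (it is not part of the script above), and it assumes the file was deleted in an ordinary, non-merge commit reachable from HEAD.

```shell
#!/bin/sh
# Hypothetical helper: print the hash of the last commit reachable from
# HEAD in which the given path still existed.
last_commit_with_file() {
    # Most recent commit reachable from HEAD that deleted the path
    deleting_commit=$(git log --diff-filter=D -n 1 --format=%H -- "$1")
    if [ -z "$deleting_commit" ]; then
        echo "no commit reachable from HEAD deletes '$1'" >&2
        return 1
    fi
    # The first parent of the deleting commit still contains the file
    git rev-parse "$deleting_commit^"
}
```

You could then run git checkout "$(last_commit_with_file path/to/existing/file.ext)" before calling the export script.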

Upvotes: 50

Nathan Arthur

Reputation: 1182

The script provided by Dmitry does actually solve the problem, but it had a few issues that led me to adapt it to be more suitable for my needs. Specifically:

  1. The use of git show broke because of my default date-format settings.
  2. I wanted the results sorted in date order, not reverse-date order.
  3. I wanted to be able to run it against a file that had been deleted from the repo.
  4. I didn't want all revisions on all branches; I just wanted the revisions reachable from HEAD.
  5. I wanted it to error if it wasn't in a git repo.
  6. I didn't want to have to edit the script to adjust certain options.
  7. The way it worked was inefficient.
  8. I didn't need the numbering in the output filenames. (A suitably-formatted date serves the same purpose.)
  9. I wanted safer "paths with spaces" handling

You can see the latest version of my modifications in my GitHub repo. Here's the version as of this writing:

#!/bin/sh
    
# based on script provided by Dmitry Shevkoplyas at http://stackoverflow.com/questions/12850030/git-getting-all-previous-version-of-a-specific-file-folder

set -e

if ! git rev-parse --show-toplevel >/dev/null 2>&1 ; then
    echo "Error: you must run this from within a git working directory" >&2
    exit 1
fi

if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
    echo "Usage: $0 <relative path to file> [<output directory>]" >&2
    exit 2
fi

FILE_PATH="$1"

EXPORT_TO=/tmp/all_versions_exported
if [ -n "$2" ]; then
    EXPORT_TO="$2"
fi

FILE_NAME="$(basename "$FILE_PATH")"

if [ ! -d "$EXPORT_TO" ]; then
    echo "Creating directory '$EXPORT_TO'"
    mkdir -p "$EXPORT_TO"
fi

echo "Writing files to '$EXPORT_TO'"
# "--" separates options from the path, so this also works for a file that no longer exists in the working tree
git log --diff-filter=d --date-order --reverse --format="%ad %H" --date=iso-strict -- "$FILE_PATH" | grep -v '^commit' | \
    while read -r COMMIT_DATE COMMIT_SHA; do
        # iso-strict dates contain no spaces, so read splits each line into date and hash
        printf '.'
        git cat-file -p "$COMMIT_SHA:$FILE_PATH" > "$EXPORT_TO/$COMMIT_DATE.$COMMIT_SHA.$FILE_NAME"
    done
echo

exit 0

An example of the output:

$ git_export_all_file_versions bin/git_export_all_file_versions /tmp/stackoverflow/demo
Creating directory '/tmp/stackoverflow/demo'
Writing files to '/tmp/stackoverflow/demo'
...

$ ls -1 /tmp/stackoverflow/demo/
2017-05-02T15:52:52-04:00.c72640ed968885c3cc86812a2e1aabfbc2bc3b2a.git_export_all_file_versions
2017-05-02T16:58:56-04:00.bbbcff388d6f75572089964e3dc8d65a3bdf7817.git_export_all_file_versions
2017-05-02T17:05:50-04:00.67cbdeab97cd62813cec58d8e16d7c386c7dae86.git_export_all_file_versions

Upvotes: 37

gview

Reputation: 15371

All the versions of a file are already in the git repo when you git clone it. You can create branches associated with the checkout of a particular commit:

git checkout -b branchname {commit#}

This might suffice for a quick and dirty manual comparison of changes:

  • Check out each branch
  • Copy the file to an editor buffer

This might be OK if you only have a few versions to be concerned with and don't mind a bit of manual work, albeit with built-in git commands.

For scripted solutions, see the ones already provided in the other answers.
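
To find the commit hashes to plug into the checkout command above, a listing of the commits that touched the file may be enough. This is just a sketch; the function name list_file_commits is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the commits reachable from HEAD that touched
# a path, newest first, as "<short hash> <date> <subject>".
list_file_commits() {
    git log --format='%h %ad %s' --date=short -- "$1"
}
```

Each short hash in the first column can be used as the commit argument of git checkout -b branchname.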

Upvotes: -2

rb-

Reputation: 2365

Sometimes old versions of a file are only available through git reflog. I recently had a situation where I needed to dig through all the commits, even ones that were no longer part of the log because of an accidental overwriting during interactive rebasing.

I wrote this Ruby script to output all the previous versions of the file to find the orphaned commit. It was easy enough to grep the output of this to track down my missing file. Hope it helps someone.

#!/usr/bin/env ruby
path_to_file = "" # set this to the file's path relative to the repo root
`git reflog`.split("\n").each do |log|
  puts commit = log.split(" ").first
  puts `git show #{commit}:#{path_to_file}`
  puts
end

The same thing could be done with git log.
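
For the git log variant, a rough shell equivalent might look like this. It is a sketch rather than a port of the Ruby script: the function name show_all_versions is made up, the path is assumed to be relative to the repo root, and renames are not followed.

```shell
#!/bin/sh
# Hypothetical helper: print every version of a file that is reachable
# from HEAD, each preceded by the hash of the commit it came from.
show_all_versions() {
    path=$1
    # --diff-filter=d skips commits that deleted the path, where
    # "git show <commit>:<path>" would fail
    git log --diff-filter=d --format=%H -- "$path" |
    while read -r commit; do
        echo "$commit"
        git show "$commit:$path"
        echo
    done
}
```

As with the Ruby version, the output can be piped through grep to track down a particular revision.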

Upvotes: 0

sehe

Reputation: 393174

git rev-list --all --objects -- path/to/file.txt

lists all the objects (commits and trees as well as blobs) associated with the repo path.
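
Since that listing also contains commit and tree objects, it may be useful to keep only the blobs. A minimal sketch, assuming a made-up function name list_blobs_for_path:

```shell
#!/bin/sh
# Hypothetical helper: print "<blob hash> <path>" for every distinct blob
# that rev-list reports for the given path, dropping commits and trees.
list_blobs_for_path() {
    git rev-list --all --objects -- "$1" |
    while read -r sha name; do
        # cat-file -t prints the object type: commit, tree, blob or tag
        if [ "$(git cat-file -t "$sha")" = blob ]; then
            printf '%s %s\n' "$sha" "$name"
        fi
    done
}
```

The content of any listed blob can then be printed with git cat-file -p followed by its hash.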

To get a specific version of a file:

git cat-file -p commitid:path/to/file.txt

commitid can be any of the following:

  • symbolic ref (branch, tag names; remote too)
  • a commit hash
  • a revision spec like HEAD~3, branch1@{4} etc.
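
To illustrate those forms, here is a small sketch. The wrapper name show_file_at is made up, and the revisions in the comments are only examples that may not exist in your repo.

```shell
#!/bin/sh
# Hypothetical wrapper: print a file as it was at a given revision.
# $1 is anything git can resolve to a commit, $2 is the path relative to
# the repo root.
show_file_at() {
    git cat-file -p "$1:$2"
}

# Example calls (revisions are illustrative):
#   show_file_at master path/to/file.txt          # branch name
#   show_file_at HEAD~3 path/to/file.txt          # revision spec
#   show_file_at 'branch1@{4}' path/to/file.txt   # reflog entry of a branch
```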

Upvotes: 10
