James Johnston

Reputation: 9492

Include submodule commit messages with "git log"

Suppose I have two versions in my repository... each has been tagged as follows:

Now suppose that a commit updated a submodule reference to point to a new submodule commit between Tag1 and Tag2. I run the following command, and get this:

# show commits between these two tags
git log Tag1..Tag2


commit be3d0357b93322f472e8f03285cb3e1e0592eabd
Author: James Johnston <snip>
Date:   Wed Jan 25 19:42:56 2012 +0000

    Updated submodule references.

In this case, the only change was an update of the submodule. How do I get the submodule commits to be interleaved with the parent repository commits?

Specifically, in this example, suppose that the parent repository points to the SubTag5 tag in the submodule. Two commits later in the submodule is a SubTag6 tag. The commit shown updated the submodule pointer to point to SubTag6 instead of SubTag5. What I would like to do is have git log, in addition to the commit it already printed, print the two submodule commits as well that brought the submodule from SubTag5 to SubTag6.

Upvotes: 23

Views: 19060

Answers (5)

Kitiara

Reputation: 498

Unfortunately, submodule logs don't show up as standard git log entries, so --format is not an option for them. I looked for a tree-style solution that shows only the commit hash and message for all submodules in nested form, but I couldn't find one, so I wrote a PowerShell script that does exactly that.

$gitModulesCache = @{}

function Get-ChildLog
{
    param(
        [string]$hash,
        [string]$repositoryPath = (Get-Location)
    )

    $log = @()

    $gitDir = "--git-dir=" + $repositoryPath + "/.git"
    foreach ($files in (git $gitDir log -1 $hash --name-only --format="")) 
    {
        if (($files.Length -ne 0) -and $gitModulesCache.Contains($repositoryPath) -and $gitModulesCache[$repositoryPath].Contains($files)) 
        {
            $pattern = "Submodule $files ((\w+)\.\.(\w+)):" # Submodule hash range pattern
            $match = ((git $gitDir log $hash -p --submodule=log -1 --oneline) | Select-String -Pattern $pattern)
            if ($match) 
            {
                $newhash = $match.Matches.Groups[2].Value + ".." + $match.Matches.Groups[3].Value
                $submoduleLogs = Get-GitLogs $newhash ($repositoryPath + "/" + $files) | ForEach-Object {
                    "|    $_"
                }
                $log += (" └── $files" + "`n " + $submoduleLogs)
            }
        }
    }

    return $log
}

function Get-SubmoduleList
{
    param(
        [string]$repositoryPath = (Get-Location)
    )
    # Parsing the .gitmodules is way faster than running the `git submodule` command
    $gitModulesPath = Join-Path $repositoryPath ".gitmodules"
    if ((Test-Path $gitModulesPath) -and ($gitModulesCache.ContainsKey($repositoryPath) -eq $false))
    {
        $gitModulesContent = Get-Content $gitModulesPath -Raw
        $activeSubmodules = $null
        $activeSubmodules = [regex]::Matches($gitModulesContent, '^\s*\[submodule "(.+)"\]', 'MultiLine') | ForEach-Object {
                                $_.Groups[1].Value
                            }
        $gitModulesCache[$repositoryPath] = $activeSubmodules
    }
}

function Get-GitLogs
{
    param(
        [string]$hash,
        [string]$repositoryPath = (Get-Location)
    )

    if (-not (Test-Path (Join-Path $repositoryPath ".git")))
    {
        Write-Host "Error: '$repositoryPath' is not a git repository!"
        return
    }

    Get-SubmoduleList($repositoryPath)

    $logs = @()
    foreach ($parentLog in (git ("--git-dir=" + $repositoryPath + "/.git") log $hash --format="%h %s"))
    {
        $subHash, $message = $parentLog.Split(" ", 2)
        $logs += $parentLog + "`r"
        if ($gitModulesCache[$repositoryPath].Count -ne 0) { $logs += Get-ChildLog $subHash $repositoryPath }
    }

    return $logs
}

# $prevRevision must be set beforehand to the older revision (for example a tag or commit hash)
Write-Host (Get-GitLogs "$prevRevision..HEAD")

Formats as follows:

c193a7a2d Commit 1
 └── Submodule A
 |    0e658567 Commit A1
 |     └── Submodule B
 |    b6649d4b7 Commit B1
10368440c Commit 2
 └── Submodule C
 |    4cc533c Commit C1

The script must be run from the directory that contains the .git folder.

Upvotes: 0

iblue

Reputation: 30424

You can display the submodule changes, but only when using git log -p. The following command shows the full diff of each commit and submodule changes.

git log -p --submodule=log

Submodule commit messages will be listed like this:

Submodule <submodule-name> <starting-commit>..<ending-commit>:
> Commit message 1
> Commit message 2
...
> Commit message n

If you are not interested in reading the full diff of each commit, you can match and filter out those parts:

git log -p --submodule=log | awk '
/^commit/ { add=1 } # Start of commit message
/^diff --git/ { add=0 } # Start of diff snippet
{ if (add) { buf = buf "\n" $0 } } # Add lines if part of commit message
END { print buf }
'
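One thing to watch for with the filter above: once a `diff --git` line has been seen, `add` stays 0 until the next commit header, so a submodule summary that comes after an ordinary file diff in the same commit is dropped too. Below is a minimal sketch of a variant that keeps those summaries. It assumes the `Submodule ...` header starts at the beginning of a line, as in the listing above; the function name `filter_submodule_log` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: filter_submodule_log is a hypothetical helper, not a Git command.
# It reads the output of `git log -p --submodule=log` on stdin and prints the
# commit headers, commit messages and submodule summaries, but no file diffs.
filter_submodule_log() {
  awk '
    /^commit [0-9a-f]+/ { keep = 1 }  # commit header: start keeping lines
    /^Submodule /       { keep = 1 }  # submodule summary: keep it and its "> " lines
    /^diff --git /      { keep = 0 }  # ordinary file diff: drop until the next header
    keep                { print }
  '
}

# Intended usage (not executed here):
#   git log -p --submodule=log Tag1..Tag2 | filter_submodule_log
```

Unlike the original filter, this prints lines as they arrive instead of buffering the whole log, which may matter on long histories.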

Upvotes: 21

famoses

Reputation: 21

If you are using bash, you can use the following script to show the submodule commit log embedded in the superproject log.

#!/bin/bash 

# regular expressions related to git log output
# when using options -U0 and --submodule=log
kREGEXP_ADD_SUBMODLE='0+\.\.\.[0-9a-f]+'
kREGEXP_REM_SUBMODLE='[0-9a-f]+\.\.\.0+'

# --------------------------------------------------------------------
# function submodule_log
# --------------------------------------------------------------------
# 
# print a log of submodule changes for a range of commits
#
# arguments : see start of function body for details  
# 
function submodule_log {

    sm_present=$1; # presence 0: no, 1: yes
    sm_status=$2   # status   0: as is, 1: added submodule, 2: removed submodule 
    sm_name=$3     # name
    sm_id_base=$4  # base commit id added changes
    sm_id_now=$5   # final commit id added changes

    cur_dir=`pwd`

    # commits cannot be accessed if the submodule working tree was removed,
    # show submodule commits in detail only if the directory exists
    #
    # note: As of git 1.9, in .git/modules/<submodule-name>
    #       still the entire gitdir is present, just git won't successfully
    #       run something like 'git --git-dir .git/modules/<submodule-name> log f374fbf^!'
    #       from the superproject root dir. It fails as it wants to change directory
    #       to the submodule working tree at '../../../<submodule-name>' to get the log.
    #       If one just creates it as an empty directory the command succeeds, but
    #       we cannot force the user to leave an empty directory. So just a hint
    #       is output to suggest creation of directory to get full log.

    #echo " $submod_entry"

    if [ -e $sm_name ]  
    then    
        cd $sm_name

        # if submodule not present in current version of superproject
        # can retrieve git log info only by using option '--git-dir'
        # -> use always option --git-dir

        git_dir_opt="--git-dir $cur_dir/.git/modules/$sm_name"
        git_cmd_base="git $git_dir_opt log --format=\"  %Cred%h %s%Creset\""

        if [ $sm_status -eq 0 ]
        then
            # modified module: output info on added commit(s)
            eval "$git_cmd_base ${sm_id_base}..${sm_id_now}"
        fi

        if [ $sm_status -eq 1 ]
        then
            # new module: output only info on base commit    
            eval "$git_cmd_base ${sm_id_now}^!"
        fi

        if [ $sm_status -eq 2 ]
        then
            # removed module: output only info on last commit  
            eval "$git_cmd_base ${sm_id_base}^!"
        fi

        cd $cur_dir 
    else
        echo " Skip info on submodule $sm_name (not present in current working tree)"
        echo " For full log, please add empty directory $sm_name."
    fi 
}

# --------------------------------------------------------------------
# main script 
# --------------------------------------------------------------------

# Get the log of the parent repository (only SHA1 and parent's SHA1), 
# use files as amount of data might be huge in older repos 

# get commit ids as array
readarray -t log_commitids < <(git log --format="%H")

# get commit ids of parent commits 
readarray -t log_parents < <(git log --format="%P")

for ((c_idx=0; $c_idx<${#log_commitids[@]}; c_idx=$c_idx+1))
do
    # Can only be one commit id, but remove trailing carriage return and linefeed
    commit="${log_commitids[$c_idx]//[$'\r\n']}"

    # Can be more than one parent if it's a merge
    # remove trailing carriage return and linefeed
    parents="${log_parents[$c_idx]//[$'\r\n']}"    
    parents_a=($(echo $parents))
    num_parents=${#parents_a[@]}

    # check if merge commit, prefix next commit with M as they are merge
    merge_prefix=""
    if [ $num_parents -ge 2 ] 
    then
        merge_prefix="M$num_parents" 
    fi

    # Print the two-line summary for this commit
    git log --format="%Cgreen%h (%cI %cN)%Creset%n %Cgreen$merge_prefix%Creset %s" $commit^!

    #echo "found $num_parents parents"

    if [ "$parents" = "" ]
    then
       unset parents
    else

        for parent in $parents
        do
            # Find entries like
            #  "Submodule libA 0000000...f374fbf (new submodule)"      or
            #  "Submodule libA e51c470...0000000 (submodule deleted)"  or 
            #  "Submodule libA f374fbf..af648b2e:"
            # in supermodules history in order to determine submodule's
            # name and commit range describing the changes that 
            # were added to the supermodule. Two regular expressions
            # kREGEXP_ADD_SUBMODLE and kREGEXP_REM_SUBMODLE are used
            # to find added and removed submodules respectively.

            readarray -t submod < <(git log -U0 --submodule=log ${parent}..${commit} \
            | grep -U -P '^Submodule \S+ [0-9a-f]+')

            for ((s_idx=0; $s_idx<${#submod[@]}; s_idx=$s_idx+1))
            do
                # remove trailing carriage return and linefeed
                submod_entry="${submod[$s_idx]//[$'\r\n']}"

                #echo mainly unfiltered as to show submod name and its
                #commit range stored in repo's log
                echo " $submod_entry"

                # remove preceding info 'Submodule ' as we already know that :-)
                submod_entry="${submod_entry/Submodule }"

                # if viewing repository version for which submodules do not exist
                # they are reported with correct commit ids but trailing text
                # is different, first assume it is present then check submod_entry
                submod_present=1
                if [[ "$submod_entry" =~ "commits not present" ]]
                then
                   submod_present=0

                   # remove trailing info about deleted submodule, if any
                   submod_entry="${submod_entry/'(commits not present)'}"                 
                fi

                # find whether submodule got added/modified/removed by this superproject commit
                # assume 'modified' submodule, then check if commit range indicates
                # special cases like added/removed submodule
                sub_status=0                
                if [[ "$submod_entry" =~ $kREGEXP_ADD_SUBMODLE ]]
                then
                   sub_status=1

                   # remove trailing info about new submodule, if any
                   submod_entry="${submod_entry/'(new submodule)'}" 
                fi

                if [[ "$submod_entry" =~ $kREGEXP_REM_SUBMODLE ]]
                then
                   sub_status=2

                   # remove trailing info about deleted submodule, if any
                   submod_entry="${submod_entry/'(submodule deleted)'}"
                fi

                # create log output for submod_entry 
                # - pass contents in submod_entry as separate arguments
                #   by expanding variable and using eval to execute resulting code

                #replace dots by spaces as to split apart source and destination commit id
                submod_entry="${submod_entry//./ }"
                #remove colon behind last commit id, if any
                submod_entry="${submod_entry//:/}"

                eval "submodule_log $submod_present $sub_status $submod_entry"
            done    
        done
    fi
done

The script is similar to the PowerShell script listed above, but it resolves some issues and produces denser output. It can handle new and removed submodules.

To properly show log information for submodules that are no longer part of the superproject (removed submodules), at least the submodule root directory (which can be empty) has to remain in the working tree. Otherwise Git (tested with version 2.19.0 on Windows) fails in the log command (for example in git --git-dir ./.git/modules/libA log --oneline f374fbf^!), because it always changes the working directory to the submodule root directory (for whatever reason).
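The three line shapes the script distinguishes (modified, new and removed submodule) can also be classified in isolation. Here is a minimal sketch, assuming the `Submodule <name> <old>..<new>` layout quoted in the script's comments and a submodule name without spaces; `classify_submodule_line` is a hypothetical name, not part of the script above.

```shell
#!/usr/bin/env bash
# Sketch only: classify_submodule_line is a hypothetical helper.
# It takes one "Submodule ..." line as printed by `git log --submodule=log`
# and prints "<status> <name> <old-id> <new-id>", where status is
# added, removed or modified. It returns non-zero for any other line.
classify_submodule_line() {
  printf '%s\n' "$1" | awk '
    $1 == "Submodule" && $3 ~ /^[0-9a-f]+\.\.\.?[0-9a-f]+:?$/ {
      range = $3
      sub(/:$/, "", range)          # drop the trailing colon, if any
      split(range, ids, /\.+/)      # ids[1] = old commit, ids[2] = new commit
      status = "modified"
      if (ids[1] ~ /^0+$/)      status = "added"    # 0000000...<new>
      else if (ids[2] ~ /^0+$/) status = "removed"  # <old>...0000000
      print status, $2, ids[1], ids[2]
      found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Intended usage (not executed here):
#   classify_submodule_line "Submodule libA f374fbf..af648b2e:"
```

Keeping the classification in one small function avoids the `eval`-based argument splitting, at the cost of not supporting submodule paths that contain spaces.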

Upvotes: 2

Vynce

Reputation: 421

Here's a simple bash command that creates an ASCII commit graph (similar to gitk) that interleaves the relevant submodule commits when a submodule gets changed in the superproject. It prints the full patch for every commit, then uses grep to filter out the patch contents, leaving only the summary lines and submodule changes.

git log --graph --oneline -U0 --submodule Tag1..Tag2 | grep -E '^[*| /\\]+([0-9a-f]{7} |Submodule |> |$)'

It produces output similar to this:

* 854407e Update submodule
| Submodule SUB 8ebf7c8..521fc49:
|   > Commit C
* 99df57c Commit B
* 79e4075 Commit A
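The pattern above expects abbreviated hashes of exactly seven characters. Git may print longer abbreviations in larger repositories, and those summary lines would then be filtered out. A minimal sketch of the same filter that accepts 7 to 40 hex digits follows; the wrapper name `filter_graph_log` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: filter_graph_log is a hypothetical wrapper around the grep
# filter shown above. It reads `git log --graph --oneline -U0 --submodule`
# output on stdin and keeps commit summary lines, submodule headers,
# submodule commit lines ("> ...") and lines holding only graph characters.
filter_graph_log() {
  grep -E '^[*| /\\]+([0-9a-f]{7,40} |Submodule |> |$)'
}

# Intended usage (not executed here):
#   git log --graph --oneline -U0 --submodule Tag1..Tag2 | filter_graph_log
```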

Upvotes: 22

Thomas Levesque

Reputation: 292465

If you're working on Windows, you can use this PowerShell script:

function Parse-SubmoduleDiff($rawDiffLines) {
    $prefix = "Subproject commit "
    $oldCommitLine = $($rawDiffLines | where { $_.StartsWith("-" + $prefix) } | select -First 1)
    $newCommitLine = $($rawDiffLines | where { $_.StartsWith("+" + $prefix) } | select -First 1)

    if ($newCommitLine -eq $null) {
        return $null
    }

    $oldCommit = $null
    if ($oldCommitLine -ne $null) {
        $oldCommit = $oldCommitLine.Substring($prefix.Length + 1)
    }
    $newCommit = $newCommitLine.Substring($prefix.Length + 1)
    return @{ OldCommit = $oldCommit; NewCommit = $newCommit }
}

# Get the paths of all submodules
$submodulePaths = $(git submodule foreach --quiet 'echo $path')
if ($submodulePaths -eq $null) {
    $submodulePaths = @()
}

# Get the log of the parent repository (commit SHA1 followed by its parents' SHA1s)
$log = $(git log --format="%H %P" $args)
foreach ($line in $log) {

    $parts = $line.Split()
    $commit = $parts[0]
    $parents = $parts[1..$parts.Length]

    # Print the summary for this commit
    git show --format=medium --no-patch $commit
    echo ""

    # Can be more than one parent if it's a merge
    foreach ($parent in $parents) {
        # Skip empty entries (a root commit has no parent) before calling git diff
        if ([System.String]::IsNullOrWhiteSpace($parent)) {
            continue;
        }

        # List the paths that changed in this commit
        $changes = $(git diff --name-only $parent $commit)

        foreach ($path in $changes) {

            if ($submodulePaths.Contains($path)) {
                # if it's a submodule, the diff should look like this:
                # -Subproject commit 1486adc5c0c37ad3fa2f2e373e125f4000e4235f
                # +Subproject commit a208e767afd0a51c961654d3693893bbb4605902
                # from that we can extract the old and new submodule reference

                $subDiff = $(git diff $parent $commit -- $path)
                $parsed = Parse-SubmoduleDiff($subDiff)
                if ($parsed -eq $null) {
                    continue;
                }

                # Now get the log between the old and new submodule commit
                $oldCommit = $parsed.OldCommit
                $newCommit = $parsed.NewCommit
                echo "Submodule '$path'"
                if ($oldCommit -ne $null) {
                    $range = $($oldCommit + ".." + $newCommit)
                } else {
                    $range = $newCommit
                }

                git --git-dir $path/.git log $range | foreach { "  |  " + $_ }
                echo ""
            }
        }
    }
}

Obviously it can be translated to bash for use on Linux. The general principle is this:

for each commit in the parent repo
    print the commit summary
    for each submodule that has changed in this commit
        get the old and new commit hashes of the submodule
        print the log of the submodule between those commits
    end for
end for
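As a rough illustration of that principle in bash, here is a minimal sketch rather than a tested port of the PowerShell script. It assumes every changed submodule is checked out in the working tree, and the function names `submodule_range_from_diff` and `log_with_submodules` are made up for this example. It also takes a shortcut: instead of listing the submodules first, it treats any changed path whose diff contains a `+Subproject commit` line as a submodule.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical, not Git commands.

# Read the plain diff of one submodule path on stdin and print "<old>..<new>",
# or just "<new>" when there is no old commit (newly added submodule).
# Returns non-zero if no "+Subproject commit" line is found.
submodule_range_from_diff() {
  awk '
    /^-Subproject commit /  { old = $3 }
    /^\+Subproject commit / { new = $3 }
    END {
      if (new == "") exit 1
      range = (old == "") ? new : old ".." new
      print range
    }
  '
}

# Print the superproject log with submodule commits interleaved.
# All arguments are forwarded to `git log`, e.g. Tag1..Tag2.
log_with_submodules() {
  git log --format='%H %P' "$@" | while read -r commit parents; do
    # Print the summary for this commit
    git --no-pager show --no-patch --format=medium "$commit"
    echo
    # There can be more than one parent if it is a merge
    for parent in $parents; do
      git diff --name-only "$parent" "$commit" | while IFS= read -r path; do
        # --submodule=short forces the "Subproject commit" diff format
        range=$(git diff --submodule=short "$parent" "$commit" -- "$path" |
                submodule_range_from_diff) || continue
        echo "Submodule '$path'"
        git -C "$path" log --oneline "$range" | sed 's/^/  |  /'
        echo
      done
    done
  done
}

# Intended usage (not executed here):
#   log_with_submodules Tag1..Tag2
```

Passing `--submodule=short` explicitly keeps the parser independent of any `diff.submodule` setting in the user's configuration.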

Upvotes: 1
