Reputation: 7030
In git, I see that when using custom formatting for git log, there are "reflog identity" values possible. What is a reflog identity?
Upvotes: 3
Views: 3138
Reputation: 1323963
Note the git reflog identity/entry is about to change (2020, 4 years later).
Git 2.29 (Q4 2020) brings a preliminary clean-up of the refs API in preparation for adding a new refs backend, "reftable".
See commit 523fa69 (10 Jul 2020) by Junio C Hamano (gitster).
See commit de966e3, commit ce57d85 (10 Jul 2020), and commit 9e35a6a (30 Jun 2020) by Han-Wen Nienhuys (hanwen).
(Merged by Junio C Hamano -- gitster -- in commit 3161cc6, 30 Jul 2020)
reflog: cleanse messages in the refs.c layer
Signed-off-by: Han-Wen Nienhuys
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The file format used by the files backend may add a LF after the message as a delimiter, and output by commands like "git log -g" may complete such an incomplete line by adding a LF at the end, but philosophically, the terminating LF is not a part of the message.
- We however allow callers of refs API to supply a random sequence of NUL terminated bytes. We cleanse caller-supplied message by squashing a run of whitespaces into a SP, and by trimming trailing whitespace, before storing the message. This is how we tolerate, instead of erring out, a message with LF in it (be it at the end, in the middle, or both).
Currently, the cleansing of the reflog message is done by the files backend, before the log is written out.
This is sufficient with the current code, as that is the only backend that writes reflogs.
But new backends can be added that write reflogs, and we'd want the resulting log message we would read out of "log -g" the same no matter what backend is used, and moving the code to do so to the generic layer is a way to do so.
An added benefit is that the "cleansing" function could be updated later, independent from individual backends, to e.g. allow multi-line log messages if we wanted to, and when that happens, it would help a lot to ensure we covered all bases if the cleansing function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog messages right at the moment (nobody is asking for it), but I envision that instead of the "squash a run of whitespaces into a SP and rtrim" cleansing, we can %urlencode problematic bytes in the message AND append a SP at the end, when a new version of Git that supports multi-line and/or verbatim reflog messages writes a reflog record.
The reading side can detect the presence of SP at the end (which should have been rtrimmed out if it were written by existing versions of Git) as a signal that decoding %urlencode recovers the original reflog message.
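The cleansing rule quoted in this commit message (squash each run of whitespace into a single SP, then trim trailing whitespace) can be sketched in a few lines of Python. This is only an illustration of the rule as worded, not Git's C implementation: the function name cleanse_reflog_message is hypothetical, and the handling of leading whitespace is an assumption, since the commit message does not mention it.

```python
import re

# The six ASCII characters that C's isspace() accepts in the default locale.
_WHITESPACE_RUN = re.compile(r"[ \t\n\v\f\r]+")


def cleanse_reflog_message(message: str) -> str:
    """Squash whitespace runs into one space and trim trailing whitespace.

    Hypothetical helper: a sketch of the rule described in the commit
    message, not a port of Git's code.
    """
    squashed = _WHITESPACE_RUN.sub(" ", message)
    # Assumption: a leading run is kept as a single space, because the
    # quoted rule only talks about *trailing* whitespace.
    return squashed.rstrip(" ")


print(repr(cleanse_reflog_message("commit: fix\n\nthe   bug \n")))
```

With this rule, a message containing LF in the middle or at the end still ends up as a single line, which is the tolerance the commit message describes.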
With Git 2.29 (Q4 2020), git reflog manages the \t for its entries.
See commit 25429fe (31 Jul 2020) by Han-Wen Nienhuys (hanwen).
(Merged by Junio C Hamano -- gitster -- in commit dc3c6fb, 01 Aug 2020)
refs: move the logic to add \t to reflog to the files backend
Signed-off-by: Han-Wen Nienhuys
523fa69c ("reflog: cleanse messages in the refs.c layer", 2020-07-10, Git v2.29.0 -- merge) centralized reflog normalization.
However, the normalization added a leading "\t" to the message.
This is an artifact of the reflog storage format in the files backend, so it should be added there.
Routines that parse back the reflog (such as grab_nth_branch_switch) expect the "\t" to not be in the message, so without this fix, git reflog with reftable cannot process the "@{-1}" syntax.
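To see why a stray leading "\t" matters for "@{-1}", here is a minimal Python sketch of the kind of parsing involved. It is not the grab_nth_branch_switch code itself; the function name nth_previous_branch is hypothetical, and the sketch only assumes that branch switches are recorded with messages of the form "checkout: moving from <old> to <new>".

```python
from typing import Optional, Sequence

CHECKOUT_PREFIX = "checkout: moving from "


def nth_previous_branch(messages: Sequence[str], n: int) -> Optional[str]:
    """Return the branch left by the n-th most recent checkout, or None.

    `messages` holds the reflog messages of HEAD, newest first.
    Hypothetical helper, written only to illustrate the parsing problem.
    """
    seen = 0
    for message in messages:
        if not message.startswith(CHECKOUT_PREFIX):
            continue  # not a branch switch, or the prefix is obscured
        rest = message[len(CHECKOUT_PREFIX):]
        source, separator, _target = rest.partition(" to ")
        if not separator:
            continue  # malformed checkout message
        seen += 1
        if seen == n:
            return source
    return None


head_messages = [
    "commit: initial torturetest code",
    "checkout: moving from master to torturetest",
    "checkout: moving from torturetest to master",
]
print(nth_previous_branch(head_messages, 1))
# The same messages with a storage-format "\t" left in front no longer match.
print(nth_previous_branch(["\t" + m for m in head_messages], 1))
```

The first call finds "master"; the second finds nothing, because the prefix test fails once the tab is part of the message.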
Git 2.46 (Q3 2024), batch 9 adds more detail about that new reftable backend where reflog identity is written: the knobs to tweak how reftable files are written have been made available as configuration variables.
See commit f518d91, commit f663d34, commit afbdbfa, commit 90db611, commit 8e9e136, commit 831b366, commit fcf3418, commit c22d75b, commit e0cf3d8, commit 7992378, commit 4d35bb2 (13 May 2024) by Patrick Steinhardt (pks-t).
(Merged by Junio C Hamano -- gitster -- in commit 23528d3, 30 May 2024)
refs/reftable: allow disabling writing the object index
Signed-off-by: Patrick Steinhardt
Besides the expected "ref" and "log" records, the reftable library also writes "obj" records.
These are basically a reverse mapping of object IDs to their respective ref records so that it becomes efficient to figure out which references point to a specific object.
The motivation for this data structure is the "uploadpack.allowTipSHA1InWant" config, which allows a client to fetch any object by its hash that has a ref pointing to it.
This reverse index is not used by Git at all though, and the expectation is that most hosters nowadays use "uploadpack.allowAnySHA1InWant".
It may thus be preferable for many users to disable writing these optional object indices altogether to save some precious disk space.
Add a new config "reftable.indexObjects" that allows the user to disable the object index altogether.
git config now includes in its man page:
reftable.blockSize
The size in bytes used by the reftable backend when writing blocks.
The block size is determined by the writer, and does not have to be a power of 2. The block size must be larger than the longest reference name or log entry used in the repository, as references cannot span blocks.
Powers of two that are friendly to the virtual memory system or filesystem (such as 4kB or 8kB) are recommended. Larger sizes (64kB) can yield better compression, with a possible increased cost incurred by readers during access.
The largest block size is 16777215 bytes (15.99 MiB). The default value is 4096 bytes (4kB). A value of 0 will use the default value.
reftable.restartInterval
The interval at which to create restart points. The reftable backend determines the restart points at file creation. Every 16 may be more suitable for smaller block sizes (4k or 8k), every 64 for larger block sizes (64k).
More frequent restart points reduces prefix compression and increases space consumed by the restart table, both of which increase file size.
Less frequent restart points makes prefix compression more effective, decreasing overall file size, with increased penalties for readers walking through more records after the binary search step.
A maximum of 65535 restart points per block is supported.
The default value is to create restart points every 16 records. A value of 0 will use the default value.
reftable.indexObjects
Whether the reftable backend shall write object blocks. Object blocks are a reverse mapping of object ID to the references pointing to them.
The default value is true.
reftable.geometricFactor
Whenever the reftable backend appends a new table to the stack, it performs auto compaction to ensure that there is only a handful of tables.
The backend does this by ensuring that tables form a geometric sequence regarding the respective sizes of each table.
By default, the geometric sequence uses a factor of 2, meaning that for any table, the next-biggest table must at least be twice as big. A maximum factor of 256 is supported.
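The geometric-sequence condition in the reftable.geometricFactor description can be checked with a short Python sketch. It covers only the invariant as worded in the man page (each table at least "factor" times as big as the next-smaller one); it does not model how the reftable library picks tables to compact, and the function name is_geometric is hypothetical.

```python
from typing import Iterable


def is_geometric(table_sizes: Iterable[int], factor: int = 2) -> bool:
    """Check that each table is at least `factor` times the next-smaller one.

    Hypothetical helper illustrating the documented invariant only.
    """
    ordered = sorted(table_sizes)
    return all(
        bigger >= factor * smaller
        for smaller, bigger in zip(ordered, ordered[1:])
    )


print(is_geometric([100, 250, 600]))  # every step at least doubles
print(is_geometric([100, 150, 600]))  # 150 is less than 2 * 100
```

Under this reading, the second stack violates the invariant, which is the situation in which one would expect auto compaction to merge tables.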
Upvotes: 0
Reputation: 488103
These refer to reflog entries. A reflog is simply a record of updates to a reference, and a reference itself is simply a generalization of branch and tag names and special names like HEAD.
Reflogs are normally enabled on client repositories (like yours) and normally disabled on server repositories. This is, naturally enough, configurable. The front end command people mostly use for looking at their reflogs is git reflog. You can run that now if you like, but doing so won't help explain %ge and so on. So we'll do something different: Run git log -g.
Running git reflog basically runs git log --oneline -g. By running git log -g yourself, you can leave out the --oneline, and hence see more than one line for each reflog entry.
The output will resemble the following, with names and email addresses changed:
commit 08b876daae9944d1a6fba271cfcd9629c13dfd69
Reflog: HEAD@{0} (A U Thor <[email protected]>)
Reflog message: commit: initial torturetest code
Author: A U Thor <[email protected]>
Date: Sun Aug 7 01:59:31 2016 -0700
initial torturetest code
commit 8bb118938b5c6a2978f13e74525b594a48226571
Reflog: HEAD@{1} (A U Thor <[email protected]>)
Reflog message: checkout: moving from master to torturetest
Author: Someone Else <[email protected]>
Date: Sat Jul 16 02:00:46 2016 +0200
Allow backend ...
The most recent commit is one I made last night (well, this morning). This is HEAD@{0}. It represents some commit (whose true name is the big ugly SHA-1 hash starting with 08b87...). The commit itself has an author (me, though I changed the name here for display purposes), date, commit message, and so on, but the reflog entry, HEAD@{0}, also has an author (me again), date, and message.
In this case, the commit's author and the reflog author are the same. Even the reflog message is basically the same as the commit subject (the Reflog message: line just has the word commit: inserted). So that's not much help, but take a look at the very next example (commit 8bb11...).
This reflog entry has me as the reflog author, and someone else as the commit author.[1] Moreover, the reflog message, checkout: moving from master to torturetest, is completely unrelated to the commit's subject line, which begins with Allow backend.
If you compare this to the short output from git log -g --oneline or git reflog (both of these examine the reflog for HEAD), you'll see only the reflog message, along with the commit ID and the reflog selector.
One other thing is worth noting here. In regular git log output, each commit normally[2] appears only once. In git log -g output, however, a commit can appear repeatedly, because Git is looking at the hash IDs stored in the reflog itself. If you switch back and forth between branches that point to the same commit, or use git reset to change a branch to point back to a commit it pointed-to earlier, or run git rebase, or do any number of similar things, you can easily get a reference (this applies to both HEAD and branch names) that points to the same commit in multiple different reflog entries.
In my case, for instance, I apparently vacillated a bit on the name torturetest or something:
08b876d HEAD@{0}: commit: initial torturetest code
8bb1189 HEAD@{1}: checkout: moving from master to torturetest
8bb1189 HEAD@{2}: checkout: moving from torturetest to master
(I'm not really sure what this was about—perhaps just running too many Git commands without remembering which repository I was in. :-) )
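As a small illustration of a commit showing up in several reflog entries, the following Python sketch splits the three one-line entries quoted above and counts how often each abbreviated hash occurs. It assumes lines shaped exactly like those (hash, selector, colon, message, with no decorations in between); the names OnelineEntry and parse_oneline are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class OnelineEntry(NamedTuple):
    commit: str
    selector: str
    message: str


def parse_oneline(line: str) -> OnelineEntry:
    """Split '<hash> <selector>: <message>' into its three parts."""
    commit, rest = line.split(" ", 1)
    selector, message = rest.split(": ", 1)
    return OnelineEntry(commit, selector, message)


reflog_output = """\
08b876d HEAD@{0}: commit: initial torturetest code
8bb1189 HEAD@{1}: checkout: moving from master to torturetest
8bb1189 HEAD@{2}: checkout: moving from torturetest to master
"""

entries = [parse_oneline(line) for line in reflog_output.splitlines()]
counts = Counter(entry.commit for entry in entries)
repeated = {commit: n for commit, n in counts.items() if n > 1}
print(repeated)
```

Here 8bb1189 is reported twice, once for each checkout that left HEAD pointing at it.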
Returning directly to your question:
What is a reflog identity?
These are the names and email addresses stored in each reflog entry. In the case of a private Git repository, on your own client, these are likely to all be the same all the time. But since you can run git config --global user.name "New User Name" and git config --global user.email new@address any time to change them,[3] they could vary.
[1] That someone else is also the committer, if you get to wondering. The commit's author and committer, and their corresponding dates and email addresses, are stored in the commit itself. The reflog author, date, and email address are stored in the reflog entry. It's actually a plain text file today, so you can just look at .git/logs/HEAD and .git/logs/refs/heads/master to see the raw reflog data. The format is not particularly well documented, but is pretty obvious: it has the old and new values for the reference; the reflog's author, email, and date-stamp information; and the reflog message.
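The raw line layout described in this footnote can be sketched as a small Python parser. It assumes the files-backend layout of old ID, new ID, name, email in angle brackets, Unix timestamp, and timezone offset, followed by a tab and the message. The names ReflogEntry and parse_reflog_line are hypothetical, and the sample line is made up for illustration (the email address and timestamp in particular are placeholders). The name and email fields are the "reflog identity" that the %gn and %ge format placeholders report.

```python
import re
from typing import NamedTuple, Optional


class ReflogEntry(NamedTuple):
    old_id: str
    new_id: str
    name: str       # reflog identity name
    email: str      # reflog identity email
    timestamp: int  # seconds since the Unix epoch
    tz_offset: str
    message: str


_LINE = re.compile(
    r"^(?P<old>[0-9a-f]+) (?P<new>[0-9a-f]+) "
    r"(?P<name>.*) <(?P<email>[^>]*)> "
    r"(?P<ts>\d+) (?P<tz>[+-]\d{4})"
    r"(?:\t(?P<msg>.*))?$"
)


def parse_reflog_line(line: str) -> Optional[ReflogEntry]:
    """Parse one raw reflog line, or return None if it does not match."""
    match = _LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return ReflogEntry(
        match["old"], match["new"], match["name"], match["email"],
        int(match["ts"]), match["tz"], match["msg"] or "",
    )


# Illustrative line only; the identity and timestamp are placeholders.
sample = (
    "8bb118938b5c6a2978f13e74525b594a48226571 "
    "08b876daae9944d1a6fba271cfcd9629c13dfd69 "
    "A U Thor <author@example.com> 1470560371 -0700"
    "\tcommit: initial torturetest code\n"
)
entry = parse_reflog_line(sample)
print(entry.name, entry.email, entry.message, sep=" | ")
```

The tab before the message is the same storage-format delimiter discussed in the Git 2.29 changes: it belongs to the file layout, not to the message.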
[2] The exception here, beside the one for reflogs themselves of course, occurs when using git log -m -p to split merge commits. Normally git log -p shows no diff at all for merge commits, while git show shows combined diffs for them. (The documentation on combined diffs is somewhat buried; search here on StackOverflow for the term "combined diffs".)
If you convince git log to show a diff, it too can show a combined diff. In all cases, combined diffs may omit crucial information, so you can tell these commands to do something different: for each parent of a merge, produce a diff of the merge commit's tree against that particular parent's tree. This is what the -m flag does.
When showing a diff of merge commit 1234567... against parent #1, Git shows you the merge commit information, then the diff. Then, when showing a diff of merge commit 1234567... against parent #2, Git shows you the merge commit information again, before the second diff. So this is how git log can show the same commit more than once.
[3] You can also use git -c user.name=whatever and git -c user.email=whatever, or in this particular case, special Git environment variables. Using git -c is especially convenient for one-off tests, as in the answer I wrote recently about Git diff color options.
Upvotes: 4
Reputation: 142064
git reflog is another command.
Every time HEAD is changed, git stores its old value in the .git/logs folder, and you can view it via the git reflog command.
The meaning of the "reflog identity" is simply:
Each commit will be grouped by author and title
Upvotes: 0