Reputation: 4021
Is there a way to configure GitVersion to use shortened (say, 6 character long) hashes for its version numbering?
I.e.,
1.2.3-unstable645 Branch:'develop' Sha:'a682956dccae752aa24597a0f5cd939f93614509'
Becomes
1.2.3-unstable645 Branch:'develop' Sha:'a68295'
Entropy should mean that the additional characters (past, say, 6, which already gives about 1.6 × 10^7 permutations) provide no significant extra identification, while dropping them makes the version a little shorter if it needs to be displayed.
Upvotes: 4
Views: 2847
Reputation: 6659
Entropy should mean that the additional characters (past, say, 6, which already gives about 1.6 × 10^7 permutations) provide no significant extra identification, while dropping them makes the version a little shorter if it needs to be displayed.
Not sure about your math, but it is common for developers to use just the first six, eight or twelve digits of the Git hash as an identifier. For any given smallish repo, a collision would be fairly unlikely, but definitely possible. I have seen tooling that uses the shorter version for display purposes only, while internally it uses the full 40-character hash value.
If you are constructing a SemVer string, you could embed the branch name and commit hash in either the shortened or full form:
1.2.3-unstable645+develop.a68295
My preference has always been to use the full hash:
1.2.3-unstable645+develop.a682956dccae752aa24597a0f5cd939f93614509
I have also seen schemes that use the first six and last six digits:
1.2.3-unstable645+develop.a68295-614509
Getting back to your math... SHA-1 emits a 160-bit (20-byte) hash that takes 40 characters to display in full in hex. The algorithm is reasonably good at distributing bit changes across all 160 bits for each change in input. Using only 6 characters from the hash means you're only getting 3 bytes (24 bits) worth of hash, so:
2^160 ~= 1.461502e+48
2^24 ~= 1.677722e+7
That's a rather large increase in probability of collision.
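To put a rough number on that, here is a minimal sketch using the standard birthday approximation. It assumes hash prefixes are uniformly distributed; the function name `collision_probability` and the commit counts are illustrative, not anything GitVersion or Git provides.

```python
import math

def collision_probability(num_commits: int, hex_digits: int) -> float:
    """Birthday-bound estimate that at least two of num_commits
    hashes share the same prefix of hex_digits hex characters."""
    space = 16 ** hex_digits  # number of distinct prefixes
    exponent = -num_commits * (num_commits - 1) / (2 * space)
    # 1 - exp(x), computed via expm1 to stay accurate for tiny probabilities
    return -math.expm1(exponent)

# Hypothetical repository with 10,000 commits
for digits in (6, 8, 12, 40):
    print(digits, collision_probability(10_000, digits))
```

With only six digits, a repository of around ten thousand commits is already more likely than not to contain at least one pair of colliding prefixes under this estimate.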
How many digits do you really need? It turns out that number depends on the commit history in your repository. Git commands have a feature that allows you to specify the shortest unique prefix of a hash rather than the entire hash. With a single commit in your history, a single digit could be enough, but as commits pile up, that number invariably increases. Some high-activity repos (the Linux kernel, for instance) require at least 11 digits to uniquely identify every commit they contain, and that number will continue to increase over time.
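As a sketch of the idea, the following computes how many leading digits are needed to tell a set of full hashes apart. It assumes the input is a list of distinct, equal-length hex strings; the function name and the third sample hash are made up for illustration.

```python
def min_unique_prefix_length(hashes):
    """Smallest prefix length that distinguishes every hash in the list.

    Assumes full-length, distinct hashes. After sorting, the longest
    shared prefix between any two hashes occurs between neighbours.
    """
    ordered = sorted(set(hashes))
    needed = 1
    for left, right in zip(ordered, ordered[1:]):
        shared = 0
        for a, b in zip(left, right):
            if a != b:
                break
            shared += 1
        needed = max(needed, shared + 1)
    return needed

sample = [
    "a682956dccae752aa24597a0f5cd939f93614509",
    "57e16a787815c5e27c3a0edbf5224b3df64f1a69",
    "a68295" + "f" * 34,  # hypothetical hash sharing the first six digits
]
print(min_unique_prefix_length(sample))
```

In a real repository, `git rev-parse --short HEAD` asks Git itself for an abbreviated hash that is unique at the time the command runs.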
The primary takeaway from all this is that there are repositories out there that will have hash collisions in the first N digits of their hash, where N < 12. Some of them contain thousands of collisions in the first six digits! Your mileage will vary.
Upvotes: 1
Reputation: 18951
GitVersion emits many different parts of the asserted version number, each of which can be used to build the version string you need. However, a shortened sha is not one of them. Here are all the variables that are currently asserted:
{
"Major":0,
"Minor":21,
"Patch":0,
"PreReleaseTag":"",
"PreReleaseTagWithDash":"",
"PreReleaseLabel":"",
"PreReleaseNumber":"",
"BuildMetaData":"",
"BuildMetaDataPadded":"",
"FullBuildMetaData":"Branch.hotfix/0.21.1.Sha.57e16a787815c5e27c3a0edbf5224b3df64f1a69",
"MajorMinorPatch":"0.21.0",
"SemVer":"0.21.0",
"LegacySemVer":"0.21.0",
"LegacySemVerPadded":"0.21.0",
"AssemblySemVer":"0.21.0.0",
"FullSemVer":"0.21.0",
"InformationalVersion":"0.21.0+Branch.hotfix/0.21.1.Sha.57e16a787815c5e27c3a0edbf5224b3df64f1a69",
"BranchName":"hotfix/0.21.1",
"Sha":"57e16a787815c5e27c3a0edbf5224b3df64f1a69",
"NuGetVersionV2":"0.21.0",
"NuGetVersion":"0.21.0",
"CommitsSinceVersionSource":0,
"CommitsSinceVersionSourcePadded":"0000",
"CommitDate":"2017-07-14"
}
Assuming you are using some form of build script, you could manually shorten the asserted sha, and then combine it with the other required variables to get the desired version number.
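A minimal sketch of that approach, assuming the build script has already captured GitVersion's JSON output as a string. The helper name `short_version`, the six-character default, and the `SemVer+sha` layout are illustrative choices, not something GitVersion defines.

```python
import json

def short_version(variables: dict, length: int = 6) -> str:
    """Combine the SemVer variable with a truncated Sha as build metadata."""
    short_sha = variables["Sha"][:length]
    return f'{variables["SemVer"]}+{short_sha}'

# Excerpt of the JSON shown above; a real script would read the full output.
gitversion_output = """
{
  "SemVer": "0.21.0",
  "BranchName": "hotfix/0.21.1",
  "Sha": "57e16a787815c5e27c3a0edbf5224b3df64f1a69"
}
"""

variables = json.loads(gitversion_output)
print(short_version(variables))  # 0.21.0+57e16a
```

The branch name is left out here because it can contain characters such as `/` that are not allowed in SemVer build metadata; it would need sanitising before being embedded.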
Upvotes: 0