Reputation: 6873
I am using PowerShell to gather some data from the Uninstall key of the registry and write it to XML, and everything works right up until what needs to be written includes some Simplified Chinese characters. When I look at the registry itself, the value of DisplayName is
Object Enabler for AutoCAD Plant 3D 2023 - 简体中文 (Simplified Chinese)
But when I use
Write-Host "$($uninstallKey.GetValue('DisplayName'))"
I get
Object Enabler for AutoCAD Plant 3D 2023 - 简体中文 (Simplified Chinese) EE- 88}
Not sure where that EE- 88} is coming from, and what else might be hiding there. At first, I thought my issue was with the encoding of the XML file when writing it. I had been using [System.Text.UTF8Encoding], which throws an error
Exception calling "Save" with "1" argument(s): "'.', hexadecimal value 0x00, is an invalid character."
But now I think the problem is elsewhere, since a Write-Host shows something different from what I see in the registry itself.
I am using
$localMachineHive = [Microsoft.Win32.RegistryKey]::OpenBaseKey([Microsoft.Win32.RegistryHive]::LocalMachine, 0)
$uninstallKey = $localMachineHive.OpenSubKey("$uninstallKeyPath\$uninstallKeyName")
to access the registry, where "$uninstallKeyPath\$uninstallKeyName" defines the key path (x64 or x32) to the individual key. I recently moved to this approach because it is much faster than PS native registry access. But perhaps there is an encoding nuance there that I am missing? Or is this a place where Write-Host is the problem?
EDIT: Verified that the mechanism for accessing the registry isn't the issue. Both of the following produce the same output, which doesn't match what I see in RegEdit.
$localMachineHive = [Microsoft.Win32.RegistryKey]::OpenBaseKey([Microsoft.Win32.RegistryHive]::LocalMachine, 0)
$uninstallKey = $localMachineHive.OpenSubKey("SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{BF3F377C-AF47-33EE-979F-67D4EFA9FAB0}")
Write-Host "$($uninstallKey.GetValue('DisplayName'))"
$displayName = Get-ItemPropertyValue -Path 'Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{BF3F377C-AF47-33EE-979F-67D4EFA9FAB0}' -Name DisplayName
Write-Host "$displayName"
Upvotes: 2
Views: 335
Reputation: 27786
This looks like an error in the data stored in the registry, probably a mismatch between the actual string length and the number of bytes passed in the cbData parameter of RegSetValueEx() (the native API for writing registry values).
If the program that wrote the registry value passed an argument for cbData that is too large, then it could actually store data beyond the actual string data in the registry (whatever happens to be in memory after the intended data, which could be just random "garbage" or, in the worst case, confidential data like passwords).
When PowerShell reads the registry value, it gets the null terminator character and any additional characters, which might appear as random characters. Note that RegEdit doesn't show these characters.
Remove all characters starting from the null terminator character up to the end of the string:
# Using a RegEx to remove the first null character and any following characters.
# (?s) lets . match line breaks too, in case the trailing garbage contains any.
$displayName -replace '(?s)\0.*'
# alternatively:
($displayName -split ([char] 0), 2)[0]
Trying to actually reproduce the problem, I've created a bogus C++ console application:
#include <windows.h>
struct Test {
wchar_t const user[7] = L"MyUser";
wchar_t const password[6] = L"MyPwd";
};
int main()
{
// Create or open a registry key
HKEY regKey = nullptr;
::RegCreateKeyExW( HKEY_CURRENT_USER, L"_TestKey", 0, nullptr, 0, KEY_READ | KEY_WRITE, nullptr, &regKey, nullptr );
// Attempt to write the string member data.user, but pass a value for cbData
// that is twice the number of actual bytes
Test data;
::RegSetValueExW( regKey, L"FooBar", 0, REG_SZ, reinterpret_cast<BYTE const*>( &data.user ), sizeof( data.user ) * 2 );
}
By passing twice the actual number of bytes for cbData, the code unintentionally writes the value of the password member after the intended value of the user member into the registry, separated by a null character.
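For contrast, here is a minimal sketch of how the byte count for a REG_SZ value is usually derived: the string length plus the terminator, multiplied by the size of one character. The helper name regSzByteCount is hypothetical (it is not part of the Windows API), and the sketch deliberately leaves out windows.h so it compiles anywhere; on Windows, its result is what you would pass as cbData.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper, not a Windows API function: number of bytes occupied by a
// null-terminated wide string, including the terminating null character.
std::size_t regSzByteCount( wchar_t const* text )
{
    return ( std::wcslen( text ) + 1 ) * sizeof( wchar_t );
}
```

Applied to the test program above, regSzByteCount( data.user ) would cover only the 7 characters of the user member (6 letters plus the terminator), so the password member would not be written.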
PowerShell code that reads the value:
$hkcu = [Microsoft.Win32.RegistryKey]::OpenBaseKey([Microsoft.Win32.RegistryHive]::CurrentUser, 0)
$regkey = $hkcu.OpenSubKey('_TestKey')
$regkey.GetValue('FooBar')
Output:
MyUserMyPwd
Note that the console output does not show the null character between "MyUser" and "MyPwd", but it is still part of the string: if you read the registry value into a variable, the null character will be there.
Out of curiosity, I wrote a script that lists all registry string values that contain embedded null characters (excluding REG_MULTI_SZ values, which may contain embedded null characters by design).
Example:
.\Get-RegStringsWithEmbeddedNull.ps1 -Hive LocalMachine -View Registry64 -EA Ignore
.\Get-RegStringsWithEmbeddedNull.ps1 -Hive LocalMachine -View Registry32 -EA Ignore
On my machine, this lists over 500 values! In many cases the difference in length between the stored string and the actual string (trimmed using -replace '\0.*') is only 1 character (so only an extra null is stored), which makes it especially hard for the unsuspecting developer to diagnose problems when working with such values, because PowerShell doesn't display embedded null characters. The simplest way to spot these off-by-1 errors is to check the Length property of the string.
Conclusion:
In general, it is a good idea to trim any registry string value of type REG_SZ or REG_EXPAND_SZ at the first null character. There might be cases where embedded null characters are actually intended, but these are rare and against the spec (the developer should have chosen REG_MULTI_SZ instead). Most cases seem to be caused by programmer errors. The C APIs are easy to use incorrectly: some expect you to pass the character count, others expect you to include the null terminator, and others require you to pass the buffer size (in characters or even in bytes), which might be larger than the actual string length.
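The same trimming rule can be applied on the native side as well. The following is only a sketch: trimAtFirstNull is a hypothetical helper name, and it assumes the raw value has already been read into a std::wstring whose length reflects the stored character count rather than the first terminator.

```cpp
#include <string>

// Hypothetical helper: keep only the characters before the first embedded null.
std::wstring trimAtFirstNull( std::wstring const& raw )
{
    // find() returns npos when there is no null character;
    // substr( 0, npos ) then returns the whole string unchanged.
    return raw.substr( 0, raw.find( L'\0' ) );
}
```

Comparing raw.size() with the size of the trimmed result also reveals the off-by-1 cases, where the only extra character stored is the null itself.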
Upvotes: 2