Reputation: 42984
In some code I use the Win32 RegGetValue() API to read a string from the registry.
I call the aforementioned API twice:
The purpose of the first call is to get the proper size to allocate a destination buffer for the string.
The second call reads the string from the registry into that buffer.
What is odd is that RegGetValue() returns different size values in the two calls.
In particular, the size returned by the second call is two bytes (one wchar_t) smaller than the size returned by the first call.
It's worth noting that the value returned by the second call is the one that matches the actual string length, including the terminating NUL.
But I don't understand why the first call returns a size two bytes (one wchar_t) bigger than that.
A screenshot of program output and Win32 C++ compilable repro code are attached.
Repro Source Code
#include <windows.h>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

void PrintSize(const char* const message, const DWORD sizeBytes)
{
    cout << message << ": " << sizeBytes << " bytes ("
         << (sizeBytes/sizeof(wchar_t)) << " wchar_t's)\n";
}

int main()
{
    const HKEY key = HKEY_LOCAL_MACHINE;
    const wchar_t* const subKey = L"SOFTWARE\\Microsoft\\Windows\\CurrentVersion";
    const wchar_t* const valueName = L"CommonFilesDir";

    //
    // Get string size
    //
    DWORD keyType = 0;
    DWORD dataSize = 0;
    const DWORD flags = RRF_RT_REG_SZ;
    LONG result = ::RegGetValue(
        key,
        subKey,
        valueName,
        flags,
        &keyType,
        nullptr,
        &dataSize);
    if (result != ERROR_SUCCESS)
    {
        cout << "Error: " << result << '\n';
        return 1;
    }
    PrintSize("1st call size", dataSize);
    const DWORD dataSize1 = dataSize; // store for later use

    //
    // Allocate buffer and read string into it
    //
    vector<wchar_t> buffer(dataSize / sizeof(wchar_t));
    result = ::RegGetValue(
        key,
        subKey,
        valueName,
        flags,
        nullptr,
        &buffer[0],
        &dataSize);
    if (result != ERROR_SUCCESS)
    {
        cout << "Error: " << result << '\n';
        return 1;
    }
    PrintSize("2nd call size", dataSize);

    const wstring text(buffer.data());
    cout << "Read string:\n";
    wcout << text << '\n';
    wcout << wstring(dataSize/sizeof(wchar_t), L'*') << " <-- 2nd call size\n";
    wcout << wstring(dataSize1/sizeof(wchar_t), L'-') << " <-- 1st call size\n";
}
Operating System: Windows 7 64-bit with SP1
EDIT
Some confusion seems to have arisen from the particular registry key I happened to read in the sample repro code.
So, let me clarify that I read that key from the registry just as a test. This is not production code, and I'm not interested in that particular key. Feel free to add a simple test key to the registry with some test string value.
Sorry for the confusion.
Upvotes: 6
Views: 2394
Reputation: 42984
This blog post (published on February 14th, 2024) clarifies the issue:
There are a number of functions in Windows that are part of a three-phase operation:
- Request the size of a buffer needed to receive some data.
- Allocate a buffer of that size.
- Call the function again with that buffer.
When you ask for the required size of a buffer, it is not uncommon for the function to return a value that is larger than the actual value you get from step 3, when you ask for the data to be placed in the buffer.
[…] Given that the caller has to be prepared for the size to change anyway, the “how big of a buffer do I need” call can return an over-estimate of the required size, since that will allow the second call for the data to succeed (assuming the data hasn’t changed). And giving an over-estimate is often much easier than giving an exact value.
I think the official MSDN documentation should be updated with that information.
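For illustration, here is a minimal sketch of the three-phase pattern from the quote, written so that it tolerates both an over-estimate from the size query and a value that grows between calls. To keep it portable, the registry call is hidden behind a hypothetical adapter type (QueryFn) that mimics only the last two parameters of RegGetValueW; the names QueryFn, ReadStringValue, kSuccess and kMoreData are mine, not part of any API. char16_t stands in for the 2-byte wchar_t used on Windows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Numeric values of the Win32 codes ERROR_SUCCESS and ERROR_MORE_DATA.
constexpr long kSuccess  = 0;
constexpr long kMoreData = 234;

// Hypothetical adapter (an assumption for this sketch).
// - buffer == nullptr: size query, stores a (possibly over-estimated) size.
// - buffer != nullptr: *sizeBytes is the capacity on input; on success it
//   becomes the number of bytes written, on kMoreData the required size.
using QueryFn = std::function<long(char16_t* buffer, std::uint32_t* sizeBytes)>;

std::u16string ReadStringValue(const QueryFn& query)
{
    assert(query && "query adapter must be callable");

    // Phase 1: ask how large the buffer should be.
    std::uint32_t sizeBytes = 0;
    long rc = query(nullptr, &sizeBytes);
    if (rc != kSuccess)
        throw std::runtime_error("size query failed");

    std::vector<char16_t> buffer;
    for (int attempt = 0; attempt < 8; ++attempt)
    {
        // Phase 2: allocate (round up, and never pass an empty buffer).
        const std::size_t chars =
            (sizeBytes + sizeof(char16_t) - 1) / sizeof(char16_t);
        buffer.resize(std::max<std::size_t>(chars, 1));
        sizeBytes = static_cast<std::uint32_t>(buffer.size() * sizeof(char16_t));

        // Phase 3: read. If the value grew in the meantime, retry with the
        // size reported by the failed read.
        rc = query(buffer.data(), &sizeBytes);
        if (rc != kMoreData)
            break;
    }
    if (rc != kSuccess)
        throw std::runtime_error("read failed");

    // Trust the size from the read, not the size from the query.
    std::size_t length =
        std::min<std::size_t>(sizeBytes / sizeof(char16_t), buffer.size());
    if (length > 0 && buffer[length - 1] == u'\0')
        --length;  // drop the terminator
    return std::u16string(buffer.data(), length);
}
```

On Windows, the adapter could be a lambda that copies *sizeBytes into a local DWORD, forwards to ::RegGetValueW(key, subKey, valueName, RRF_RT_REG_SZ, nullptr, buffer, &localSize), and copies the DWORD back. The point of the design is that the final string length comes from the third phase only, so a two-byte over-estimate in the first phase is harmless.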
Upvotes: 0
Reputation: 597061
RegGetValue() is safer than RegQueryValueEx() because it artificially adds a null terminator to the output of a string value if it does not already have a null terminator.
The first call returns the data size plus room for an extra null terminator in case the actual data is not already null terminated. I suspect RegGetValue() does not look at the real data at this stage; it just does an unconditional data size + sizeof(wchar_t) to be safe.
(36 * sizeof(wchar_t)) + (1 * sizeof(wchar_t)) = 74
The second call returns the real size of the actual data that was read. That size would include the extra null terminator only if one had to be artificially added. In this case, your data has 35 characters in the path plus a real null terminator (which well-behaved apps are supposed to store), so no extra null terminator needed to be added.
((35+1) * sizeof(wchar_t)) + (0 * sizeof(wchar_t)) = 72
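The two sums above can be written out as a small, hedged model. It encodes only the behavior suspected in this answer (the size query adds one character unconditionally, the read adds one only when the stored data lacks a terminator); it is not taken from the documentation. The names RegChar, SizeQueryEstimate, BytesAfterRead and CharCount are hypothetical, and char16_t stands in for the 2-byte wchar_t on Windows so the numbers match on any platform.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for Windows' 2-byte wchar_t (assumption for portability).
using RegChar = char16_t;

// Size reported by the first call, under the assumption above:
// stored bytes plus room for one extra terminator, added unconditionally.
inline std::size_t SizeQueryEstimate(std::size_t storedBytes)
{
    return storedBytes + sizeof(RegChar);
}

// Size reported by the second call: a terminator is appended only if the
// stored data did not already end with one.
inline std::size_t BytesAfterRead(std::size_t storedBytes, bool storedDataIsTerminated)
{
    return storedDataIsTerminated ? storedBytes : storedBytes + sizeof(RegChar);
}

// Characters in the string, excluding the terminator, given the size
// reported by the second call.
inline std::size_t CharCount(std::size_t bytesAfterRead)
{
    assert(bytesAfterRead % sizeof(RegChar) == 0);
    if (bytesAfterRead < sizeof(RegChar))
        return 0;
    return bytesAfterRead / sizeof(RegChar) - 1;
}
```

With the 72 stored bytes from the question (35 characters plus a terminator), SizeQueryEstimate gives 74 and BytesAfterRead gives 72, which is the two-byte difference observed.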
Now, with that said, you really should not be reading from the Registry directly to get the CommonFilesDir path (or any other system path) in the first place. You should be using SHGetFolderPath(CSIDL_PROGRAM_FILES_COMMON) or SHGetKnownFolderPath(FOLDERID_ProgramFilesCommon) instead. Let the Shell deal with the Registry for you. That approach is consistent across Windows versions, whereas Registry settings are subject to being moved around from one version to another, and it also accounts for per-user paths vs system-global paths. These are the main reasons why the CSIDL API was introduced in the first place.
Upvotes: 11