Reputation: 10507
I'm trying to use modern string-handling approaches (like std::string_view
or GSL's string_span
) to interact with a C API (DBus) that takes strings as null-terminated const char*
s, e.g.
DBusMessage* dbus_message_new_method_call(
const char* destination,
const char* path,
const char* iface,
const char* method
)
string_view
and string_span
don't guarantee that their contents are null-terminated - since spans are (char* start, ptrdiff_t length)
pairs, that's largely the point. But GSL also provides a zstring_span
, which is guaranteed to be null-terminated. The comments around zstring_span
suggest that it's designed exactly for working with legacy and C APIs, but I ran into several sticking points as soon as I started using it:
Representing a string literal as a string_span
is trivial:
cstring_span<> bar = "easy peasy";
but representing one as a zstring_span
requires you to wrap the literal in a helper function:
czstring_span<> foo = ensure_z("odd");
This makes declarations noisier, and it also seems odd that a literal (which is guaranteed to be null-terminated) isn't implicitly convertible to a zstring_span
. ensure_z()
also isn't constexpr
, unlike constructors and conversions for string_span
.
There's a similar oddity with std::string
, which is implicitly convertible to string_span
, but not zstring_span
, even though std::string::data()
has been guaranteed to return a null-terminated sequence since C++11. Again, you have to call ensure_z()
:
zstring_span<> to_zspan(std::string& s) { return ensure_z(s); }
There seem to be some const-correctness issues. The above works, but
czstring_span<> to_czspan(const std::string& s) { return ensure_z(s); }
fails to compile, with errors about being unable to convert from span<char, ...>
to span<const char, ...>
This is a smaller point than the others, but the member function that returns a char*
(which you would feed to a C API like DBus) is called assume_z()
. What's being assumed when the constructor of zstring_span
expects a null-terminated range?
If zstring_span
is designed "to convert zero-terminated spans to legacy strings", why does its use here seem so cumbersome? Am I misusing it? Is there something I'm overlooking?
Upvotes: 2
Views: 1669
Reputation: 16771
- it also seems odd that a literal (which is guaranteed to be null-terminated) isn't implicitly convertible to a
zstring_span
A string literal is of type const char[...]
. There is no information in the type that this const char
array is a null-terminated string. Here is some other code with the same type but without null termination, where ensure_z
will fail fast.
const char foo_arr[4]{ 'o', 'd', 'd', '-' };
ensure_z(foo_arr);
Both "foo"
and foo_arr
are of type const char[4]
, but only the string literal is null terminated while foo_arr
is not.
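To make this concrete, here is a small sketch in standard C++17 (no GSL involved; the helper name ends_with_nul is made up for illustration). It shows that both objects have the same array type and that only a check on the value can tell them apart.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: reports whether the last element of a char array is '\0'.
// It has to inspect the value, because the type carries no such information.
template <std::size_t N>
constexpr bool ends_with_nul(const char (&arr)[N]) {
    return arr[N - 1] == '\0';
}

constexpr char foo_arr[4]{'o', 'd', 'd', '-'};

// Both expressions are lvalues of type const char[4].
static_assert(std::is_same_v<decltype("odd"), const char (&)[4]>);
static_assert(std::is_same_v<decltype((foo_arr)), const char (&)[4]>);

static_assert(ends_with_nul("odd"));     // the literal carries a terminator
static_assert(!ends_with_nul(foo_arr));  // the hand-built array does not
```

Any implicit conversion from const char[N] would therefore have to accept foo_arr as well, and could only reject it by checking the contents.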
Please note that your combination of ensure_z
and czstring_span<>
compiles, but it does not work. ensure_z
returns only the string without the terminating null byte. When you pass that to the czstring_span<>
constructor, the constructor will fail when searching for the null byte (which was cut off by ensure_z
).
You need to convert the string literal to a span and pass that to the constructor:
czstring_span<> foo = ensure_span("odd");
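The difference between the two ranges can be modelled with std::string_view alone. This is only a sketch of the idea, not GSL's actual code: up_to_nul, whole_array and contains_terminator are hypothetical names, and the "last element is '\0'" check is an assumption about what a null-terminated span would require.

```cpp
#include <cstddef>
#include <string_view>

// Models a "scan up to the terminator" helper: the view stops before the
// '\0', so the terminator is not part of the range.
template <std::size_t N>
constexpr std::string_view up_to_nul(const char (&arr)[N]) {
    return std::string_view(arr);  // length found by searching for '\0'
}

// Models a "whole array" span: all N elements, terminator included.
template <std::size_t N>
constexpr std::string_view whole_array(const char (&arr)[N]) {
    return std::string_view(arr, N);
}

// Models a precondition that the range itself ends in '\0'.
constexpr bool contains_terminator(std::string_view s) {
    return !s.empty() && s.back() == '\0';
}

static_assert(up_to_nul("odd").size() == 3);
static_assert(!contains_terminator(up_to_nul("odd")));   // terminator was cut off
static_assert(whole_array("odd").size() == 4);
static_assert(contains_terminator(whole_array("odd")));  // terminator is in range
```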
- There's a similar oddity with
std::string
, which is implicitly convertible to string_span
, but not zstring_span
Good point. There is a constructor for string_span
that takes a std::string
, but for zstring_span
there is only a constructor taking the internal implementation type, a span<char>
. For span
there is a constructor taking a "container" having .data()
and .size()
- which std::string
implements. Even worse: the following code compiles but will not work:
zstring_span<> to_zspan(std::string& s) { return zstring_span<>{s}; }
You should consider filing an issue in the GSL repo to get the classes aligned. I am not sure if the implicit conversions are a good idea, so I prefer how it is done in zstring_span
over how string_span
does it.
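On why the zstring_span<>{s} version compiles but does not work: a container-style constructor sees exactly data() and size(), and for std::string the terminator sits one element past that range. A sketch with std::string_view standing in for the span (the function names are hypothetical):

```cpp
#include <cassert>
#include <string>
#include <string_view>

// View over exactly [data(), data() + size()): what a constructor taking
// .data() and .size() sees. The terminator is one past this range.
inline std::string_view contents_only(const std::string& s) {
    return std::string_view(s.data(), s.size());
}

// View that deliberately includes the terminator. Reading data()[size()]
// is well defined since C++11 and always yields '\0'.
inline std::string_view contents_and_nul(const std::string& s) {
    return std::string_view(s.data(), s.size() + 1);
}

inline void demo_string_ranges() {
    const std::string s = "abc";
    assert(contents_only(s).size() == 3);
    assert(contents_only(s).back() == 'c');      // no terminator in range
    assert(contents_and_nul(s).size() == 4);
    assert(contents_and_nul(s).back() == '\0');  // terminator in range
}
```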
- There seem to be some const-correctness issues.
Here too, my first idea of czstring_span<> to_czspan(const std::string& s) { return czstring_span<>{s}; }
compiles but does not work. Another solution would be a new function ensure_cz
that returns a span<const char, ...>
. You should consider filing an issue.
assume_z()
The existence of empty()
and the code in as_string_span()
suggest that the class was meant to be able to handle empty string spans. In that case as_string_span
would always return the string without terminating null byte, ensure_z
would return the string with terminating null byte, failing if empty, and assume_z
would assume that !empty()
and return the string with terminating null byte.
But the only constructor takes a non-empty span of characters, so empty()
can never be true. I just created a PR to address these inconsistencies. Please consider filing an issue if you think that more should be changed.
If
zstring_span
is designed "to convert zero-terminated spans to legacy strings", why does its use here seem so cumbersome? Am I misusing it? Is there something I'm overlooking?
In pure C++ code I prefer std::string_view
. zstring_span
is only for C interop, which limits its use. And of course you must know the C++ Core Guidelines and the Guidelines Support Library. Given that, I bet that zstring_span
is rarely used and that you are one of the very few people taking a deep look into it.
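One way to keep std::string_view in the C++ code and still feed a C API is to materialize a null-terminated copy only at the boundary. This is a minimal sketch, not a recommendation specific to DBus: fake_c_api is a stand-in for a real call such as dbus_message_new_method_call, and call_with_view is a hypothetical wrapper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Stand-in for a C function that expects a null-terminated string
// (hypothetical; it just measures what the callee would see).
inline std::size_t fake_c_api(const char* s) { return std::strlen(s); }

// Accepts a sized view and copies it into a std::string right before the
// call. c_str() is guaranteed to be null-terminated.
inline std::size_t call_with_view(std::string_view sv) {
    const std::string owned(sv);  // one copy; this is where the '\0' comes from
    return fake_c_api(owned.c_str());
}

inline void demo_boundary_copy() {
    const std::string_view whole = "destination.path";
    const std::string_view dest = whole.substr(0, 11);  // "destination"

    assert(call_with_view(dest) == 11);     // callee sees only the sub-view
    assert(fake_c_api(dest.data()) == 16);  // raw data() runs on to the real '\0'
}
```

The copy costs an allocation per call, and a view containing embedded NULs would still be truncated by the callee, so this trades some efficiency for not having to track termination in the type.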
Upvotes: 3
Reputation: 474236
It's "cumbersome" in part because it's intended to be.
This:
zstring_span<> to_zspan(std::string& s) { return ensure_z(s); }
is not a safe operation. Why? Because while it is true that s
is NUL terminated, it is entirely possible that the actual s
contains internal NUL characters. That's a legitimate thing you can do with std::string
, but zstring_span
and whoever takes it can't handle that. They'll truncate the string.
By contrast, string_span/view
conversions are safe from this perspective. Consumers of such strings take a sized string and thus can handle embedded NULs.
Because the zstring_span
conversion is unsafe, there should be some explicit notation that something potentially unsafe is being done. ensure_z
represents that explicit notation.
Another problem is that C++ has no mechanism to tell the difference between a literal string argument and any old const char*
or const char[]
parameter. Since a bare const char*
may or may not be a string literal, you have to assume that it isn't and therefore use a more verbose conversion.
Also, C++ string literals can contain embedded NUL characters, so the above reasoning applies.
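The truncation is easy to demonstrate with the standard library alone. In this sketch, sized_length and c_length are hypothetical stand-ins for a consumer that takes a sized string and one that takes a null-terminated pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

using namespace std::string_literals;  // for the "..."s literal

// What a sized consumer sees: the full length, embedded NULs included.
inline std::size_t sized_length(std::string_view sv) { return sv.size(); }

// What a null-terminated consumer sees: everything up to the first '\0'.
inline std::size_t c_length(const char* s) { return std::strlen(s); }

inline void demo_embedded_nul() {
    const std::string s = "ab\0cd"s;   // 5 characters, one of them NUL
    assert(sized_length(s) == 5);
    assert(c_length(s.c_str()) == 2);  // truncated at the embedded NUL

    // A plain literal has the same problem: 6 bytes including the terminator.
    static_assert(sizeof("ab\0cd") == 6);
    assert(c_length("ab\0cd") == 2);
}
```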
The const
issue seems like a code bug, and you should probably file it as such.
Upvotes: 1