Reputation: 28699
Consider the following constexpr function, static_strcmp, which uses C++17's constexpr char_traits::compare function:
#include <string>

constexpr bool static_strcmp(char const *a, char const *b)
{
    return std::char_traits<char>::compare(a, b,
        std::char_traits<char>::length(a)) == 0;
}

int main()
{
    constexpr const char *a = "abcdefghijklmnopqrstuvwxyz";
    constexpr const char *b = "abc";

    constexpr bool result = static_strcmp(a, b);

    return result;
}
godbolt shows this gets evaluated at compile-time, and optimised down to:
main:
    xor eax, eax
    ret
Remove constexpr from bool result:
If we remove the constexpr from constexpr bool result, the call is no longer optimised.
#include <string>

constexpr bool static_strcmp(char const *a, char const *b)
{
    return std::char_traits<char>::compare(a, b,
        std::char_traits<char>::length(a)) == 0;
}

int main()
{
    constexpr const char *a = "abcdefghijklmnopqrstuvwxyz";
    constexpr const char *b = "abc";

    bool result = static_strcmp(a, b); // <-- note no constexpr

    return result;
}
godbolt shows we now call into memcmp:
.LC0:
    .string "abc"
.LC1:
    .string "abcdefghijklmnopqrstuvwxyz"
main:
    sub rsp, 8
    mov edx, 26
    mov esi, OFFSET FLAT:.LC0
    mov edi, OFFSET FLAT:.LC1
    call memcmp
    test eax, eax
    sete al
    add rsp, 8
    movzx eax, al
    ret
Add a short-circuiting length check:
If we first compare char_traits::length for the two arguments in static_strcmp before calling char_traits::compare, without constexpr on bool result, the call is optimised away again.
#include <string>

constexpr bool static_strcmp(char const *a, char const *b)
{
    return
        std::char_traits<char>::length(a) == std::char_traits<char>::length(b)
        && std::char_traits<char>::compare(a, b,
            std::char_traits<char>::length(a)) == 0;
}

int main()
{
    constexpr const char *a = "abcdefghijklmnopqrstuvwxyz";
    constexpr const char *b = "abc";

    bool result = static_strcmp(a, b); // <-- note still no constexpr!

    return result;
}
godbolt shows we're back to the call being optimised away:
main:
    xor eax, eax
    ret
Why does removing constexpr from the initial call to static_strcmp cause the constant evaluation to fail?
Even without constexpr, the call to char_traits::length is evaluated at compile time, so why not the same behaviour without constexpr in the first version of static_strcmp?
Upvotes: 3
Views: 228
Reputation: 30010
Your program has undefined behavior, because you always compare strlen(a) characters. The string b doesn't have that many characters.
If you modify your strings to be equal length (so your program becomes well-defined), your program will be optimised as you expect.
So this is not a missed optimization. The compiler would optimise the call, but it does not do so because the program contains undefined behavior.
Note that whether this is undefined behavior is not entirely clear. Since the compiler emits a call to memcmp, it assumes that both input strings are at least strlen(a) characters long. So, judging by the behavior of the compiler, it is undefined behavior.
Here's what the current draft standard says about compare:
Returns: 0 if for each i in [0, n), X::eq(p[i],q[i]) is true; else, a negative value if, for some j in [0, n), X::lt(p[j],q[j]) is true and for each i in [0, j) X::eq(p[i],q[i]) is true; else a positive value.
Now, it is not specified whether compare is allowed to read p[j+1..n) or q[j+1..n) (where j is the index of the first difference).
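One way to sidestep the question entirely is to never ask compare for more characters than both strings hold. A minimal sketch, assuming C++17 and std::string_view; static_streq is a hypothetical name, not something from the question:

```cpp
#include <string_view>

// Hypothetical helper (not from the original post): equality test for two
// C strings that stays within the bounds of both.
constexpr bool static_streq(std::string_view a, std::string_view b)
{
    // Each string_view records its own length, and the comparison looks at
    // no more than min(a.size(), b.size()) characters, so neither string is
    // read past its terminator.
    return a == b;
}

// Checked at compile time: different lengths compare unequal,
// identical contents compare equal.
static_assert(!static_streq("abcdefghijklmnopqrstuvwxyz", "abc"));
static_assert(static_streq("abc", "abc"));
```

Whether a given compiler folds a non-constexpr use of this down to a constant is still up to the optimiser, but the comparison itself is well-defined for strings of any length.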
Upvotes: 2
Reputation: 66230
We have three working cases:
1) the computed value is required to initialize a constexpr value, or is used where a compile-time-known value is strictly required (non-type template parameter, size of a C-style array, a test in a static_assert(), ...)
2) the constexpr function uses values that are not known at compile time (for example, values received from standard input)
3) the constexpr function receives values known at compile time, but the result goes in a place where a compile-time value isn't required.
If we ignore the as-if rule, we have that:
in case (1) the compiler must compute the value at compile time, because the computed value is required at compile time
in case (2) the compiler must compute the value at run time, because it's impossible to compute it at compile time
in case (3) we are in a grey area where the compiler can compute the value at compile time but the computed value isn't strictly required at compile time; in this case the compiler can choose whether to compute it at compile time or at run time.
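As a rough illustration of the three cases, here is a minimal sketch. The names square, nine_ints, run_time_only and grey_area are hypothetical and chosen only for this example; they do not come from the question.

```cpp
// Hypothetical constexpr function used to illustrate the three cases.
constexpr int square(int x)
{
    return x * x;
}

// Case (1): the value is required at compile time.
constexpr int required = square(4);   // initializer of a constexpr variable
static_assert(required == 16);        // test in a static_assert
using nine_ints = int[square(3)];     // bound of a C-style array

// Case (2): the argument is known only at run time (for example, read from
// standard input by the caller), so the call has to happen at run time.
int run_time_only(int n)
{
    return square(n);
}

// Case (3): the argument is known at compile time, but the result goes into
// an ordinary variable. The compiler may compute it at compile time or at
// run time; the language does not force either choice.
int grey_area()
{
    int result = square(5);
    return result;
}
```

Only the lines under case (1) are guaranteed to be evaluated during compilation; what happens in grey_area depends on the compiler and the optimisation level.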
With the initial code
constexpr bool result = static_strcmp(a, b);
you are in case (1): the compiler must compute the value at compile time because the result variable is declared constexpr.
Removing the constexpr,
bool result = static_strcmp(a, b); // no more constexpr
your code moves into the grey area (case (3)), where compile-time computation is possible but not strictly required: the input values (a and b) are known at compile time, but the result goes where a compile-time value isn't required (an ordinary variable). So the compiler is free to choose and, in your case, it chooses run-time computation with one version of the function and compile-time computation with the other.
Upvotes: 4
Reputation: 62613
Note that nothing in the standard explicitly requires a constexpr function to be called at compile time; see 9.1.5.7 in the latest draft:
A call to a constexpr function produces the same result as a call to an equivalent non-constexpr function in all respects except that (7.1) a call to a constexpr function can appear in a constant expression and (7.2) copy elision is not performed in a constant expression ([class.copy.elision]).
(emphasis mine)
Now, when the call appears in a constant expression, there is no way the compiler can avoid running the function at compile time, so it dutifully obliges. When it does not (as in your second snippet), it is just a case of a missed optimization. There is no shortage of those around here.
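If compile-time evaluation is wanted without declaring result itself constexpr, one possible approach is to route the call through a template argument, which must be a constant expression. This is a sketch under the assumption of C++17, not something proposed in the question; force_constant and compare_literals are hypothetical names, and static_strcmp is the length-checked version from the question.

```cpp
#include <string>

// Length-checked comparison, as in the third snippet of the question.
constexpr bool static_strcmp(char const *a, char const *b)
{
    return
        std::char_traits<char>::length(a) == std::char_traits<char>::length(b)
        && std::char_traits<char>::compare(a, b,
            std::char_traits<char>::length(a)) == 0;
}

// Hypothetical helper: a template argument must be a constant expression,
// so whatever is passed as Value has to be evaluated by the compiler.
template <auto Value>
inline constexpr auto force_constant = Value;

bool compare_literals()
{
    // result is an ordinary (non-constexpr) variable, but its initializer
    // is a template argument and is therefore computed at compile time.
    bool result = force_constant<static_strcmp("abc", "abd")>;
    return result;
}
```

The trade-off is that the arguments must themselves be usable in a constant expression; a call with run-time strings will not compile through force_constant.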
Upvotes: 3