Reputation: 31
I am trying to make this function tail recursive so that I can use it to process a large number of events without getting a stack overflow. I made sure to put the recursive call at the last line of the function, but it still floods the call stack with recursive calls.
Is there anything else I need to do to make it tail recursive, or does my compiler just not know how to optimize it?
Should I abandon this function and use a loop instead?
template <class Csi>
void GetEvents(EventHandle handle, vector<int> desiredCodes, vector<EventHandle> &events, Csi &csi)
{
if (handle == INVALID_HANDLE)
{
return;
}
int code = csi.GetEventCode(handle);
bool codeSatisfiesSearch = (find(desiredCodes.begin(), desiredCodes.end(), code) != desiredCodes.end());
if (codeSatisfiesSearch)
{
events.push_back(handle);
handle = csi.FindNextEventEx(handle, &desiredCodes[0], 0, desiredCodes.size());
}
else
{
handle = csi.FindNextEventEx(handle, &desiredCodes[0], 0, desiredCodes.size());
}
return GetEvents(handle, desiredCodes, events, csi);
}
Upvotes: 3
Views: 387
Reputation: 15164
Modern compilers can perform this optimization if you turn it on with a compiler flag. I modified your code to the following, in order to get it to compile on godbolt.org:
#include <algorithm>
#include <stddef.h>
#include <vector>
class EventHandle {
public:
constexpr EventHandle() : dummy(~0U) {}
constexpr auto operator==(const EventHandle& x) const
{
return dummy == x.dummy;
}
private:
unsigned dummy;
};
constexpr EventHandle INVALID_HANDLE = EventHandle();
class Csi {
public:
int GetEventCode(EventHandle) const;
EventHandle FindNextEventEx( EventHandle, const int*, int, size_t ) const;
};
void GetEvents(EventHandle handle, const std::vector<int>& desiredCodes, std::vector<EventHandle>& events, Csi &csi)
{
if (handle == INVALID_HANDLE)
{
return;
}
int code = csi.GetEventCode(handle);
bool codeSatisfiesSearch = (std::find(desiredCodes.begin(), desiredCodes.end(), code) != desiredCodes.end());
if (codeSatisfiesSearch)
{
events.push_back(handle);
handle = csi.FindNextEventEx(handle, &desiredCodes[0], 0, desiredCodes.size());
}
else
{
handle = csi.FindNextEventEx(handle, &desiredCodes[0], 0, desiredCodes.size());
}
return GetEvents(handle, desiredCodes, events, csi);
}
I made only two significant changes. I provided stub class
definitions for EventHandle
and Csi
, so that the compiler would generate code at all. I also changed desiredCodes
and events
to reference parameters, so that the compiler would not make copies. (This changes the semantics: the function now alters the original events
object. You might wish to declare it as std::vector<EventHandle>&& events
to make it more explicit that you clobber the input parameter.) I also added std::
to STL types and algorithms.
MSVC 19.22 x64 optimizes this to tail-recursion if you give it the /Ox
flag. GCC 9.2 x86_64 requires -O2
or higher. Clang 8.0.0 and ICC 19.0.1 both require -O
or higher. For example, with -O
, line 213 of the ICC code listing optimizes to a jmp
.
jmp GetEvents(EventHandle, std::vector<int, std::allocator<int> > const&, std::vector<EventHandle, std::allocator<EventHandle> >&, Csi&) #44.12
Without optimization flags, line 1724 performs a recursive non-tail call.
call GetEvents(EventHandle, std::vector<int, std::allocator<int> > const&, std::vector<EventHandle, std::allocator<EventHandle> >&, Csi&) #44.12
Upvotes: 0
Reputation: 62613
Taking this question on it's face value.
In the current form, the code is not suitable for TCO, due to the fact that vector<int> desiredCodes
is passed by value. It requires caller to destruct the local vector after recursive call, so tail-call optimization is not an option.
When I changed the code to pass the vector by const reference, I noticed that latest version of clang
did optimize the tail call: https://gcc.godbolt.org/z/qM7QVv
However, gcc
still did not: https://gcc.godbolt.org/z/u7yIvR. I noticed that it is push_back
which prevents gcc from optimizing - when commented out, recursive call is eliminated.
I was able to get gcc 9.2 to optimize the recursive call when replacing events.push_back(handle)
with
events.resize(events.size() + 1);
events[events.size() - 1] = handle;
All this goes to show that TCO in C++ is not something to rely on, as it is extremely fragile and depends on unpredictable factors. It's a nice bonus you might get occasionally, but not something you can build your design on.
If TCO interests you (as it interests me, for example) you will have better luck with more predictable languages like C, or, even better, with functional-style bunch.
Upvotes: 7