Why does translating this code snippet from C# to C++ degrade performance?

Question

I am much more familiar with C# than C++ so I must ask for advice on this issue. I had to rewrite some code pieces to C++ and then (surprisingly) ran into performance issues.

I've narrowed the problem down to these snippets:

C#

   public class SuffixTree
    {
        public class Node
        {
            public int Index = -1;
            public Dictionary Children = new Dictionary();
        }

        public Node Root = new Node();
        public String Text;

        public SuffixTree(string s)
        {
            Text = s;
            for (var i = s.Length - 1; i >= 0; --i)
                InsertSuffix(s, i);
        }

        public void InsertSuffix(string s, int from)
        {
            var cur = Root;
            for (int i = from; i < s.Length; ++i)
            {
                var c = s[i];
                if (!cur.Children.ContainsKey(c))
                {
                    var n = new Node() { Index = from };
                    cur.Children.Add(c, n);

                    return;
                }
                cur = cur.Children[c];
            }
        }

        public bool Contains(string s)
        {
            return FindNode(s) != null;
        }

        private Node FindNode(string s)
        {
            var cur = Root;
            for (int i = 0; i < s.Length; ++i)
            {
                var c = s[i];
                if (!cur.Children.ContainsKey(c))
                {
                    for (var j = i; j < s.Length; ++j)
                        if (Text[cur.Index + j] != s[j])
                            return null;
                    return cur;
                }
                cur = cur.Children[c];
            }
            return cur;
        }
    }
}

C++

struct node
{
    int index;
    std::unordered_map children;

    node() { this->index = -1; }
    node(int idx) { this->index = idx; }
};

struct suffixTree
{
    node* root;
    char* text;

    suffixTree(char* str)
    {
        int len = strlen(str) + 1;
        this->text = new char[len];
        strncpy(this->text, str, len);

        root = new node();
        for (int i = len - 2; i >= 0; --i)
            insertSuffix(str, i);
    }

    void insertSuffix(char* str, int from)
    {
        node* current = root;
        for (int i = from; i < strlen(str); ++i)
        {
            char key = str[i];
            if (current->children.find(key) == current->children.end())
            {
                current->children[key] = new node(from);
                return;
            }
            current = current->children[key];
        }
    }

    bool contains(char* str)
    {
        node* current = this->root;
        for (int i = 0; i < strlen(str); ++i)
        {
            char key = str[i];
            if (current->children.find(key) == current->children.end())
            {
                for (int j = i; j < strlen(str); ++j)
                    if (this->text[current->index + j] != str[j])
                        return false;
                return true;
            }
            current = current->children[key];
        }
    }
}

In both cases I'm creating a suffix tree and later using it in a much bigger function which is not relevant for the post (lets call it F()). I've tested both on two randomly generated strings of length 100000. The C# version constructed my suffix tree and used it in F() in a total execution time of: 480 ms while the code which I've "translated to C++" executed in 48 sec

I've drilled this down further and it seems that in my C++ code, the constructor takes 47 sec while using the tree in F() runs at 48 ms which is 10 times faster than in C#.

Conclusion

It seems that the main problem is in insertSuffix(), perhaps my lack of knowledge and understanding of the unordered_map structure. Can anyone shed a light on this? Did I make some rookie mistake in the C++ variant which causes the object construction to take so long?

Aditional Info

I've compiled both the C# and C++ program for maximum speed /O2 (release)

Daniel H · Accepted Answer

In C#, a System.String includes its Length, so you can get the length in constant time. In C++, a std::string also includes its size, so it is also available in constant time.

However, you aren’t using C++ std::string (which you should be, for a good translation of the algorithm); you’re using a C-style null-terminated char array. That char* literally means “pointer to char”, and just tells you where the first character of the string is. The strlen function looks at each char from the one pointed to forward, until it finds a null character '\0' (not to be confused with a null pointer); this is expensive, and you do it in each iteration of your loop in insertSuffix. That probably accounts for at least a reasonable fraction of your slowdown.

When doing C++, if you find yourself working with raw pointers (any type involving a *), you should always wonder if there’s a simpler way. Sometimes the answer is “no”, but often it’s “yes” (and that’s getting more common as the language evolves). For example, consider your struct node and node* root. Both use node pointers, but in both cases you should have used node directly because there is no need to have that indirection (in the case of node, some amount of indirection is necessary so you don’t have each node containing another node ad infinitum, but that’s provided by the std::unordered_map).

A couple other tips:

In C++ you often don’t want to do any work in the body of a constructor, but instead use initialization lists.
When you don’t want to copy something you pass as a parameter, you should make the parameter a reference; instead of changing insertSuffix to take a std::string as the first parameter, make it take std::string const&; similarly, contains should take a std::string const&. Better yet, since insertSuffix can see the text member, it doesn’t need to take that first parameter at all and can just use from.
C++ supports a foreach-like construct, which you should probably prefer to a standard for loop when iterating over a string’s characters.
If you’re using the newest not-technically-finalized-but-close-enough version of C++, C++17, you should use std::string_view instead of std::string whenever you just want a look at a string, and don’t need to change it or keep a reference to it around. This would be useful for contains, and since you want to make a local copy in the text member, even for the constructor; it would not be useful in the text member itself, because the object being viewed might be temporary. Lifetime can sometimes be tricky in C++, though, and until you get the hang of it you might just want to use std::string to be on the safe side.
Since node isn’t useful outside of the concept of suffixTree, it should probably be inside it, like in the C# version. As a deviation from the C# version, you might want to make the type node and the data members root and text into private instead of public members.

Why does translating this code snippet from C# to C++ degrade performance?

Answers (1)

Related Questions