Yan Zhu
Yan Zhu

Reputation: 4346

Tree traversal in C++ and Haskell

I am new to Haskell. I am trying to get a sense how well haskell can handle the recursive function call together with their lazy evaluation. The experiement I made is simply building binary search tree in both C++ and Haskell and traverse them respectively in postorder. C++ implementation is standard one with auxilary stack. (I just print the element out once I visit it).

Here is my haskell code:

module Main (main) where

import System.Environment (getArgs)
import System.IO
import System.Exit
import Control.Monad(when)
import qualified Data.ByteString as S

main = do
     args <- getArgs
     when (length args < 1) $ do
          putStrLn "Missing input files"
          exitFailure

     content <- readFile (args !! 0)
     --preorderV print $ buildTree content
     mapM_ print $ traverse POST $ buildTree content
     putStrLn "end"


data BSTree a = EmptyTree | Node a (BSTree a) (BSTree a) deriving (Show)
data Mode = IN | POST | PRE

singleNode :: a -> BSTree a
singleNode x = Node x EmptyTree EmptyTree

bstInsert :: (Ord a) => a -> BSTree a -> BSTree a
bstInsert x EmptyTree = singleNode x
bstInsert x (Node a left right)
          | x == a = Node a left right
          | x < a  = Node a (bstInsert x left) right
          | x > a  = Node a left (bstInsert x right)

buildTree :: String -> BSTree String
buildTree = foldr bstInsert EmptyTree . words

preorder :: BSTree a -> [a]
preorder EmptyTree = []
preorder (Node x left right) = [x] ++ preorder left ++ preorder right

inorder :: BSTree a -> [a]
inorder EmptyTree = []
inorder (Node x left right) = inorder left ++ [x] ++ inorder right

postorder :: BSTree a -> [a]
postorder EmptyTree = []
postorder (Node x left right) = postorder left ++  postorder right ++[x]

traverse :: Mode -> BSTree a -> [a]
traverse x tree = case x of IN   -> inorder tree
                            POST -> postorder tree
                            PRE  -> preorder tree


preorderV :: (a->IO ()) -> BSTree a -> IO ()
preorderV f EmptyTree = return ()
preorderV f (Node x left right) = do 
                                     f x
                                     preorderV f left
                                     preorderV f right

My test result show that C++ significantly outperform Haskell:

C++ performance: (note that first15000.txt is roughly 5 times of first3000.txt)

time ./speedTestForTraversal first3000.txt > /dev/null 

real    0m0.158s
user    0m0.156s
sys     0m0.000s
time ./speedTestForTraversal first15000.txt > /dev/null 

real    0m0.923s
user    0m0.916s
sys     0m0.004s

Haskell with same input file:

time ./speedTestTreeTraversal first3000.txt > /dev/null 

real    0m0.500s
user    0m0.488s
sys     0m0.008s
time ./speedTestTreeTraversal first15000.txt > /dev/null 

real    0m3.511s
user    0m3.436s
sys     0m0.072s

What I expected that haskell should be not too far away from C++. Did I make some mistake? Is there any way to improve my haskell code?

Thanks

Edit: Oct. 18 2014

After testing serval occasions, haskell's traversal is still significantly slower than C++ implementation. I would like to give Cirdec's answer a full credit, since he points out inefficiency of my haskell implementation. However, my original question is to compare C++ and haskell implementation. So I would like to keep this question open and post my C++ code to encourage further discussion.

#include <iostream>
#include <string>
#include <boost/algorithm/string.hpp>
#include <fstream>
#include <stack>
using namespace std;
using boost::algorithm::trim;
using boost::algorithm::split;


template<typename T>
class Node
{
public:
    Node(): val(0), l(NULL), r(NULL), p(NULL) {};
    Node(const T &v): val(v), l(NULL), r(NULL), p(NULL) {}
    Node* getLeft() {return l;}
    Node* getRight(){return r;}
    Node* getParent() {return p;}
    void  setLeft(Node *n) {l = n;}
    void  setRight(Node *n) {r = n;}
    void  setParent(Node *n) {p = n;}
    T  &getVal() {return val;}
    Node* getSucc() {return NULL;}
    Node* getPred() {return NULL;}
private:
    T val;
    Node *l;
    Node *r;
    Node *p;
};

template<typename T>
void destoryOne(Node<T>* n)
{
    delete n;
    n = NULL;
}

template<typename T>
void printOne(Node<T>* n)
{
    if (n!=NULL)
    std::cout << n->getVal() << std::endl;
}




template<typename T>
class BinarySearchTree
{
public:
    typedef void (*Visit)(Node<T> *);

    BinarySearchTree(): root(NULL) {}
    void delNode(const T &val){};
    void insertNode(const T &val){
    if (root==NULL)
        root = new Node<T>(val);
    else {
        Node<T> *ptr = root;
        Node<T> *ancester = NULL;
        while(ptr && ptr->getVal()!=val) {
        ancester = ptr;
        ptr = (val < ptr->getVal()) ? ptr->getLeft() : ptr->getRight(); 
        }
        if (ptr==NULL) {
        Node<T> *n = new Node<T>(val);
        if (val < ancester->getVal())
            ancester->setLeft(n);
        else
            ancester->setRight(n);
        } // else the node exists already so ignore!
    }
    }
    ~BinarySearchTree() {
    destoryTree(root);
    }
    void destoryTree(Node<T>* rootN) {
    iterativePostorder(&destoryOne);
    }

    void iterativePostorder(Visit fn) {
    std::stack<Node<T>* > internalStack;
    Node<T> *p = root;
    Node<T> *q = root;
    while(p) {
        while (p->getLeft()) {
        internalStack.push(p);
        p = p->getLeft();
        }
        while (p && (p->getRight()==NULL || p->getRight()==q)) {
        fn(p);
        q = p;
        if (internalStack.empty())
            return;
        else {
            p = internalStack.top();
            internalStack.pop();
        }
        }
        internalStack.push(p);
        p = p->getRight();
    }
    }


    Node<T> * getRoot(){ return root;}
private:
    Node<T> *root;
};



int main(int argc, char *argv[])
{
    BinarySearchTree<string> bst;
    if (argc<2) {
    cout << "Missing input file" << endl;
    return 0;
    }
    ifstream inputFile(argv[1]);
    if (inputFile.fail()) {
    cout << "Fail to open file " << argv[1] << endl;
    return 0;
    }
    while (!inputFile.eof()) {
    string word;
    inputFile >> word;
    trim(word);
    if (!word.empty()) {
        bst.insertNode(word);
    }
    }

    bst.iterativePostorder(&printOne);

    return 0;
}

Edit: Oct 20 2014 Chris's answer in below is very thorough and I can repeat the result.

Upvotes: 3

Views: 1118

Answers (2)

CR Drost
CR Drost

Reputation: 9807

I generated a file containing all of the 4-letter strings on the lowercase ASCII alphabet abcdefghijklmnopqrstuvwxyz separated by spaces; and I think I got it in the right order so that the tree your code generates is perfectly balanced.

I chose this length because it takes 3.4 seconds on my computer, much like your 3.5s Haskell run. I called it 26_4.txt for obvious reasons. It sounds like actually your data set is close to 264 words, so it's comparable in length too.

A lower bound on runtime would be something like:

import System.IO
main = do
    mylist <- readFile "26_4.txt"
    mapM_ putStrLn (words mylist)

and this for my data set reduces to taking 0.4s (piping stdout to /dev/null of course). So we can't expect more than, say, a factor of 10 speedup on this sort of problem from Haskell, it looks like. However, that factor is well within the bounds for your problem; C++ takes twice as long as this super-simple program.

But, no processing is an unrealistic goal. We can get a bound which is more realistic if we use a data structure which was already optimized for us by professionals who understand Haskell better:

import System.IO
import qualified Data.Map.Strict as Map

balancedTree = Map.fromList . map (\k -> (k, ()))

serializeTree = map fst . Map.toList

main = do
    mylist <- readFile "26_4.txt"
    mapM_ putStrLn (serializeTree $ balancedTree $ words mylist)

This runs for something more like 1.6s on my machine. It's not quite as fast as your C++ but your C++ does not balance the tree as far as I can tell.

I made Cirdec's modification to the code and it took your code down to 3.1s, so it only shaved about 10% of that runtime off of the file.

However, on my machine, this file doesn't even run unless you give it more memory with RTSopts. And that points to a really important optimization: tail-call optimization. The code from both you and Cirdec is suboptimal in a special way: it is not tail-recursive, meaning that it cannot be turned into a loop by GHC. We can make it tail-recursive by writing an explicit stack of 'stuff to be done' which we descend into:

postorder :: BSTree a -> [a]
postorder t = go [] [t]
    where go xs [] = xs
          go xs (EmptyTree : ts) = go xs ts
          go xs (Node x a b : ts) = go (x : xs) (b : a : ts)

This change seems to bring it all the way down to 2.1s.

Another difference between C++ and Haskell which causes some amount of time wasted is that the Haskell version will allow you to lazily construct your search tree, whereas your C++ code wouldn't allow this. We can make the Haskell code strict to deal with this, providing something like:

data BSTree a = EmptyTree
          | Node  !a !(BSTree a) !(BSTree a) deriving (Show)

This change combined with Cirdec's brings us down to 1.1 seconds, which means that we're on-par with your C++ code, at least on my machine. You should test this on your machine to see if those are also the main problems. I think any further optimizations can't be done "from the armchair" and must instead be done with a proper profiler.

Remember to ghc -O2 the code, else the tail-calls and other optimizations may not take.

Upvotes: 3

Cirdec
Cirdec

Reputation: 24156

List concatenation with ++ is slow, each time ++ occurs, its first argument must be traversed to the end to find where to add the second argument. You can see how the first argument is traversed all the way to [] in the definition of ++ from the standard prelude:

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

When ++ is used recursively this traversal must be repeated for each level of recursion, which is inefficient.

There's another way to build lists: If you know what will come at the end of a list before you start to build it, you can build it with the end already in place. Let's look at the definition of postorder

postorder :: BSTree a -> [a]
postorder EmptyTree = []
postorder (Node x left right) = postorder left ++ postorder right ++ [x]

When we make postorder left, we already know what will come after it, it will be postorder right ++ [x], so it would make sense to build the list for the left side of the tree with the right side and the value from the node already in place. Similarly, when we make postorder right, we already know what should come after it, namely x. We can do exactly that by making a helper function that passes an accumulated value for the rest of the list

postorder :: BSTree a -> [a]
postorder tree = go tree []
    where
        go EmptyTree rest = rest
        go (Node x left right) rest = go left (go right (x:rest))

This is about twice as fast on my machine when run with a 15k word dictionary as input. Let's explore this a bit more to see if we can gain a deeper understanding. If we rewrite our postorder definition using function composition (.) and application ($) instead of nested parenthesis we'd have

postorder :: BSTree a -> [a]
postorder tree = go tree []
    where
        go EmptyTree rest = rest
        go (Node x left right) rest = go left . go right . (x:) $ rest

We can even drop the rest argument and the function application, $, and write this in a slightly more points-free style

postorder :: BSTree a -> [a]
postorder tree = go tree []
    where
        go EmptyTree = id
        go (Node x left right) = go left . go right . (x:)

Now we can see what we've done. We've replaced a list [a], with a function [a] -> [a] that prepends the list to an existing list. The empty list is replaced with the function that doesn't add anything to the start of a list, which is the identity function, id. The singleton list [x] is replaced with the function that adds x to the beginning of a list, (x:). List concatenation a ++ b is replaced with function composition f . g - first add the things g will add to the beginning of the list, then add the things that f will add to the beginning of that list.

Upvotes: 11

Related Questions