Evg
Evg

Reputation: 3080

Haskell fast construction of vector from input of space separated ints

If I expect a string with space separated ints from user input (number of expected int's known), I can transfer it in haskell Data.Vector like this:

main = do
        n <- readLn -- this is number of ints        
        l' <- getLine
        let a = (read :: String -> Int) <$> words l' 
        return $ V.fromList a

But I want to do this as fast as possible, w/o intermediate list, what is a fastest way to construct Vector in this case?

ps. I am think about looping getChar + replicateM but not shure will it be faster or not.

Upvotes: 2

Views: 89

Answers (1)

K. A. Buhr
K. A. Buhr

Reputation: 50864

The following should be pretty fast.

{-# LANGUAGE BangPatterns #-}

import Data.Char
import qualified Data.Vector.Unboxed as V
import qualified Data.ByteString.Lazy as BL

numbers :: Int -> BL.ByteString -> V.Vector Int
numbers n = V.unfoldrExactN n $ go 0
  where go !acc bs = case BL.uncons bs of
          Just (c, bs') | c' >= zero && c' <= nine -> go (acc*10+c'-zero) bs'
                        | otherwise -> (acc, bs')
            where c' = fromIntegral c
                  zero = ord '0'
                  nine = ord '9'
          Nothing -> (acc, BL.empty)

main = do
  n <- readLn
  inp <- BL.getContents
  print $ V.sum (numbers n inp)

Note that this version does no input validation. It just parses runs of digits separated by single non-digit delimiters, and it'll misread repeated delimiters as additional zero values.

Compiled with GHC 8.10.4 on my machine, it processes about half a gigabyte per second and outperforms a similar GCC-compiled C version by about 10%, which is a little suspect, but I don't see anything obviously wrong with my C version:

#include <stdio.h>

long v[100000000];

int
main()
{
        long n;
        scanf("%ld\n", &n);
        for (int i = 0; i < n; ++i) {
                long acc = 0;
                int c;
                while ((c=getchar())>='0')
                        acc = acc*10+c-'0';
                v[i] = acc;
        }
        long total = 0;
        for (int i = 0; i < n; ++i) {
                total += v[i];
        }
        printf("%ld\n",total);
        return 0;
}

Upvotes: 2

Related Questions