Reputation: 10956
Given is a text file (for piping) with many numbers divided by a space, like so:
234 456 345 ...
What is the best way to read them all into a Data.Vector.Unboxed.Vector Int64
? My current code looks like this:
import Control.Applicative
import Control.Arrow
import Data.Int
import Data.Maybe
import qualified Data.ByteString.Char8 as B
import qualified Data.Vector.Unboxed as V
main :: IO ()
main = do
v <- readInts <$> B.getContents
print $ V.maximum v
-- splitted for profiling
readInts :: B.ByteString -> V.Vector Int64
readInts = a >>> b >>> c >>> d
a = B.split ' '
b = mapMaybe (B.readInt >>> liftA fst)
c = map fromIntegral
d = V.fromList
Here is the profiler output
Thu Sep 18 16:08 2014 Time and Allocation Profiling Report (Final)
FastReadInts +RTS -p -K800M -RTS
total time = 0.51 secs (505 ticks @ 1000 us, 1 processor)
total alloc = 1,295,988,256 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
d Main 74.3 5.2
b Main 9.9 35.6
a Main 6.3 40.0
main Main 4.8 0.0
c Main 3.2 19.3
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 60 0 0.4 0.0 100.0 100.0
main Main 121 0 4.8 0.0 98.2 100.0
readInts Main 123 0 0.0 0.0 93.5 100.0
a Main 131 0 6.1 40.0 6.1 40.0
b Main 129 0 9.9 35.6 9.9 35.6
c Main 127 0 3.2 19.3 3.2 19.3
d Main 125 0 74.3 5.2 74.3 5.2
CAF Main 119 0 0.0 0.0 0.2 0.0
a Main 130 1 0.2 0.0 0.2 0.0
b Main 128 1 0.0 0.0 0.0 0.0
c Main 126 1 0.0 0.0 0.0 0.0
d Main 124 1 0.0 0.0 0.0 0.0
readInts Main 122 1 0.0 0.0 0.0 0.0
main Main 120 1 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD 103 0 0.6 0.0 0.6 0.0
CAF GHC.IO.Encoding 96 0 0.2 0.0 0.2 0.0
CAF GHC.IO.Handle.Internals 93 0 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal 83 0 0.2 0.0 0.2 0.0
CAF GHC.IO.Encoding.Iconv 81 0 0.2 0.0 0.2 0.0
The programm is compiled and run this way:
ghc -O2 -prof -auto-all -rtsopts FastReadInts.hs
./FastReadInts +RTS -p -K800M < many_numbers.txt
many_numbers.txt is about 14MB large.
How can this bottleneck, i.e. V.fromList
, be removed?
Upvotes: 1
Views: 176
Reputation: 12908
Just wanted to note that pre-allocating mutable vector does not impact performance too much. In most cases run time will be dominated by reading file.
I have benchmarked both versions on 2^23
numbers and it seems that pre-allocated mutable array is even a bit slower.
benchmarking V.fromList
time 49.51 ms (47.65 ms .. 51.07 ms)
0.998 R² (0.995 R² .. 1.000 R²)
mean 48.24 ms (47.82 ms .. 49.01 ms)
std dev 971.5 μs (329.1 μs .. 1.438 ms)
benchmarking listToVector
time 109.9 ms (106.2 ms .. 119.9 ms)
0.993 R² (0.975 R² .. 1.000 R²)
mean 109.3 ms (107.6 ms .. 113.8 ms)
std dev 4.041 ms (1.149 ms .. 6.129 ms)
And here is the code of the benchmark:
import Control.Applicative
import Control.Monad (zipWithM_)
import System.IO.Unsafe
import Data.Int
import qualified Data.ByteString.Char8 as B
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as M
import Criterion.Main
main :: IO ()
main = do
let readInt x = let Just (i,_) = B.readInt x in fromIntegral i
nums <- map readInt . B.words <$> B.readFile "randoms.txt"
defaultMain
[bench "V.fromList" $ whnf (V.maximum . V.fromList) nums
,bench "listToVector" $ whnf (V.maximum . listToVector) nums
]
listToVector :: [Int64] -> V.Vector Int64
listToVector ls = unsafePerformIO $ do
m <- M.unsafeNew (2^23)
zipWithM_ (M.unsafeWrite m) [0..(2^23)-1] ls
V.unsafeFreeze m
Upvotes: 1
Reputation: 64750
It is hard to answer questions like this without some expected level of performance or point of comparison. By simply omitting the profiling your code runs in 100ms over an ASCii file of 21MB of random 64-bit numbers, this seems reasonable to me.
$ time ./so < randoms.txt
9223350746261547498
real 0m0.109s
user 0m0.094s
sys 0m0.013s
And the generation of the test data:
import System.Random
main = do
g <- newStdGen
let rs = take (2^20) $ randomRs (0,2^64) g :: [Integer]
writeFile "randoms.txt" $ unwords (map show rs)
EDIT:
As requested:
import Data.Vector.Unboxed.Mutable as M
...
listToVector :: [Int64] -> V.Vector Int64
listToVector ls = unsafePerformIO $ do
m <- M.unsafeNew (2^20)
zipWithM_ (M.unsafeWrite m) [0..(2^20)-1] ls
V.unsafeFreeze m
Upvotes: 2