Reputation: 2871
I have accepted an answer to the question below, but It seemed I misunderstood how Arrays in haskell worked. I thought they were just beefed up lists. Keep that in mind when reading the question below.
I've found that monolithic arrays in haskell are quite inefficient when using them for larger arrays.
I haven't been able to find a non-monolithic implementation of arrays in haskell. What I need is O(1) time look up on a multidimensional array.
Is there an implementation of of arrays that supports this?
EDIT: I seem to have misunderstood the term monolithic. The problem is that it seems like the arrays in haskell treats an array like a list. I might be wrong though.
EDIT2: Short example of inefficient code:
fibArray n = a where
bnds = (0,n)
a = array bnds [ (i, f i) | i <- range bnds ]
f 0 = 0
f 1 = 1
f i = a!(i-1) + a!(i-2)
this is an array of length n+1
where the i'th field holds the i'th fibonacci number. But since arrays in haskell has O(n) time lookup, it takes O(n²) time to compute.
Upvotes: 2
Views: 471
Reputation: 71119
Referring specifically to your fibArray
example, try this and see if it speeds things up a bit:
-- gradually calculate m-th item in steps of k
-- to prevent STACK OVERFLOW , etc
gradualth m k arr
| m <= v = pre `seq` arr!m
where
pre = foldl1 (\a b-> a `seq` arr!b) [u,u+k..m]
(u,v) = bounds arr
For me, for let a=fibArray 50000
, gradualth 50000 10 a
ran at 0.65 run time of just calling a!50000
right away.
Upvotes: 1
Reputation: 137987
You're confusing linked lists in Haskell with arrays.
Linked lists are the data types that use the following syntax:
[1,2,3,5]
defined as:
data [a] = [] | a : [a]
These are classical recursive data types, supporting O(n) indexing and O(1) prepend.
If you're looking for multidimensional data with O(1) lookup, instead you should use a true array or matrix data structure. Good candidates are:
Upvotes: 8
Reputation: 28097
Arrays have O(1) indexing. The problem is that each element is calculated lazily. So this is what happens when you run this in ghci:
*Main> :set +s
*Main> let t = 100000
(0.00 secs, 556576 bytes)
*Main> let a = fibArray t
Loading package array-0.4.0.0 ... linking ... done.
(0.01 secs, 1033640 bytes)
*Main> a!t -- result omitted
(1.51 secs, 570473504 bytes)
*Main> a!t -- result omitted
(0.17 secs, 17954296 bytes)
*Main>
Note that lookup is very fast, after it's already been looked up once. The array
function creates an array of pointers to thunks that will eventually be calculated to produce a value. The first time you evaluate a value, you pay this cost. Here are a first few expansions of the thunk for evaluating a!t
:
a!t -> a!(t-1)+a!(t-2)-> a!(t-2)+a!(t-3)+a!(t-2) -> a!(t-3)+a!(t-4)+a!(t-3)+a!(t-2)
It's not the cost of the calculations per se that's expensive, rather it's the need to create and traverse this very large thunk.
I tried strictifying the values in the list passed to array
, but that seemed to result in an endless loop.
One common way around this is to use a mutable array, such as an STArray. The elements can be updated as they're available during the array creation, and the end result is frozen and returned. In the vector package, the create
and constructN
functions provide easy ways to do this.
-- constructN :: Unbox a => Int -> (Vector a -> a) -> Vector a
import qualified Data.Vector.Unboxed as V
import Data.Int
fibVec :: Int -> V.Vector Int64
fibVec n = V.constructN (n+1) c
where
c v | V.length v == 0 = 0
c v | V.length v == 1 = 1
c v | V.length v == 2 = 1
c v = let len = V.length v
in v V.! (len-1) + v V.! (len-2)
BUT, the fibVec
function only works with unboxed vectors. Regular vectors (and arrays) aren't strict enough, leading back to the same problem you've already found. And unfortunately there isn't an Unboxed instance for Integer
, so if you need unbounded integer types (this fibVec
has already overflowed in this test) you're stuck with creating a mutable array in IO
or ST
to enable the necessary strictness.
Upvotes: 5