cubetwo1729

Reputation: 1516

Is array access always constant-time / O(1)?

From Richard Bird, Pearls of Functional Algorithm Design (2010), page 6:

For a pure functional programmer, an update operation takes logarithmic time in the size of the array. To be fair, procedural programmers also appreciate that constant-time indexing and updating are only possible when the arrays are small.

Under what conditions do arrays have non-constant-time access? Is this related to CPU cache?
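
For concreteness, here is a minimal Haskell sketch of the situation Bird describes (my own illustration, not from the book): a persistent "array" backed by Data.Map, a balanced tree, where every operation walks a root-to-leaf path, so non-destructive update and indexing are both O(log n), versus the O(1) indexing of a flat machine array.

    import qualified Data.Map.Strict as M

    -- A persistent "array": a balanced binary search tree keyed by index.
    type PArray a = M.Map Int a

    fromList :: [a] -> PArray a
    fromList xs = M.fromList (zip [0 ..] xs)

    -- O(log n): returns a new version; the old array remains usable.
    update :: Int -> a -> PArray a -> PArray a
    update = M.insert

    -- O(log n) lookup instead of the O(1) pointer arithmetic of a C array.
    index :: PArray a -> Int -> Maybe a
    index arr i = M.lookup i arr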

Upvotes: 4

Views: 1908

Answers (4)

Vikas Gupta

Reputation: 4465

Can oranges be red?

Yes, they can be red for a number of reasons:

  • You color them red.
  • You grow a genetically modified variety.
  • You grow them on Mars, the red planet, where everything is supposed to look red.
  • The list goes on, from the practical (given today's technology) to the impractical (fiction, or a future reality)...

The point is, I think the question you are asking is really about two orthogonal concepts, namely:

Big O Notation - "In mathematics, big O notation describes the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions."

vs

The practicalities (hardware and software) a good software engineer should be aware of while architecting/designing their app and writing code.

In other words, while the concept of Big O notation can be called academic, it is the most appropriate way of classifying an algorithm's complexity (time/space), and that's where it ends. There is no need to muddy the waters with orthogonal concerns.

To be clear, I am not saying that one should not be aware of the under-the-hood implementation details and workings of things that affect the performance of the software you write, but there is no point in mixing the two together. For example, does it make sense to say:

Arrays do not have constant-time access (with indexes) because:

  • Large arrays do not fit in the CPU cache, and hence incur the high cost of cache misses.
  • On a system under memory pressure, the array, big or small, may have been swapped out from physical memory to hard disk, so an access is impacted not only by a cache miss but also by a hard page fault.
  • On a system under extreme CPU load, the thread reading the array can be pre-empted, and may not get a chance to execute for several seconds.
  • A hypothetical OS that backs its memory not just with disk, but with additional memory on another computer in a far corner of the world, would make array access unimaginably slow.

As with my orange example, I hope the point I am trying to make becomes clear as you read through these increasingly absurd examples.

Conclusion - Any day, I'd answer the question "Do arrays have constant-time O(1) access (with indexes)?" with yes. Without any doubt, or ifs and buts, they do.


EDIT:

To put it another way - if O(1) is not the answer, then neither is O(log n), O(n log n), O(n^2), or O(n^3)... and certainly not 42.

Upvotes: 2

Ira Baxter

Reputation: 95354

Most modern machine architectures try to offer small, unit-time access to memory.

They fail. Instead we see layers of caches with differing speeds.

The problem is simple: the speed of light. If you need an enormous memory [array] (in the extreme, imagine a memory the size of the Andromeda galaxy), it will take enormous space, and light cannot cross enormous space in short periods of time. Information cannot travel faster than the speed of light. You are screwed by physics from the start.

So the best you can do is build part of the memory "nearby", where light takes only fractions of a nanosecond to traverse (thus registers and L1 cache), and part of the memory far away (the disk drive). Other practical complications ensue, such as capacitance (think inertia), which slows down access to things further away.

Now, if you are willing to take the access time of your farthest memory element as "unit" time, then yes, access to everything takes the same amount of time, i.e., O(1). In practical computing, we treat RAM this way most of the time, and we leave out other, slower devices to avoid screwing up our simple model.

Then you discover people who aren't satisfied with that, and voilà, you have people optimizing for cache-line access. So it may be O(1) in theory, and it acts like O(1) for small arrays (those that fit in the first level of cache), but it often is not O(1) in practice.
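
To make that concrete, here is a rough Haskell sketch (my own illustration, not a rigorous benchmark - no warm-up, single run): both loops perform exactly n reads of the same unboxed array, each O(1) in the RAM model, yet the cache-friendly sequential walk is typically several times faster than the large-stride walk on real hardware.

    {-# LANGUAGE BangPatterns #-}
    import           Control.Exception   (evaluate)
    import           Data.Time.Clock     (diffUTCTime, getCurrentTime)
    import qualified Data.Vector.Unboxed as V

    -- Visit all n indices in steps of `stride` (chosen coprime to n,
    -- so every element is read exactly once), summing as we go.
    sumStride :: V.Vector Double -> Int -> Double
    sumStride v stride = go 0 0 0
      where
        n = V.length v
        go !i !idx !acc
          | i == n    = acc
          | otherwise = go (i + 1) ((idx + stride) `mod` n)
                           (acc + V.unsafeIndex v idx)

    timeIt :: String -> Double -> IO ()
    timeIt label x = do
      t0 <- getCurrentTime
      _  <- evaluate x  -- force the lazy sum inside the timed region
      t1 <- getCurrentTime
      putStrLn (label ++ ": " ++ show (diffUTCTime t1 t0))

    main :: IO ()
    main = do
      let n = 16 * 1024 * 1024  -- 128 MB of Doubles, far larger than cache
          v = V.generate n fromIntegral :: V.Vector Double
      _ <- evaluate (V.sum v)   -- force the vector before timing anything
      timeIt "sequential (stride 1)" (sumStride v 1)
      timeIt "strided (stride 4097)" (sumStride v 4097)

Compiled with optimizations (-O2), the strided run comes out several times slower on typical hardware, even though both runs perform the same number of "O(1)" reads; only the access order differs.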

An extreme practical case is an array that doesn't fit in main memory; now an array access may cause paging from the disk.

Sometimes we don't care even in that case. Google is essentially a giant cache. We tend to think of Google searches as O(1).

Upvotes: 6

Andras

Reputation: 3055

If a language supports sparse arrays, access to the array would have to go through a directory, and a tree-structured directory would have non-constant access time. Or did you mean realistic conditions? ;-)

Upvotes: 0

Nicola Bizzoca

Reputation: 1135

He is talking about computation models, and in particular the word-based RAM machine.

A RAM machine is a formalization of something very similar to an actual computer: we model the computer's memory as a big array of memory words of w bits each, and we can read/write any word in O(1) time.

But we still have something important left to define: how large should a word be?

We need a word size w ≥ Ω(log n) to be able at least to address the n parts of our input. For this reason, word-based RAMs normally assume a word length of Θ(log n).

But having the word length of your machine depend on the size of the input seems strange and unrealistic.

What if we keep the word length fixed? Then even following a pointer needs Ω(log n) time just to read the entire pointer.

We need Ω(log n) words to store a pointer, and therefore Ω(log n) time just to access an element of the input.
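
As a worked version of that arithmetic (my summary, assuming the standard transdichotomous convention for the word RAM):

    \[ \text{a pointer into the input occupies } \lceil \log_2 n \rceil \text{ bits} \]
    \[ w = \Theta(\log n) \;\Rightarrow\; \left\lceil \frac{\log_2 n}{w} \right\rceil = O(1) \text{ words per pointer} \;\Rightarrow\; O(1) \text{ time per access} \]
    \[ w = O(1) \;\Rightarrow\; \left\lceil \frac{\log_2 n}{w} \right\rceil = \Theta(\log n) \text{ words per pointer} \;\Rightarrow\; \Theta(\log n) \text{ time per access} \]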

Upvotes: 1
