user2858650
user2858650

Reputation:

Building a Better ArrayList

When you insert into an ArrayList at the beginning or middle of the list you are always going to have O(n) performance but an insert at the end is able to maintain O(1) because you are just tacking it onto the end of the list.

I've got a high level idea of how to make an insert at the beginning O(1) and make the O(n) inserts on average of O(n/2) but I'm not sure it's possible with the underlying array structure.

My Idea

Since the ArrayList is actually just a larger array that "resizes" itself when it reaches a threshold why couldn't it resize itself at both the front and end of the list rather than only at the end? This would allow inserts at the beginning to be O(1) while also allowing you to rewrite an insert in the middle to take the shortest route to insert itself. (For instance if your inserting at index 1, it would be easier to move 0 over than to move 2 to n).

The answer I think to why Sun didn't do this originally is because of the way the Arrays.copyOf and by extension System.arrayCopy are written. Both start at the beginning of the list and copy into a larger array. I've taken a look at the Sun source code for these two methods but it gets a little advanced for my Java skill level.

Questions

In my mind there are two main questions as to whether this would work or not:

1) Firstly, can Arrays.copyOf or System.arrayCopy be rewritten/overloaded to insert a subarray into the middle of a larger array?

2) What would the Big O ramifications of doing that be?

So is this a good idea or am I missing something obvious that would make this unworkable?

Thanks in advance!

Upvotes: 2

Views: 123

Answers (3)

Stefan Haustein
Stefan Haustein

Reputation: 18803

Yes, you could make insertion at the begin as efficient as insertion at the end. You just need to start filling the backing array in the middle instead of at index 0. You don't need any different System.arraycopy for this, see example code below.

Instead of just storing the backing array (data) and the current size, you also store the current start offset, and take it into account accordingly.

BetterArrayList<T> extends AbstractList<T> {
  private int size;   // Used elements in data.
  private int start;  // Position of element 0 in data.
  private T[] data;   // data.length is the current total capacity.

  public BetterArrayList<T>(int initialCapacity) {
    data = new Object[initialCapacity];
    start = initialCapacity / 2;
    size = 0;
  }

  public void add(int index, T value) {
    if (index < size / 2) {  // Insert at the front or end?
      if (start == 0) {
        realloc(data.length * 2);
      }
      System.arraycopy(data, start, data, start - 1, index);
      start--;
    } else {
      if (size + start == data.length) {
        realloc(data.length * 2);
      }
      System.arraycopy(data, start + index, 
                       data, start + index + 1, size - index);
    }
    data[start + index] = value;
    size++;
  }    

  public T get(int index) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException();
    }
    return data[start + index];
  }

  private void realloc(int newCapacity) {
    T[] newData = new Object[newCapacity];
    int newStart = newCapacity - size / 2;
    System.arraycopy(data, start, newData, newStart, size);
    start = newStart;
    data = newData;
  }

  public T set(int index, T value) {
    Value old = get(index);
    data[size + index] = value;
    return old;
  }

  public int size() {
    return size;
  }
}

I would guess that Sun did not go this route because it wastes more memory in the typical use case for array lists (which is appending at the end). And you only gain more efficient insertions at the front, inserting anywhere else would still be O(n) because the contents still need to be shifted.

BTW: Depending on your use case, but you may be interested in the Rope data structure: http://en.wikipedia.org/wiki/Rope_%28data_structure%29

Upvotes: 1

mwhs
mwhs

Reputation: 5978

What you describe does not match the idea of an ArrayList.

The order of the elements of an ArrayList are defined by the order in which they are inserted (in which case you will always add them to the end, unless you specify an index of course).

If the order is defined by the value of the members of the objects and not the order of insertion, then you want a LinkedList.

Upvotes: 1

Marko Topolnik
Marko Topolnik

Reputation: 200166

Your idea is entirely workable and it would make a lot of sense for a deque implementation, where the common case is to both add and remove elements at both ends. It doesn't make much sense for an ArrayList because 99% of its use cases only ever add to the end.

As for your questions about the nature of System.arraycopy, I am not sure I fully understand you since

a) yes, you can use arraycopy to overwrite the middle of one array with the contents of another, and

b) arraycopy doesn't insert into an array because that operation has no meaning on Java's fixed-size arrays.

The closest to the idea of "inserting into an array" is creating a whole new, larger array and arraycopying the appropriate ranges from the two input arrays. The resulting performance is the best you can get.

Upvotes: 2

Related Questions