Rolando
Rolando

Reputation: 62624

Is sorting or inserting more efficient with joining arrays of objects in javascript?

I have multiple objects with a field called "num". Num can be any number between 1000000000 to 10000000005. I want to ensure that if I have x number of lists, all lists need to be combined in array1 in sorted ascending order based on the "num" property.

If I start with an array like this "

array1": [{item:23532532, num:1000000520},{item:23523, num:1000000620},{item:346346432, num:1000000620}]

I have a second array

"array2": [{item:23532, num:....},{item:3623, num:....}]

Assuming array2 is in sorted order by "num", is it more efficient to:

1) Add Then Sort Whole - Loop through every item in "array2" and add it to the end of "array1", then do a javascript built in "sort" function on the "num" property on the entire array?

2) Insert Into Right Place - Loop through every item in "array2" and use "if" conditions to check to see if the "num" value is greater than the current item in "array2", if so, then insert the element before that index via "splice". (No javascript built-in array sort used)

Or is there a more efficient method? Pseudocode or sample code is a plus.

Upvotes: 4

Views: 615

Answers (3)

jfriend00
jfriend00

Reputation: 707318

I measured the results from three different algorithms in three different browsers.

Two things are true about all performance related questions:

  1. If you really want to know the answer, you've got to test your specific algorithm in several browsers to really answer the question.

  2. Many performance related questions are actually not material in the given context in which they are being used so worrying about them until you know you need to worry about them is nothing more than time wasted on premature optimization or even unnecessary optimization. So, before working on a specific area of performance, you should know that it matters and know that it's worth spending time on.

That said, here are some measurements for three algorithms. This assumes you start with two arrays of objects, each sorted independently by one particular numeric property that is present in each object.

Here's the jsperf: http://jsperf.com/concat-sort-vs-insert-sort/5 that has the code for each of the three algorithms.

Algorithm 1 is concatenate and then sort the concatenated array. In JS, it's nothing more than:

var result = arr1.concat(arr2);
result.sort(sortByNum);

Algorithm 2 is an attempt at an insertion sort. The basic idea is to loop through the second array and for each item in that array, find where to insert it into the first array. Since both arrays are sorted, we only have to start looking for a place to insert the next item into the first array at the place after where we inserted the last item.

Algorithm 3 is a merge sort. The idea here is that you create an empty result array and two indexes, one for each of the two source arrays. For the value at each source index, you push into the result the lower of the two items and then increase its source index. When either source index is exhausted, you push in the rest of the other array. I was guessing that it would be more efficient than the insertion sort because it doesn't have to insert items into the middle of an array, just add onto the end which is likely a faster operation.


To run a test, I created two arrays that each contain 100 objects. Each object has a numeric property that is assigned a random number between 0 and 100,000. Each of the two source arrays are then pre-sorted. Each algorithm is then tested on these two source arrays.

And, here are the results:

enter image description here

Here's the code for the merge sort algorithm:

function mergeSort(arr1, arr2) {
    var result = [];
    var index1 = 0;
    var index2 = 0;
    if (!arr1.length) {
        return arr2.slice(0);
    } else if (!arr2.length) {
        return arr1.slice(0);
    }
    while (true) {
        if (arr1[index1].num <= arr2[index2].num) {
            result.push(arr1[index1]);
            ++index1;
            // see if we reached the end of the array
            if (index1 >= arr1.length) {
                result.push.apply(result, arr2.slice(index2));
                break;
            }
        } else {
            result.push(arr2[index2]);
            ++index2;
            // see if we reached the end of the array
            if (index2 >= arr2.length) {
                result.push.apply(result, arr1.slice(index1));
                break;
            }
        }
    }
    return result;
}

Working demo: http://jsfiddle.net/jfriend00/mja1c13d/

Upvotes: 5

ThisClark
ThisClark

Reputation: 14823

Though I don't think this will match your data perfectly well and is therefore likely not useful to you, it is worth mentioning a counting sort executes in linear time. If you knew in advance that you needed to sort integers without gaps in the data such that an object exists for every integer between 1000000000 and 10000000005 and all integers are unique, then you can exploit this by doing a math calculation to reduce 1000000000 to position zero, 1000000001 to position 1, etc... you get there by subtracting 1000000000 from each number to get the index in the zero-based array.

You end up with something like

var array = new Array(9000000005); //10000000005 - 1000000000
var obj = {"item":23532532, "num":1000000520};
var array[1000000520 - 1000000000] = obj; //goes into index 520

The subtraction operation brings your linear time complexity to 2n instead of just n. The built in Javascript sort is likely n log n time complexity. From 0 to 100 on an x-y coordinate graph, n log n is better, but at x > 100, the 2n scales ahead in performance. For 9 billion items however unrealistic this actually is for Javascript, you can see the difference over time will be drastically improved.

It's tough to capture the scale of this graph, but this gives a good idea of the comparison... y=x represents counting sort, y=2x represents counting sort with a subtraction operation. y = x log x likely represents the built-in Javascript array sort.

enter image description here

Upvotes: 2

S McCrohan
S McCrohan

Reputation: 6693

The built-in sort is likely to be substantially more efficient than writing your own insertion-sort in most cases. If that weren't so, then the native Array.sort() would be written to do this itself.

However, you could be looking at one of the exceptions, depending on the size of the two arrays. If you know that one of the two arrays is already sorted, AND the other (unsorted) array is short, AND the total size of the two arrays is large, then iterating through the short array and inserting into the large, sorted one may be more efficient.

Inserting (given that you know one list is already sorted) is going to be on the order of N1 * N2; concatenating and sorting is probably something like (N1 + N2) log (N1 + N2). Depending on those two relative sizes, it could go either way.

But unless your lists are very large, the difference is going to be below the threshold of humanly noticeable, so code maintainability argues for 'concatenate and sort'. Actual real performance measurements trump guesses, always, and performance is very subject to specifics of the data - so if you're really concerned, write both and test them using real data.

Upvotes: 3

Related Questions