killexe
killexe

Reputation: 466

Why is my Java Merge-Sort faster than my C++ implementation?

I implemented the Merge Sort in Java and in C++ and I tried to implement them as similar as possible. Both algorithms work, I tested them many times. The problem is that my Java-Implementation is a lot faster than my C++-Implementation and I am wondering why. I can't believe that Java would be faster so I guess I made a mistake in one of the implementations. In order to measure the runtime I created a class "Person" which has two String-Attributes (forename, lastname). In C++ I used std::vector<Person*> and in Java I used ArrayList<Person>. Additionally I overloaded the operator< in C++ to compare two Persons (compare the lastname, if equal compare the firstname). In Java I implemented the interface Comparable<Person> to compare two Persons.

Can you find a mistake in my code or a reason why Java would be faster or C++ would be slower? Any help would be appreciated.

My Java Code:

public void mergeSort(List<T> list) {
    if (list.size() <= 1) {
        return;
    }

    int subLength = (int) (list.size() / 2);
    List<T> first = new ArrayList<T>(list.subList(0, subLength));
    List<T> second = new ArrayList<T>(list.subList(subLength, list.size()));


    mergeSort(first);
    mergeSort(second);

    merge(first, second, list);
    return;
}

private void merge(List<T> first, List<T> second, List<T> result) {
    int firstPos = 0, secondPos = 0, resultPos = 0;

    while (firstPos < first.size() && secondPos < second.size()) {
        if (first.get(firstPos).compareTo(second.get(secondPos)) < 0) {
            result.set(resultPos, first.get(firstPos));
            firstPos++;
        } else {
            result.set(resultPos, second.get(secondPos));
            secondPos++;
        }
        resultPos++;
    }

    for (int i = firstPos; i < first.size(); i++) {
        result.set(resultPos, first.get(i));
        resultPos++;
    }
    for (int i = secondPos; i < second.size(); i++) {
        result.set(resultPos, second.get(i));
        resultPos++; 
    }
}

My C++ Code:

Note: I used two template-methods to make the mergesort usable with both Person and Person*.

template<typename T>
    T * ptr(T & obj) { return &obj; } 

    template<typename T>
    T * ptr(T * obj) { return obj; }


void mergeSort(std::vector<T> &list) {
        if (list.size() <= 1) {
            return;
        }

        int subLength = (int)(list.size() / 2);
        std::vector<T> first(list.begin(), list.begin() + subLength);
        std::vector<T> second(list.begin() + subLength, list.end());

        mergeSort(first);
        mergeSort(second);

        merge(first, second, list);

    }
void merge(const std::vector<T> &first, const std::vector<T> &second, std::vector<T> &result) {
        int firstPos = 0, secondPos = 0, resultPos = 0;
        while (firstPos < first.size() && secondPos < second.size()) {
            if (*ptr(first[firstPos]) < *ptr(second[secondPos])) {
                result[resultPos] = first[firstPos];
                firstPos++;
            }
            else {
                result[resultPos] = second[secondPos];
                secondPos++;
            }
            resultPos++;
        }

        for (int i = firstPos; i < first.size(); i++) {
            result[resultPos] = first[i];
            resultPos++;
        }
        for (int i = secondPos; i < second.size(); i++) {
            result[resultPos] = second[i];
            resultPos++;
        }
    }

Edit1 and 2:

My setup-configuration:

I used one million, 10 millionen and 20 millionen persons to test the implementations. I doesn't really matter with how many Persons I test it with, Java is always faster.

And I test it the following way: I create persons and initialize my MergeSort-class. Then I start the measurement and I call my mergeSort-method. When the sorting is finished I stop the measurement. (Deleting is happening after the time measurement). For the time measurement in Java I use System.nanoTime() and in C++ I use chrono::high_resolution_clock::time_point

Of course I compiled C++ in "Release"-Mode (Optimization: O2, faster code preferred).

My Test-PC:

Edit3:

There is one thing I forgot to mention. I implemented the algorithm in a generic way in order to use simple data-types as well as objects. When I use std::vector<int> and ArrayList<Integer> in Java then my C++ implementation is a lot faster. My first attempt was to use std::vector<Person> but it was even slower. So my guess was that it made deep copies instead of shallow ones and that's why I switched to Person* because I thought that when copying is happening only the pointers will be copied.

Upvotes: 4

Views: 761

Answers (2)

OldCurmudgeon
OldCurmudgeon

Reputation: 65803

TL;DR - The Java version does alot less array copying.

The ArrayList constructor (see line 167) of ArrayList when passed a Collection uses Collection.toArray() and optionally Arrays.copyOf if required. In the case of an ArrayList there is no need to take a copy - toArray() returns a reference to the underlying array.

Note also that the if (elementData.getClass() != Object[].class) will cause the array not to be copied again.

The Java List.subList on ArrayList objects does not copy any of the underlying array, it just returns an ArrayList backed by the original but limited to the elements required.

As a result - in may cases the whole mechanism uses sub-lists that actually just refer to the original array - no copying is required.

Not too familiar with C++ but I suspect there is a lot of array copying and allocation going on which just doesn't need to be done by Java.

ADDED - As @ThomasKläger rightly pointed out, ArrayList.toArray actually does return a defensive copy of the array - so I was wrong above.

Upvotes: 6

Jonathan Mee
Jonathan Mee

Reputation: 38919

One of the 1st things I see is in your statement:

Deleting is happening after the time measurement

You're speaking of deleting of the Person objects, you're obviously not speaking of the containers, such as first and second which C++ is creating and cleaning on the stack:

std::vector<T> first(list.begin(), list.begin() + subLength);
std::vector<T> second(list.begin() + subLength, list.end());

while Java is creating them on the heap and not cleaning them up till it gets around to it (so after you stop timing):

List<T> first = new ArrayList<T>(list.subList(0, subLength));
List<T> second = new ArrayList<T>(list.subList(subLength, list.size()));

So you're timing C++ with container cleanup and Java without.


I have to ask here, what's the point of writing your own merge sort? The best Java and C++ code would use the sorting algorithms already provided by the language. If you're looking to time something at least time the optimized algorithms.

Also I wouldn't put a lot of work into time comparisons. C++ will be faster, it will generally be more work to write as well. If speed is important enough to you to bother with timing you probably want to be using C++. If development time is king then you'll want to be using Java.

Upvotes: 3

Related Questions