Majkeee
Majkeee

Reputation: 1282

Performance of C++ std::unordered_map vs Kotlin/Java HashMap

I have created Andorid project to measure some equivalent pices of code and this one had me stuck on why? Why is C++ 3x slower here?

I already made some tweaks to C++ as immediate try to emplace and then push_back was slower than current approach, but still. Complexities here should be same, right?

Kotlin: 139,945,691 ns (139 ms)

C++:   347,100,764 ns (347 ms)

Kotlin:

data class Record(
    val year: Int,
    val month: Int, 
    val day: Int, 
    var temperature: Double
)

val records = ArrayList<Record>()
val map = HashMap<String, ArrayList<Record>>()

records.forEach { map.getOrPut("${it.year}-${it.month}") { ArrayList() }.add(it) }

C++

typedef struct record {
    int year;
    int month;
    int day;
    double temperature;
} Record;

std::vector<Record> records;
std::unordered_map<std::string, std::vector<Record>> map;

for (const auto &record : records) {
    const std::string & key = std::to_string(record.year) + " " + std::to_string(record.month);
    const auto & it = map.find(key);

    if (it == map.end()) {
        map.emplace_hint(it, key, std::vector<Record>())->second.push_back(record);
    } else {
        it->second.push_back(record);
    }
}

// Edit

Broader C++ code: https://pastebin.com/KqD02pSD

Broader Kotlin code: https://pastebin.com/iG7hCqHT


Important Edit

I have changed key of maps to Int - [year * 100 + month]. And results are still similar; 3x slower.

Upvotes: 2

Views: 2120

Answers (1)

Michael Kenzel
Michael Kenzel

Reputation: 15951

It's always hard to say why exactly a certain piece of code performs the way it does without properly profiling. Especially when talking about C++ where there's no standard implementation and, thus, the quality of your compiler and library implementation can vary significantly depending on which toolset you're using (examples: 1, 2). Furthermore, the interface that the standard requires for unordered_map, unfortunately, severely limits what the implementation underneath can do (check out this talk for more about that). It is generally known that std::unordered_map is, unfortunately, not necessarily the best hash table one could hope for (some more on that).

Apart from the performance issues of std::unordered_map itself, there's a few minor (most likely insignificant) things that may be putting your C++ code at an additional disadvantage here. First of all, you're building the key string and then copy it into the map. It would likely be more efficient to move the string. Also, std::to_string is potentially more costly than the conversion performed by Kotlin string interpolation because std::to_string is forced to observe the current locale while Kotlin string interpolation converts to a fixed format. It generally seems rather wasteful to use strings as keys here, as also already pointed out in the comments to your question.

I would suggest to use map.try_emplace() rather than map.emplace_hint() and std::to_chars() instead of std::to_string(). Apart from that, I wouldn't be surprised if the Kotlin HashTable is just a generally more efficient container than std::unordered_map could ever be due to the limitations set by its interface…

All that being said, I'm not sure what exactly you're attempting to achieve with this benchmark. At the end of the day, what you're doing here is compare the performance of two randomly chosen hash table implementations. You will never be able to pick one over the other as they exist in two completely separate ecosystems, so whatever the results of this benchmark are, they don't seem very…useful!?

Upvotes: 2

Related Questions