Jelco Adamczyk
Jelco Adamczyk

Reputation: 107

Qt - search string in 200k-words dictionary

So, I have this text file (generated with Aspell) with 200 000 words in it. It's going to be used for a crabble game, to check if the word is legit. That means that, most likely, there's going to be quite a lot of checks where the word is not in there, and I was wondering what the most efficient way would be.

  1. Checking the textfile line per line is going to take 200 000 iterations per check, so that'd be my last choice.

  2. Getting all the words in a QList, and use the Qlist::contain() function (or QList::indexOf() since I think i'm using Qt4.8). I don't know about the efficiency of that though, and there's going to be quite a lot of memory used.

  3. Using a hashtable. I honestly am not sure how that works, so if anyone could tell wether there are Qt data types provided, I can do some research.

Are there any other, efficient methods? Currently leaning to the QList method, seems easiest to implement :)

Upvotes: 1

Views: 530

Answers (3)

Jongware
Jongware

Reputation: 22478

Sort the list -- a one time operation: save it sorted, or sort it when starting your program -- and use a binary search. Looking up any word in 200,000 items is going to take on average 17.6 lookups, with about the four first operations only having to check a single character.

Upvotes: 0

Mats Petersson
Mats Petersson

Reputation: 129524

Assuming the hash is good, using a hashtable will definitely be the fastest method (as it's a simple computation of the hash - since the string is probably not very long, that shouldn't take much time - typical English words are around 5 characters long).

There is an example in the QHash section of this page for how to hash a string: http://doc.qt.digia.com/qq/qq19-containers.html

Upvotes: 1

Erbureth
Erbureth

Reputation: 3423

You can use std::unordered_set, it performs lookups via hash table. Qt has it's own implementation of it QSet

Do not use QList or the first file traversal method, as both are orders of magnitude slower than one hashtable lookup.

Upvotes: 1

Related Questions