camino

Reputation: 10594

What is the fastest way to replace multiple words in a very long string simultaneously?

Suppose I have a string that contains more than 10,000 words, for example the full text of the famous novel "The Old Man and the Sea",

and a dictionary that has 1,000 word pairs, for example:

before, after
--------------
fish, net
black, white
good, bad
....
....
round, rect

What I want to do is, according to the dictionary, replace every 'fish' in the string with 'net', every 'black' with 'white', and so on.

The simplest and most intuitive algorithm is:

// one full pass over the string for every dictionary entry
for (const auto &line : dict)
    str.replace(line.before, line.after);

but it is very inefficient.

One improvement I can think of is to split the string into several smaller strings, process each one in its own thread, and then combine the results.

Are there any other ideas?

By the way, I am using Qt.

Upvotes: 3

Views: 1303

Answers (1)

hank

Reputation: 9873

I think it's better to store the text as a vector of its 10k words rather than as a single string of characters.

Just like this:

QVector<QString> myLongString;

Your dictionary can be implemented as a hash table:

QHash<QString, QString> dict;

This gives constant-time lookup (on average) for your dictionary words:

QString replaceWith = dict.value("fish"); // replaceWith == "net"

Then you can iterate through your vector and replace words:

for (int i = 0; i < myLongString.size(); ++i)
{
    QString word = myLongString[i];
    // replace the word only if the dictionary has an entry for it
    if (dict.contains(word))
    {
        myLongString[i] = dict.value(word);
    }
}
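
If the text has to stay a single string, the same hash-lookup idea can be applied in one pass over it. The following is only a minimal sketch under some assumptions: it uses std::string and std::unordered_map as stand-ins for QString and QHash so that it compiles without Qt, it treats any run of ASCII letters or digits as a "word", and replaceWords is a hypothetical helper name that does not come from the question or from Qt.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper (an assumption, not part of Qt or of the posts above):
// replaces every whole word found in `dict` during one left-to-right scan.
std::string replaceWords(const std::string &text,
                         const std::unordered_map<std::string, std::string> &dict)
{
    std::string out;
    out.reserve(text.size());

    std::size_t i = 0;
    while (i < text.size()) {
        if (std::isalnum(static_cast<unsigned char>(text[i]))) {
            // Collect one maximal run of letters/digits as a word.
            const std::size_t start = i;
            while (i < text.size() &&
                   std::isalnum(static_cast<unsigned char>(text[i]))) {
                ++i;
            }
            const std::string word = text.substr(start, i - start);

            // Hash lookup: constant time on average.
            const auto it = dict.find(word);
            out += (it != dict.end()) ? it->second : word;
        } else {
            // Copy spaces and punctuation through unchanged.
            out += text[i++];
        }
    }
    return out;
}
```

Because every word is looked up once against the original text, the replacements behave as if they happened at the same time: with both fish -> net and net -> fish in the dictionary, "fish net" becomes "net fish" rather than collapsing to a single word. Words that merely contain a key, such as "fishing", are left alone. In a Qt program the same loop should carry over to QString and QHash<QString, QString>, though that port is not shown here.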

Upvotes: 2
