Reputation: 13853
I'm looking for a way to find the longest word (based on length) in a text document using the STL and Boost.
Here is my solution. However, it isn't good at all; there are too many operations (tokenizing, copying, sorting, ...). Is there a simpler way to solve this problem?
// utility and memory
#include <utility>
#include <functional>
#include <memory>
#include <locale>
#include <string>
// data structure and algorithm
#include <stack>
#include <vector>
#include <list>
#include <set>
#include <map>
#include <deque>
#include <bitset>
#include <algorithm>
#include <iterator>
// numeric
#include <complex>
#include <numeric>
#include <valarray>
// input/output
#include <iostream>
#include <iomanip>
#include <ios>
#include <iosfwd>
#include <streambuf>
#include <sstream>
// standard C
#include <cctype>
#include <cmath>
#include <climits>
#include <cstdlib>
#include <ctime>
#include <cassert>
#include <cstring>
// boost
#include <boost/tokenizer.hpp>
int main() {
    std::string str = "Test, test!, test...string";
    boost::char_separator<char> sep( ",!,.-" );
    boost::tokenizer<boost::char_separator<char> > tokens( str, sep );
    std::vector<std::string> res;
    std::copy( tokens.begin(), tokens.end(), std::back_inserter( res ) );
    std::sort( res.begin(), res.end(),
        []( const std::string& l, const std::string& r ) { return l.length() > r.length(); } );
    std::cout << "longest : " << *res.begin() << "\n";
    return 0;
}
Best regards,
Chan
Upvotes: 2
Views: 2258
Reputation: 6375
You can use std::max_element. Just give it the iterator pair and the by-length comparator; note that std::max_element expects a less-than style comparison, so the > in the comparator you've already written has to be flipped to <.
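A minimal sketch of this, reusing the tokenizer from the question (only the comparator changes):
#include <algorithm>
#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>
int main() {
    std::string str = "Test, test!, test...string";
    boost::char_separator<char> sep( ",!,.-" );
    boost::tokenizer<boost::char_separator<char> > tokens( str, sep );
    // max_element walks the token range once and returns an iterator
    // to the longest token (less-by-length comparator)
    auto longest = std::max_element( tokens.begin(), tokens.end(),
        []( const std::string& l, const std::string& r ) { return l.length() < r.length(); } );
    if ( longest != tokens.end() )
        std::cout << "longest : " << *longest << "\n";
    return 0;
}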
Upvotes: 3
Reputation: 2801
That's one way, and it works, but there's one problem with it: it runs in O(n lg n), which is the complexity of the sorting algorithm.
A trivial O(n) algorithm is to scan the tokens one by one and keep track of the longest word seen so far. At the end of the loop you'll have one of the longest words in the set; I say one because there may be more than one word with the same length, and this way you'd only capture the first one you encounter.
You might want to modify the algorithm in order to keep track of all the longest words seen at each step.
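A minimal sketch of that idea, assuming the same tokenizer as in the question and keeping every word that ties for the maximum length:
#include <iostream>
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>
int main() {
    std::string str = "Test, test!, test...string";
    boost::char_separator<char> sep( ",!,.-" );
    boost::tokenizer<boost::char_separator<char> > tokens( str, sep );
    std::vector<std::string> longest;   // every word of maximal length seen so far
    std::size_t max_length = 0;
    for ( const std::string& word : tokens ) {
        if ( word.length() > max_length ) {         // new maximum: start a fresh list
            max_length = word.length();
            longest.assign( 1, word );
        } else if ( word.length() == max_length ) { // tie: remember this word too
            longest.push_back( word );
        }
    }
    for ( const std::string& word : longest )
        std::cout << word << "\n";
    return 0;
}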
Upvotes: 2
Reputation: 1587
Rather than storing and sorting the entire array of word tokens, you can just take the length of each token as you read it and compare it with the maximum length seen so far. This reduces your space requirements and eliminates the sorting cost entirely.
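A sketch of that, under the assumption that the text is read line by line from an input stream (a space is added to the separator set since whole lines are tokenized); only the current best word is ever stored:
#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>
int main() {
    boost::char_separator<char> sep( " ,!.-" );
    std::string line, longest;
    std::size_t max_length = 0;
    while ( std::getline( std::cin, line ) ) {
        boost::tokenizer<boost::char_separator<char> > tokens( line, sep );
        for ( const std::string& word : tokens ) {
            if ( word.length() > max_length ) {  // keep only the running maximum
                max_length = word.length();
                longest = word;
            }
        }
    }
    std::cout << "longest : " << longest << "\n";
    return 0;
}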
Upvotes: 1