Maximusrain

Reputation: 33

Avoid reading punctuation from a file in C++

I'm trying to find the longest word in a file in C++. I have a solution for that, but the code also counts punctuation as part of a word, and I don't know how to avoid this.

This is the function "get_the_longest_word()":

string get_the_longest_word(const string &file_name){
int max=0;
string s,longest_word;
ifstream inputFile(file_name);

if(inputFile.is_open())
{
    while(inputFile>>s)
    {
        if(s.length()>max)
        {
            max=s.length();
            s.swap(longest_word);
        }
    }
    inputFile.close();
}else
    cout<<"Error while opening the file!!\n";

return longest_word;}

Thanks in advance for the help

Upvotes: 0

Views: 524

Answers (1)

A M

Reputation: 15277

C++ has long offered a good way to describe the character patterns that make up a word: std::regex, available since C++11. It is easy to use and very versatile.

A word consisting of one or more alphanumeric characters can be defined simply as \w+. Nothing more is needed. Note that \w also matches the underscore. If you want a different pattern, for example [A-Za-z]+ for letters only, that is just as easy to write.
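To illustrate what \w+ does to a single token, here is a minimal sketch. The helper name firstWordIn is hypothetical and not part of the program further down; it only shows how the pattern picks the word out of a token such as "hello," that operator>> would otherwise return with its punctuation attached.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name chosen for illustration only):
// returns the first run of \w characters found in token,
// or an empty string if the token contains no such character.
std::string firstWordIn(const std::string& token) {
    static const std::regex wordPattern{ "\\w+" };
    std::smatch match;
    if (std::regex_search(token, match, wordPattern)) {
        return match.str();   // the matched text, without surrounding punctuation
    }
    return {};
}
```

Something like this could be applied to each s read in the original loop before its length is compared, if you prefer to keep reading token by token.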

For a program like yours, the extra complexity and runtime cost of a regex should be negligible, so it is a reasonable tool to use here.

In addition, there is a very nice iterator for walking over all matches of such a pattern in a std::string: std::sregex_token_iterator. It makes life really simple, because it lets us use the many algorithms that the C++ standard library provides.
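As a rough sketch of how the iterator behaves on its own, the following collects every match into a vector. The function name extractWords is an assumption made for this example; a default-constructed std::sregex_token_iterator serves as the end of the range.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration only):
// collects every \w+ match found in text, in order of appearance.
std::vector<std::string> extractWords(const std::string& text) {
    static const std::regex wordPattern{ "\\w+" };
    // The first iterator points at the first match in text
    const std::sregex_token_iterator first(text.begin(), text.end(), wordPattern);
    // A default-constructed iterator marks the end of the sequence
    const std::sregex_token_iterator last{};
    // Each match converts to std::string when the vector is filled
    return std::vector<std::string>(first, last);
}
```

With this pattern an apostrophe also ends a word, so "It's" is split into "It" and "s". If that matters for your text, the pattern would need to be adjusted.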

For example, std::max_element takes two iterators (and, optionally, a comparison function) and returns an iterator to the largest element in the given range. This is exactly what we need.
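A small sketch of std::max_element with a length-based comparison, shown here on a plain vector of strings so that the algorithm can be seen in isolation. The name longestOf is hypothetical. The check against end() is there because the algorithm returns the end iterator for an empty range, and that iterator must not be dereferenced.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration only):
// returns the longest string in words, or an empty string if words is empty.
// If several strings share the maximum length, the first one is returned.
std::string longestOf(const std::vector<std::string>& words) {
    const auto longest = std::max_element(words.begin(), words.end(),
        [](const std::string& a, const std::string& b) { return a.size() < b.size(); });
    // For an empty range max_element returns end(), which must not be dereferenced
    return longest != words.end() ? *longest : std::string{};
}
```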

And then the whole program boils down to just a few simple statements.

Please see:

#include <iostream>
#include <fstream>
#include <string>
#include <iterator>
#include <regex>
#include <algorithm>

const std::regex re{ "\\w+" };

std::string getLongestWord(const std::string& fileName) {

    std::string result{};

    // Open the file and check whether it could be opened (if with initializer requires C++17)
    if (std::ifstream ifs{ fileName }; ifs) {

        // Read complete file into a string. Use range constructor of string
        std::string text(std::istreambuf_iterator<char>(ifs), {});

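        // Iterate over all words in the text. A default-constructed iterator marks the end
        const std::sregex_token_iterator first(text.begin(), text.end(), re);
        const std::sregex_token_iterator last{};

        // Get the longest word by comparing the lengths of the matches
        const auto longest = std::max_element(first, last,
            [](const std::ssub_match& m1, const std::ssub_match& m2) { return m1.length() < m2.length(); });

        // A file without any word yields no match. Never dereference the end iterator
        if (longest != last) result = longest->str();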

    } // Error, file could not be opened
    else std::cerr << "\n*** Error. Could not open file '" << fileName << "'\n\n";
    
    return result;
}

int main() {
    std::cout << getLongestWord("text.txt") << '\n';
}

Upvotes: 1
