Magnus Eriksson
Magnus Eriksson

Reputation: 11

Common character count giving unexpected problems

I have a simple problem where you get two strings where you need to see how many characters are common.

For s1 = "aabcc" and s2 = "dacaa", the output should be
solution(s1, s2) = 3.

I've seen a lot of people solve these types of problems, but I don't want to copy and paste. I want to learn it my way with my solution and improve upon that. I think my solution makes sense but it obviously doesn't, because it doesn't work.

This is my solution:

#include <iostream>

using namespace std;

int main()
{
    
        int counter {};
        string s1 = "aabcc";
        string s2 = "dacaa";
    
        for (int i {}; i < s1.size(); ++i)
        {
            for (int j {}; j < s2.size(); ++j)
            {
                if (s1.at(i) == s2.at(j));
                {
                    cout << "Found character " << s1.at(i) << " at index " << i << " and " << s2.at(j) << " at index " << j << endl;
                    counter++;
                    s2.erase(s2.begin()+j);
                    break;
                }
            }
            
            cout << s2 << endl;
        }
        
        cout << counter << endl;
    
        return 0;
    }

This is the output I get (I did a lot of "cout" for debugging purposes.

Found character a at index 0 and d at index 0
acaa
Found character a at index 1 and a at index 0
caa
Found character b at index 2 and c at index 0
aa
Found character a at index 3 and a at index 0

Found character a at index 4 and a at index 0

5

Let's talk about the first iteration. What I don't understand is how s1.at(i) is equal to s2.at(j) at index 0? s1.at(0) is "a" and s2.at(0) is "d". Since when was a = d? This problem is driving me crazy and I would appreciate if you could explain why my solution acts like this.

Upvotes: 0

Views: 183

Answers (2)

Vlad from Moscow
Vlad from Moscow

Reputation: 311088

There is a typo

if (s1.at(i) == s2.at(j));

Remove the semicolon.

In general the approach is bad because it changes one of the source strings.

I can suggest the following solution by means of using an additional container.

#include <iostream>
#include <string>
#include <string_view>
#include <utility>
#include <map>
#include <iterator>
#include <algorithm>
#include <numeric>

size_t common_letters( std::string_view s1, std::string_view s2 )
{
    std::map<char, std::pair<size_t, size_t>> m;

    for (char c : s1)
    {
        if ( m.count( c ) == 0 && s2.find( c ) != std::string_view::npos )
        {
            m[c].first  = std::count( std::begin( s1 ), std::end( s1 ), c );
            m[c].second = std::count( std::begin( s2 ), std::end( s2 ), c );
        }
    }

    return std::accumulate( std::begin( m ), std::end( m ), ( size_t )0,
        []( const auto &acc, const auto &item )
        {
            return acc + std::min( item.second.first, item.second.second );
        } );
}

int main()
{
    std::string s1 = "aabcc";
    std::string s2 = "dacaa";

    std::cout << "common_letters( s1, s2 ) = " << common_letters(s1, s2) << '\n';

    const char *s3 = "aabcc";
    const char *s4 = "dacaa";

    std::cout << "common_letters( s3, s4 ) = " << common_letters( s3, s4 ) << '\n';
}

The program output is

common_letters( s1, s2 ) = 3
common_letters( s3, s4 ) = 3

The if statement within the range-based for loop can be rewritten the following way to make the function more efficient

for (char c : s1)
{
    if ( auto pos = std::string_view::npos; 
         m.count( c ) == 0 && ( pos = s2.find( c ) ) != std::string_view::npos )
    {
        m[c].first  = std::count( std::begin( s1 ), std::end( s1 ), c );
        m[c].second = std::count( std::next( std::begin( s2 ), pos ), std::end( s2 ), c );
    }
}

Or if the compiler supports the C++ 20 then

for ( size_t i = 0; char c : s1)
{
    if ( auto pos = std::string_view::npos; 
         m.count( c ) == 0 && ( pos = s2.find( c ) ) != std::string_view::npos )
    {
        m[c].first  = std::count( std::next( std::begin( s1 ), i ), std::end( s1 ), c );
        m[c].second = std::count( std::next( std::begin( s2 ), pos ), std::end( s2 ), c );
    }
    i++;
}

Or instead of the range-based for loop there can be used an ordinary for loop with iterators. That is the function can look the following way

size_t common_letters( std::string_view s1, std::string_view s2 )
{
    std::map<char, std::pair<size_t, size_t>> m;

    for (  auto first = std::begin( s1 ), last = std::end( s1 ); first != last; ++first )
    {
        if ( auto pos = std::string_view::npos; 
             m.count( *first ) == 0 && ( pos = s2.find( *first ) ) != std::string_view::npos )
        {
            m[*first] = { std::count( first , last, *first ),
                          std::count( std::next( std::begin( s2 ), pos ), std::end( s2 ), *first ) };
        }
    }

    return std::accumulate( std::begin( m ), std::end( m ), ( size_t )0,
        []( const auto &acc, const auto &item )
        {
            return acc + std::min( item.second.first, item.second.second );
        } );
}

Upvotes: 2

The Coding Fox
The Coding Fox

Reputation: 1585

There is a small typo in your code as Vlad from Moscow said. The reason why that small semicolon ruins the result is that then the code:

if (s1.at(i) == s2.at(j));
{ // From here
    cout << "Found character " << s1.at(i) << " at index " << i << " and " << s2.at(j) << " at index " << j << endl;
    counter++;
    s2.erase(s2.begin()+j);
    break;
} // To here

...is not considered as the part of the if statement because a semicolon refers to the end of a statement. So the semicolon ended the if statement, and as the code in the brackets is after the if statement has been ended by the semicolon, it is not considered as a part of the if statement. I hope the explanation is clear :).

Upvotes: 0

Related Questions