Hazim

Reputation: 381

Finding common letters between 2 strings in Python

For a homework assignment, I have to take two user-entered strings and figure out how many letters are common (in the same position in both strings), as well as find common letters. For example, for the two strings 'cat' and 'rat', there are 2 common letter positions (positions 2 and 3 in this case), and there are also 2 common letters, because 'a' is found once and 't' is found once too.

So I made a program and it worked fine, but then my teacher updated the homework with more examples, specifically examples with repeated letters, and my program doesn't work for those. For example, with the strings 'ahahaha' and 'huhu', there are 0 common letters in the same positions, but there are 3 common letters between them (because the 'h' in string 2 appears in string 1 three times).

My whole issue is that I can't figure out how to count 'h' when it appears multiple times in the first string, and I don't know how to NOT check the SECOND 'h' in 'huhu', because it should only count unique letters, so the overall common letter count should be 2.

This is my current code:

S1 = input("Enter a string: ")
S2 = input("Enter a string: ")
i = 0
big_string = 0
short_string = 0
same_letter = 0
common_letters = 0

if len(S1) > len(S2):
    big_string = len(S1)
    short_string = len(S2)
elif len(S1) < len(S2):
    big_string = len(S2)
    short_string = len(S1)
elif len(S1) == len(S2):
    big_string = short_string = len(S1)

while i < short_string:
    if (S1[i] == S2[i]) and (S1[i] in S2):
        same_letter += 1
        common_letters += 1
    elif (S1[i] == S2[i]):
        same_letter += 1
    elif (S1[i] in S2):
        common_letters += 1
    i += 1

print("Number of positions with the same letter: ", same_letter)
print("Number of letters from S1 that are also in S2: ", common_letters)

So this code worked for strings without repeated letters, but when I try to use it with "ahahaha" and "huhu" I get 0 common positions (which makes sense) and 2 common letters (when it should be 3). I figured it might work if I tried to add the following:

while x < short_string:
    if S1[i] in S2[x]:
        common_letters += 1
    else:
        pass
    x += 1

However, this doesn't work either.

I am not asking for a direct answer or a piece of code that does this, because I want to do it on my own; I just need a couple of hints or ideas on how to approach it.

Note: I can't use any functions we haven't covered in class, and in class we've only done basic loops and strings.

Upvotes: 2

Views: 18688

Answers (5)

viswadeep

Reputation: 11

We can solve this by using one for loop inside another, as follows:

y = 0
for i in range(len(s1)):        # s1 is the big (longer) string
    for j in range(len(s2)):    # s2 is the small (shorter) string
        if s1[i] == s2[j]:
            y += 1
            break               # count each letter of s1 at most once

If you enter 'ahahaha' and 'huhu', the outer loop first takes the first character of the big string, 'a'. The inner loop then compares it with each letter of the small string in turn ('h', 'u', 'h', 'u'); none of them equals 'a', so y is not incremented. The outer loop then moves on to the next character of the big string. y is incremented in the following cases:

  1. when the second letter of the big string, 'h', is compared with the first letter of the small string, y is incremented once, i.e. y = 1;
  2. when the fourth letter of the big string, 'h', is compared with the first letter of the small string, y is incremented again, i.e. y = 2;
  3. when the sixth letter of the big string, 'h', is compared with the first letter of the small string, y is incremented again, i.e. y = 3.

The break stops the inner loop after the first match, so each 'h' in the big string is counted only once. The final output is 3. I think that is what we want.

Upvotes: 0

Mayur Buragohain

Reputation: 1615

A shorter version is this:

def gen1(listItem):
    returnValue = []
    for character in listItem:
        if character not in returnValue and character != " ":
            returnValue.append(character)
    return returnValue

st = "first string"
r1 = gen1(st)
st2 = "second string"
r2 = gen1(st2)

if len(st) > len(st2):
    print(list(set(r1).intersection(r2)))
else:
    print(list(set(r2).intersection(r1)))

Note: This is a pretty old post, but since it's got new activity, I posted my version.

Upvotes: 1

ferhatelmas

Reputation: 3978

You need a data structure like a multiset. To my knowledge, the most similar data structure in the standard library is Counter from collections.

For simple frequency counting:

>>> from collections import Counter
>>> strings = ['cat', 'rat']
>>> counters = [Counter(s) for s in strings]
>>> sum((counters[0] & counters[1]).values())
2

With index counting:

>>> counters = [Counter(zip(s, range(len(s)))) for s in strings]
>>> sum((counters[0] & counters[1]).values())
2

For your examples 'ahahaha' and 'huhu', you should get 2 and 0, respectively, since the strings share two h's, but never at the same position.
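As a quick check of those numbers, the two snippets above can be wrapped in small functions. This is only a sketch: the function names common_letters and common_positions are made up for illustration, and enumerate is used here in place of zip(s, range(len(s))) to pair each letter with its index.

```python
from collections import Counter

def common_letters(s1, s2):
    # Multiset intersection: each letter counts min(count in s1, count in s2) times.
    return sum((Counter(s1) & Counter(s2)).values())

def common_positions(s1, s2):
    # Count (index, letter) pairs that occur in both strings.
    return sum((Counter(enumerate(s1)) & Counter(enumerate(s2))).values())

print(common_letters('ahahaha', 'huhu'))    # 2
print(common_positions('ahahaha', 'huhu'))  # 0
```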

Since you can't use advanced constructs, you just need to simulate a counter with arrays.

  • Create two 26-element arrays, one per string, with every element set to 0
  • Loop over each string and increment the element at the index for each letter
  • Loop over both arrays simultaneously and sum the minimums of the corresponding elements.
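The steps above could be sketched roughly as follows. This assumes the input only needs to count the English letters a to z (anything else is ignored), and it relies on lists, ord and min, which may go beyond what has been covered in class. The function name count_common_letters is made up for this illustration.

```python
def count_common_letters(s1, s2):
    # One slot per letter 'a'..'z' for each string.
    counts1 = [0] * 26
    counts2 = [0] * 26

    # Tally how often each letter occurs in each string.
    for ch in s1.lower():
        if 'a' <= ch <= 'z':
            counts1[ord(ch) - ord('a')] += 1
    for ch in s2.lower():
        if 'a' <= ch <= 'z':
            counts2[ord(ch) - ord('a')] += 1

    # A letter is shared as many times as it occurs in BOTH strings,
    # which is the smaller of its two tallies.
    total = 0
    for i in range(26):
        total += min(counts1[i], counts2[i])
    return total

print(count_common_letters('cat', 'rat'))       # 2
print(count_common_letters('ahahaha', 'huhu'))  # 2
```

This gives the same results as the Counter version for the two example pairs.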

Upvotes: 1

TML

Reputation: 12976

You are only getting 2 because you only look at 4 of the characters in 'ahahaha' (because 'huhu', the shorter string, is only 4 characters long). Change your while loop to run up to big_string instead, and then add (len(S2) > i) and to the start of your first two conditional tests, so that S2[i] is never read past the end of S2. The last test uses in rather than an index, so it can't go out of range.

NB: All of the above implicitly assumes that len(S1) >= len(S2); that should be easy enough to ensure, using a conditional and an assignment, and it would simplify other parts of your code to do so. You can replace the first block entirely with something like:

if len(S2) > len(S1):
    S1, S2 = S2, S1
big_string = len(S1)
short_string = len(S2)
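Put together, the suggestion might look something like the sketch below. It is wrapped in a function (the name compare is made up here) so it can be tried without input(). It also simplifies the original if/elif chain into two independent tests, which should be equivalent, because whenever S1[i] == S2[i] holds, S1[i] in S2 holds too.

```python
def compare(S1, S2):
    # Make sure S1 is the longer (or equally long) string.
    if len(S2) > len(S1):
        S1, S2 = S2, S1
    big_string = len(S1)

    same_letter = 0
    common_letters = 0
    i = 0
    while i < big_string:
        # Only index into S2 while i is still inside it.
        if (len(S2) > i) and (S1[i] == S2[i]):
            same_letter += 1
        # 'in' scans all of S2, so no index check is needed here.
        if S1[i] in S2:
            common_letters += 1
        i += 1
    return same_letter, common_letters

print(compare('cat', 'rat'))       # (2, 2)
print(compare('ahahaha', 'huhu'))  # (0, 3)
```

Note that this counts every 'h' in the longer string, so repeated letters are counted each time they occur rather than once per unique letter.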

Upvotes: 0

Adam Peterson

Reputation: 1

Since you can't use arrays or lists, maybe try adding every common character to a var_string, then test if c not in var_string: before incrementing your common counter, so that you don't count the same character multiple times.
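A minimal sketch of that idea, using only loops and strings, could look like this. The function name count_unique_common is made up for illustration, and it assumes "common" means "appears at least once in both strings".

```python
def count_unique_common(S1, S2):
    var_string = ""  # letters that have already been counted
    common = 0
    for c in S1:
        # Count c only if it is in S2 and has not been counted before.
        if c in S2 and c not in var_string:
            var_string += c
            common += 1
    return common

print(count_unique_common('cat', 'rat'))       # 2
print(count_unique_common('ahahaha', 'huhu'))  # 1
```

With this approach 'ahahaha' and 'huhu' give 1, because 'h' is the only distinct letter the two strings share.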

Upvotes: 0
