Koen

Reputation: 49

Python - String search

Let's suppose I have these two strings:

IAMASTRIPETHA-IWANTTOIGN-RE

IAMA-TRIPETHATIWA-TTOIGNORE

If I ignore the positions of the '-' characters, these two strings are the same. How can I check for this in Python 2.7?

IAMASTRIPETHA-IWANTTOIGN-RE

IAMA-TRIPETHATIWA-TTOALGORE

The two strings above still differ when the '-' positions are ignored, so I'm not interested in that pair.

Hope someone can help me out ;)

PS: Apologies for not mentioning this earlier, but the strings are not required to have equal length!

Upvotes: 1

Views: 146

Answers (3)

PM 2Ring

Reputation: 55499

Judging from the comments, you don't care whether the string lengths match, so we don't need to test lengths, and we can use the built-in zip() rather than importing izip_longest() from itertools (named zip_longest() in Python 3).

s1 = 'IAMASTRIPETHA-IWANTTOIGN-RE'
s2 = 'IAMA-TRIPETHATIWA-TTOIGNORE'
s3 = 'IAMA-TRIPETHATIWA-TTOALGORE'

def ignore_dash_match(s1, s2):
    return all(c1 == c2 for c1, c2 in zip(s1, s2) if c1 != '-' and c2 != '-')

print ignore_dash_match(s1, s2), ignore_dash_match(s1, s3)

output

True False
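If mismatched lengths should count as a mismatch after all, the zip_longest() route mentioned above could look roughly like the sketch below. The name ignore_dash_match_strict is made up for illustration, and the try/except import is only there so the same code runs on Python 2.7 and Python 3. The shorter string is padded with None, so an unmatched tail only passes if it consists of '-' characters.

```python
try:
    from itertools import izip_longest as zip_longest  # Python 2.7
except ImportError:
    from itertools import zip_longest  # Python 3

def ignore_dash_match_strict(s1, s2):
    # Hypothetical length-aware variant: the shorter string is padded
    # with None, which never equals a character, so leftover characters
    # other than '-' make the comparison fail.
    return all(
        c1 == c2 or c1 == '-' or c2 == '-'
        for c1, c2 in zip_longest(s1, s2)
    )
```

With this version 'ABC' and 'ABCD' do not match, while 'ABC' and 'ABC-' still do, because the extra character is a dash.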

Here's an alternative approach which converts each '-' to a "wildcard" object that compares equal to anything.

s1 = 'IAMASTRIPETHA-IWANTTOIGN-RE'
s2 = 'IAMA-TRIPETHATIWA-TTOIGNORE'
s3 = 'IAMA-TRIPETHATIWA-TTOALGORE'

class Any:
    def __eq__(self, other):
        return True

def dash_to_Any(s):
    return [Any() if c == '-' else c for c in s]

print dash_to_Any(s1) == dash_to_Any(s2), dash_to_Any(s1) == dash_to_Any(s3)

output

True False

You could make that slightly more efficient by using a single instance of Any, rather than creating a fresh one every time. For a better version of Any, please see my answer to "Searching for a partial match in a list of tuples".
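As a rough sketch of the single-instance idea (this is not the improved Any from the linked answer), one shared wildcard object can be reused for every dash. The names ANY and dash_to_any are chosen here for illustration; the class inherits from object and also defines __ne__ so that it behaves consistently on both Python 2.7 and Python 3.

```python
class Any(object):
    """Wildcard that compares equal to every other object."""
    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False

# One shared wildcard, reused for every '-' instead of a fresh Any() each time.
ANY = Any()

def dash_to_any(s):
    # Replace each '-' with the shared wildcard; keep other characters as-is.
    return [ANY if c == '-' else c for c in s]
```

Comparing the resulting lists with == works as before. Note that list equality is still False when the two lists differ in length.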

And of course, if you don't care about mismatched lengths, you can do:

def ignore_dash_match(s1, s2):
    return all(c1 == c2 for c1, c2 in zip(dash_to_Any(s1), dash_to_Any(s2)))

Upvotes: 1

Delgan

Reputation: 19717

a = "IAMASTRIPETHA-IWANTTOIGN-RE"
b = "IAMA-TRIPETHATIWA-TTOIGNORE"

all(x==y or x=="-" or y=="-" for x, y in zip(a, b))
>> True

Upvotes: 4

Maroun

Reputation: 96016

I won't write a full solution for you, but I'll point you in the right direction.

You need to find the indexes of "-" in both strings, then replace the characters at those indexes with the empty string in each string.

To find the positions of "-" in a given string, you can use:

import re

def get_indexes(st):
    return [m.start() for m in re.finditer('-', st)]

Now replace the characters at these indexes in the other string with the empty string as well, and compare the results.

This solution is robust and doesn't assume anything about length.
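One possible way to put these steps together is sketched below. The helper strip_positions and the final ignore_dash_match are hypothetical names, not part of the answer above, and the sketch assumes that the dash positions of both strings are removed from both strings before comparing.

```python
import re

def get_indexes(st):
    # Positions of every '-' in the string.
    return [m.start() for m in re.finditer('-', st)]

def strip_positions(st, positions):
    # Hypothetical helper: drop the characters whose index is in positions.
    return ''.join(c for i, c in enumerate(st) if i not in positions)

def ignore_dash_match(s1, s2):
    # Ignore every position where either string has a '-'.
    skip = set(get_indexes(s1)) | set(get_indexes(s2))
    return strip_positions(s1, skip) == strip_positions(s2, skip)
```

Unlike the zip()-based answers, this compares the whole remaining strings, so strings of different length only match when the extra characters fall on ignored positions.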

Upvotes: 0
