Azia

Reputation: 17

Approximate pattern matching?

I am trying to write code for approximate pattern matching. My attempt is below:

def HammingDistance(p, q):
    d = 0
    for p, q in zip(p, q): # your code here
        if p!= q:
            d += 1
    return d
Pattern = "ATTCTGGA"
Text = "CGCCCGAATCCAGAACGCATTCCCATATTTCGGGACCACTGGCCTCCACGGTACGGACGTCAATCAAAT"
d = 3
def ApproximatePatternMatching(Pattern, Text, d):
    positions = [] # initializing list of positions
    for i in range(len(Text) - len(Pattern)+1):
        if Pattern == Text[i:i+len(Pattern)]:
            positions.append(i)# your code here
    return positions
print (ApproximatePatternMatching(Pattern, Text, d))

I keep getting the following error: Failed test #3. You may be failing to account for patterns starting at the first index of text.

Test Dataset:

GAGCGCTGG
GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT
2

Your output:

['[]', '0']

Correct output:

['0', '30', '66']

I cannot figure out what I am doing wrong. I am still learning Python and have little programming experience. Can anyone help?

Upvotes: 1

Views: 1788

Answers (2)

niksy

Reputation: 445

def ApproximatePatternMatching(Pattern, Text, d):
    positions = []
    for i in range(len(Text) - len(Pattern) + 1):
        # Window of exactly len(Pattern) characters starting at i
        x = Text[i:i+len(Pattern)]
        # An exact match has distance 0, so it is counted as well
        y = HammingDistance(Pattern, x)
        if y <= d:
            positions.append(i)
    return positions


def HammingDistance(p, q):
    # Count the positions at which p and q differ
    count = 0
    for i in range(len(p)):
        if p[i] != q[i]:
            count = count + 1
    return count

Upvotes: 0

Stygies

Reputation: 131

I'm unsure why you're getting an empty list as one of your outputs - when I run your code above I only get [0] as the print out.

More importantly, your code currently checks only for an exact substring match and never calls the HammingDistance function you defined.

The following should return the result you expect:

Pattern = "GAGCGCTGG"
Text = "GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT"
d = 2  # maximum number of mismatches, as given in the test dataset

def HammingDistance(p, q):
    # count the positions at which the two strings differ
    d = 0
    for a, b in zip(p, q):
        if a != b:
            d += 1
    return d

def ApproximatePatternMatching(Pattern, Text, d):
    positions = [] # initializing list of positions
    for i in range(len(Text) - len(Pattern)+1):
        # compare the distance against d (at most d mismatches), rather than exact matching
        if HammingDistance(Pattern, Text[i:i+len(Pattern)]) <= d:
            positions.append(i)
    return positions

print (ApproximatePatternMatching(Pattern, Text, d))
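
For comparison, here is a more compact sketch of the same idea. The snake_case names are my own and not part of the exercise, and I am assuming the task means "at most d mismatches" (which is what the test dataset with d = 2 suggests). It also prints the positions separated by spaces instead of as a Python list, in case the grader expects that format; that part is a guess.

```python
def hamming_distance(p, q):
    """Count the positions at which p and q differ (zip stops at the shorter string)."""
    return sum(a != b for a, b in zip(p, q))


def approximate_pattern_matching(pattern, text, d):
    """Return every start index where pattern matches text with at most d mismatches."""
    k = len(pattern)
    return [
        i
        for i in range(len(text) - k + 1)  # every window of length k, including index 0
        if hamming_distance(pattern, text[i:i + k]) <= d
    ]


# Test dataset from the question
pattern = "GAGCGCTGG"
text = (
    "GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGG"
    "TCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT"
)
result = approximate_pattern_matching(pattern, text, 2)
print(*result)  # positions separated by spaces
```

With d = 0 this reduces to the exact matching in the question's code, so the only real change is comparing the Hamming distance against d. The question lists 0, 30 and 66 as the correct positions for this dataset.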

Upvotes: 2
