sina

Reputation: 3

Counting repeated patterns from a text file in Python?

I need to count occurrences of a specific pattern in a text file. My code is:

sequence = input ("Enter a sequence valid: ")
with open(original_file, 'r') as read_obj:
    for line in read_obj:
        count = 0
        for i in range(len(line)):
            if sequence.upper() == (line[i:i + len(sequence)].upper()):
                count += 1
        print(f"({count}) {line.upper()}", end=' ')

The output is:

Enter a sequence valid : ACA
(0) CATGTCGTAGCTAGCTACTGTACTATTATTATCTGGATCGTAC
 (0) CTATGCGATGCTGACGTATCTAGCTACGTATCGTAGCTGATCTATCGATCGTATCGA
 (0) CATGCTAGTCTAGCTAGCTAGCTAGCGTAGCTACTGAGTCGATC
 (3) ACACACCCCACATTCTCGTACGATTTTCGGCGCGGGGCGGCCTATTATCTGCAT
 (2) ACACAC
 (0) TGTGTG
 (15) ACACACACACACACACACACACACACACACAC
 (1) TAGACAGTCGATCGACTGCAGCTTCG
 (0) CCACCATGGGTGG
 (0) AAAAATTTT
 (0) GGGG
 (0) AAAA

I need the total number of matches across all lines, not just the count for each line. For example, in this case the total for ACA is 21. My text file is:

CATGTCGTAGCTAGCTACTGTACTATTATTATCTGGATCGTAC
CTATGCGATGCTGACGTATCTAGCTACGTATCGTAGCTGATCTATCGATCGTATCGA
CATGCTAGTCTAGCTAGCTAGCTAGCGTAGCTACTGAGTCGATC
ACACACCCCACATTCTCGTACGATTTTCGGCGCGGGGCGGCCTATTATCTGCAT
ACACAC
TGTGTG
ACACACACACACACACACACACACACACACAC
TAGACAGTCGATCGACTGCAGCTTCG
CCACCATGGGTGG
AAAAATTTT
GGGG
AAAA

Upvotes: 0

Views: 48

Answers (1)

Matus Dubrava

Reputation: 14492

If you need to count all occurrences, including overlapping ones, you can do something like this:

s = """CATGTCGTAGCTAGCTACTGTACTATTATTATCTGGATCGTAC
CTATGCGATGCTGACGTATCTAGCTACGTATCGTAGCTGATCTATCGATCGTATCGA
CATGCTAGTCTAGCTAGCTAGCTAGCGTAGCTACTGAGTCGATC
ACACACCCCACATTCTCGTACGATTTTCGGCGCGGGGCGGCCTATTATCTGCAT
ACACAC
TGTGTG
ACACACACACACACACACACACACACACACAC
TAGACAGTCGATCGACTGCAGCTTCG
CCACCATGGGTGG
AAAAATTTT
GGGG
AAAA"""

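def get_occurrences(s, pattern="ACA"):
    # Slide a window of len(pattern) over s; every match counts, even overlapping ones.
    counter = 0
    # The "+ 1" makes sure the last possible window is also checked.
    for i in range(len(s) - len(pattern) + 1):
        if s[i:i + len(pattern)] == pattern:
            counter += 1
    return counter

# Sum the per-line counts to get the total for the whole text.
print(sum(get_occurrences(line) for line in s.split("\n")))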

This will give you the desired 21.
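The same idea can be adapted to a user-supplied sequence and case-insensitive matching, as in the question. The following is only a minimal sketch: the names `count_overlapping` and `total_in_lines` are made up for illustration, and it uses `str.find` with a start offset instead of slicing. The sample lines are the ones from the question.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, overlapping ones included (case-insensitive)."""
    text, pattern = text.upper(), pattern.upper()
    if not pattern:
        return 0  # an empty pattern would otherwise match at every position
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Move forward by one character only, so overlapping matches are found.
        start = text.find(pattern, start + 1)
    return count


def total_in_lines(lines, pattern):
    """Sum the overlapping counts over any iterable of lines."""
    return sum(count_overlapping(line.strip(), pattern) for line in lines)


sample_lines = [
    "CATGTCGTAGCTAGCTACTGTACTATTATTATCTGGATCGTAC",
    "CTATGCGATGCTGACGTATCTAGCTACGTATCGTAGCTGATCTATCGATCGTATCGA",
    "CATGCTAGTCTAGCTAGCTAGCTAGCGTAGCTACTGAGTCGATC",
    "ACACACCCCACATTCTCGTACGATTTTCGGCGCGGGGCGGCCTATTATCTGCAT",
    "ACACAC",
    "TGTGTG",
    "ACACACACACACACACACACACACACACACAC",
    "TAGACAGTCGATCGACTGCAGCTTCG",
    "CCACCATGGGTGG",
    "AAAAATTTT",
    "GGGG",
    "AAAA",
]

print(total_in_lines(sample_lines, "aca"))  # 21 for the question's data
```

Since iterating over an open file yields its lines, a file object can be passed as `lines` directly, for example `with open(original_file) as f: total_in_lines(f, sequence)`, where `original_file` and `sequence` are the variables from the question.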

Upvotes: 1
