How to use lists and loops to count the occurrences of dinucleotide pairs?

Question

I have a DNA text file, and I need to use lists and loops specifically to count the occurrences of each dinucleotide pair (e.g. AA, AC, AT, AG, CA, CC, etc.). I then need to use lists and loops again to print the counts to a new text file as a table with two tab-separated columns: the dinucleotide sequence and its count. I know how to do this the long way (store each pair in a variable, count its occurrences with count, then open a text file and write each individual count to it), but I am just starting to learn about lists and loops and am confused about how to do it that way.

Example: this is how I do it now, using dna1.txt, a (random) DNA sequence text file on my computer.

The sequence (the contents of dna1.txt):

agggaatcgctggtgaagaggttgtgacctcttataaccccattgttaatgaggtccacg ctaagtaatgagtggctggtataggtgacgtctagaagtcatttctgtacagttactgcc gtggatatatccattaggacgacactggggtgctcccacgcaccacgtgtacaggacgac tgcgatgatatagaaggtgagcttaaaacgttctacaaccccaatgaatcatagccgggt agattgccaggcgtgtggtaacgggtacgtggcggatctcgtccagtatgccgcagtcac acccgaatctttcgtcgactacggagcgactcgtatcgagacgggcttgaattgactcct catggattaggctgaggtcaaccttcgcatggagcctgggcatttaaaggtcgactgtcg

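# Read the whole file; the with block closes it afterwards
with open("dna1.txt") as dna_txt:
    dna_txtcontents = dna_txt.read()
# Count one pair by hand (this has to be repeated for every pair)
aa_count = dna_txtcontents.count("aa")
print(aa_count)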

I then continue for each pair and store each individual count in a new text file. How can I make this easier by using lists and loops, both to count the occurrences of each pair and to store the counts in a new text file? Also, how can I make sure the program works whether the sequence is uppercase or lowercase?

Thank you!!
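
One way the approach described above could be rewritten with lists and loops is sketched below. It is a minimal sketch under stated assumptions, not a definitive answer: the function names and the output file name dinucleotide_counts.txt are hypothetical, the input is assumed to contain only the letters a, c, g, t plus spaces and line breaks, and the sequence is lowercased so that uppercase and lowercase files give the same counts. Whitespace is removed first so that pairs split across line breaks are still found.

```python
def make_pairs(bases="acgt"):
    """Build the list of 16 dinucleotides with two nested loops."""
    pairs = []
    for first in bases:
        for second in bases:
            pairs.append(first + second)
    return pairs


def clean_sequence(text):
    """Remove spaces and line breaks, then lowercase the sequence."""
    return "".join(text.split()).lower()


def count_pairs(sequence, pairs):
    """Count each pair with str.count, which skips overlapping matches."""
    counts = []
    for pair in pairs:
        counts.append(sequence.count(pair))
    return counts


def count_pairs_overlapping(sequence, pairs):
    """Count the pair at every position, so 'aaa' contains 'aa' twice."""
    counts = [0] * len(pairs)
    for i in range(len(sequence) - 1):
        window = sequence[i:i + 2]
        if window in pairs:
            counts[pairs.index(window)] += 1
    return counts


def write_table(pairs, counts, out_path):
    """Write one 'PAIR<tab>count' line per dinucleotide."""
    with open(out_path, "w") as out_file:
        for pair, count in zip(pairs, counts):
            out_file.write(pair.upper() + "\t" + str(count) + "\n")


def main(in_path="dna1.txt", out_path="dinucleotide_counts.txt"):
    """Read the sequence, count every pair, and save the table."""
    with open(in_path) as dna_file:
        sequence = clean_sequence(dna_file.read())
    pairs = make_pairs()
    counts = count_pairs(sequence, pairs)
    write_table(pairs, counts, out_path)
```

Calling main() with dna1.txt in the working directory would write the table. Note that count_pairs keeps the behaviour of the original count call, which does not count overlapping matches; if every position should be counted, count_pairs_overlapping can be swapped in instead.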
