ericmjl

Reputation: 14694

How come my double for loop doesn't work?

I wrote the nested for loop shown below, but it doesn't work.

Basically, I have a list of constructs, and a list of primers. The primers are associated with the constructs by means of a "construct number" and a "part number". (Each construct is composed of multiple parts.) For each part, there is a "forward" and a "reverse" primer. For the molecular biology-inclined SO members out there, I'm basically writing a script to help me with PCRs.

What I'm trying to do is this: I want to search the list of primers for those primers that should be associated with the construct's part, and join them together into one master list. For example, if I have a list with EMP792 (fw) and EMP793 (re) inside it (they are on separate lines), and they are associated with construct #1's part #2 in my construct list, I want to be able to search the "primers_list" for the corresponding fw and re primers. If the construct's part does not have associated primers inside the list, I want to skip past those constructs first.

The strategy I used is this: I did a nested for loop. For each construct in the construct list, I wanted it to search through the primer list for the fw and re primers. I know this is inefficient, but as a beginner programmer, that's the only way I could come up with. I included some conditionals to check whether a primer existed for those constructs, by checking the construct number and part number associated with the primers.

The problem I'm facing is this: for each construct in the list, the inner loop doesn't search the entire primers_list. It seems to skip every primer that was already compared and only looks at the next primer not yet compared. As a result, if you run the code with the associated data sets (pasted below the code), a construct that should have an associated primer printed out doesn't get one, and it's giving me a headache trying to figure out what's going wrong (lol, haha...)!

I'd appreciate any help please!

CODE:

with open('constructs-to-make-shortened2.csv', 'rU') as constructs:
    construct_list = csv.DictReader(constructs)

    with open('primers-with-notes-names.csv', 'rU') as primers:
        primers_list = csv.DictReader(primers)

        #make list of constructs for checking later on#
##        construct_numbers_list = []
##        for row in primers_list:
##            construct_numbers_list.append(row['construct number'])
##
##        print(construct_numbers_list)


        for construct in construct_list:
##            print('Currently at construct number ' + construct['Construct'])
##            print('Construct counter at ' + str(construct_counter))
##            print('Part number counter is at ' + str(part_number))
            master_row = {}
            master_row['construct'] = construct['Construct']
            master_row['strategy'] = construct['Strategy']
            master_row['construct name'] = construct['Construct Name']
            master_row['sequence'] = construct['Sequence']
            master_row['source'] = construct['Source']
            master_row['content'] = construct['Content']


            print('We are at construct number ' + str(construct['Construct']))
            print('Construct counter is at ' + str(construct_counter))
            is_next_construct = (int(construct['Construct']) > construct_counter)
            print('Are we at the next construct?')
            print(is_next_construct)

            if is_next_construct:
                part_number = 1
                construct_counter = int(construct['Construct'])
            print('Part number is now ' + str(part_number))

            for primer in primers_list:
                print(primer)


##                    print('Is primer ' + str(primer['name']) + ' associated with the construct?')
                is_associated_with_construct = bool(primer['construct number'] == construct['Construct'] and str(primer['part number']) == str(part_number))
##                    print(is_associated_with_construct)
                if(is_associated_with_construct == False):
                    break

                is_forward = bool(primer['construct number'] == construct['Construct'] and str(primer['part number']) == str(part_number) and primer['direction'] == 'fw primer')

                print('Primer ' + str(primer['name']) + ' is a forward primer?')
                print(is_forward)

                is_reverse = bool(primer['construct number'] == construct['Construct'] and str(primer['part number']) == str(part_number) and primer['direction'] == 're primer')

                print('Primer ' + str(primer['name']) + ' is a reverse primer?')
                print(is_reverse)

                if is_forward:
                    master_row['primer1'] = primer['name']
                    master_row['primer1 sequence'] = primer['primer sequence']
                    master_row['primer1 description'] = primer['notes']
                    master_row['primer1 length'] = primer['length']
##                        print(master_row)
                    continue

                elif is_reverse:
                    master_row['primer2'] = primer['name']
                    master_row['primer2 sequence'] = primer['primer sequence']
                    master_row['primer2 description'] = primer['notes']
                    master_row['primer2 length'] = primer['length']
##                        print(master_row)
                    part_number += 1
                    print('Part number now = ' + str(part_number) + '\n')
                    master_list.append(master_row)
                    break

DATA SUBSET (constructs) (exact sequences eliminated to keep within SO character limits):

{'Sequence': '', 'Construct': '12', 'Strategy': 'Gibson', 'Content': 'Amp resistance marker', 'Source': 'pEM096', 'Construct Name': 'T7 RNAP core on BAC ori only with AmpR'}
{'Sequence': '', 'Construct': '12', 'Strategy': 'Gibson', 'Content': 'BAC origin and T7 RNAP core', 'Source': 'THSS301', 'Construct Name': 'T7 RNAP core on BAC ori only with AmpR'}
{'Sequence': '', 'Construct': '13', 'Strategy': 'Cut Gibson', 'Content': 'lycopene pathway (crtE.B.I.dxs.idi)', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid'}
{'Sequence': '', 'Construct': '13', 'Strategy': 'Cut Gibson', 'Content': 'vioABE pathway and pSC101 ori and CmR;  digest with EcoRI and XbaI', 'Source': 'KT-587', 'Construct Name': 'Combined vio and lyc plasmid'}
{'Sequence': '', 'Construct': '14', 'Strategy': 'Cut Gibson', 'Content': 'lycopene pathway (crtE.B.I.dxs.idi)', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid, with lyc in reverse direction'}
{'Sequence': '', 'Construct': '14', 'Strategy': 'Cut Gibson', 'Content': 'vioABE pathway and pSC101 ori and CmR;  digest with EcoRI and XbaI', 'Source': 'KT-587', 'Construct Name': 'Combined vio and lyc plasmid, with lyc in reverse direction'}
{'Sequence': '', 'Construct': '15', 'Strategy': 'Gibson', 'Content': 'vioABE pathway with random nucleotide spacers', 'Source': 'KT-587', 'Construct Name': 'Combined vio and lyc plasmid made by high GC polymerase'}
{'Sequence': '', 'Construct': '15', 'Strategy': 'Gibson', 'Content': 'lycopene pathway (crtE.B.I.dxs.idi)', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid made by high GC polymerase'}
{'Sequence': '', 'Construct': '15', 'Strategy': 'Gibson', 'Content': 'pSC101 origin of replication and CmR resistance marker', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid made by high GC polymerase'}
{'Sequence': '', 'Construct': '16', 'Strategy': 'Gibson', 'Content': 'P(tac)-SynZip18-T7 fragment', 'Source': 'THSS303', 'Construct Name': 'P(tac)-T7 fragment controller'}
{'Sequence': '', 'Construct': '16', 'Strategy': 'Gibson', 'Content': 'IncW backbone and TpR resistance and lacIq', 'Source': 'pEM103', 'Construct Name': 'P(tac)-T7 fragment controller'}
{'Sequence': '', 'Construct': '17', 'Strategy': 'Gibson', 'Content': 'P(tac)-SynZip18-T3 fragment', 'Source': 'THSS304', 'Construct Name': 'P(tac)-T3 fragment controller'}
{'Sequence': '', 'Construct': '17', 'Strategy': 'Gibson', 'Content': 'IncW backbone and TpR resistance and lacIq', 'Source': 'pEM103', 'Construct Name': 'P(tac)-T3 fragment controller'}

DATA SUBSET (primers):

{'part number': '1', 'direction': 'fw primer', 'name': 'EMP790', 'primer sequence': 'gtttgtcggtgaactaattCttattaccaatgcttaatcagggaggcacctatctcagcg', 'notes': 'Fw Gibson primer on pEM096 to extract Amp resistance marker', 'length': '60', 'construct number': '12'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP787', 'primer sequence': 'gatgaggatcgtttcgcatgctaaatacattcaaatatctatccgctcatgagacaataa', 'notes': 'Re Gibson primer on pEM096 to extract Amp resistance marker', 'length': '60', 'construct number': '12'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP788', 'primer sequence': 'agatatttgaatgtatttagcatgcgaaacgatcctcatcctgtctcttgatcagatctt', 'notes': 'Fw Gibson primer on THSS301 to extract BAC and R6K origins and T7 RNAP core', 'length': '60', 'construct number': '12'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP791', 'primer sequence': 'tgattaagcattggtaataaGaattagttcaccgacaaacaacagataaaacgaaaggcc', 'notes': 'Re Gibson primer on THSS301 to extract BAC origin and T7 RNAP core', 'length': '60', 'construct number': '12'}
{'part number': '1', 'direction': 'fw primer', 'name': 'EMP792', 'primer sequence': 'aaggaatattcagcaatttgGTTGGGGATAGCGCTAGCTATAATAactaTCACTATAGGG', 'notes': 'Fw Gibson primer on KT-587 to extract vioABE pathway with random nucleotide spacers', 'length': '60', 'construct number': '15'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP793', 'primer sequence': 'gggcctttcttcggcacgggGTTGTAGCAGGCGTCTTTGTCAAAAAACCCCTCAAGACCC', 'notes': 'Re Gibson primer on KT-587 to extract vioABE pathway with random nucleotide spacers', 'length': '60', 'construct number': '15'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP794', 'primer sequence': 'ACAAAGACGCCTGCTACAACcccgtgccgaagaaaggcccacccgtgaaggtgagccagt', 'notes': 'Fw Gibson primer on KT-537 to extract lycopene pathway (crtE.B.I.dxs.idi)', 'length': '60', 'construct number': '15'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP795', 'primer sequence': 'gaggtcattactggatctaTcccgtgccgaagaaaggcccacccgtgaaggtgagccagt', 'notes': 'Re Gibson primer on KT-537 to extract lycopene pathway (crtE.B.I.dxs.idi)', 'length': '60', 'construct number': '15'}
{'part number': '3', 'direction': 'fw primer', 'name': 'EMP796', 'primer sequence': 'gggcctttcttcggcacgggAtagatccagtaatgacctcagaactccatctggatttgt', 'notes': 'Fw Gibson primer on KT-537 to extract pSC101 origin of replication and CmR resistance marker', 'length': '60', 'construct number': '15'}
{'part number': '3', 'direction': 're primer', 'name': 'EMP797', 'primer sequence': 'TAGCTAGCGCTATCCCCAACcaaattgctgaatattccttttcttagacgtcaggtggca', 'notes': 'Re Gibson primer on KT-537 to extract pSC101 origin of replication and CmR resistance marker', 'length': '60', 'construct number': '15'}
{'part number': '1', 'direction': 'fw primer', 'name': 'EMP798', 'primer sequence': 'aaatattctgaaatgagctgttgacaattaatcatcggctcgtataatgtgtggaattgt', 'notes': 'Fw Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '16'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP799', 'primer sequence': 'attaccgcctttgagtgagccccaatgataaccccaagggaagttttagtcaaaagcctc', 'notes': 'Re Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '16'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP800', 'primer sequence': 'cccttggggttatcattggggctcactcaaaggcggtaatcagataaaaaaaatccttag', 'notes': 'Fw Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '16'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP801', 'primer sequence': 'agccgatgattaattgtcaacagctcatttcagaatatttgccagaaccgttatgatgtc', 'notes': 'Re Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '16'}
{'part number': '1', 'direction': 'fw primer', 'name': 'EMP798', 'primer sequence': 'aaatattctgaaatgagctgttgacaattaatcatcggctcgtataatgtgtggaattgt', 'notes': 'Fw Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '17'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP799', 'primer sequence': 'attaccgcctttgagtgagccccaatgataaccccaagggaagttttagtcaaaagcctc', 'notes': 'Re Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '17'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP800', 'primer sequence': 'cccttggggttatcattggggctcactcaaaggcggtaatcagataaaaaaaatccttag', 'notes': 'Fw Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '17'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP801', 'primer sequence': 'agccgatgattaattgtcaacagctcatttcagaatatttgccagaaccgttatgatgtc', 'notes': 'Re Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '17'}

Upvotes: 0

Views: 166

Answers (1)

Lev Levitsky

Reputation: 65791

The problem is that you are iterating over a csv.DictReader object, which is not a list, but rather an iterator.

The difference between the two is that an iterator can only be consumed once: you cannot "go back to the beginning". Each time the inner loop starts, iteration over primers_list resumes where it left off the previous time, and once the reader is exhausted the inner loop body never runs again.
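To see the one-pass behaviour in isolation, here is a minimal sketch. It assumes Python 3 and uses a made-up two-row in-memory CSV (via io.StringIO) in place of the real primers file; only the column names are borrowed from the question.

```python
import csv
import io

# Hypothetical stand-in for the primers CSV file (not the real data).
primers_csv = io.StringIO(
    "name,construct number\n"
    "EMP790,12\n"
    "EMP792,15\n"
)

reader = csv.DictReader(primers_csv)

# The first loop consumes every row of the reader.
first_pass = [row["name"] for row in reader]

# The second loop finds nothing left: the reader does not rewind itself.
second_pass = [row["name"] for row in reader]

print(first_pass)
print(second_pass)
```

The second list comes out empty, which is the same thing that happens to the inner loop on later constructs.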

If you want to be able to iterate over all items multiple times and if you have sufficient memory, store them in a list:

primers_list = list(csv.DictReader(primers))
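As a rough illustration of the list approach, the sketch below matches primers to constructs with a nested loop. It assumes Python 3; the in-memory CSV contents and the `matches` dictionary are invented for the example and are much smaller than the real files.

```python
import csv
import io

# Hypothetical miniature versions of the two CSV files.
constructs_csv = io.StringIO("Construct\n12\n15\n")
primers_csv = io.StringIO(
    "name,construct number\n"
    "EMP790,12\n"
    "EMP787,12\n"
    "EMP792,15\n"
)

# Materialise both readers so they can be iterated any number of times.
construct_list = list(csv.DictReader(constructs_csv))
primers_list = list(csv.DictReader(primers_csv))

matches = {}
for construct in construct_list:
    number = construct["Construct"]
    # The inner loop now starts from the first primer on every pass.
    matches[number] = [
        primer["name"]
        for primer in primers_list
        if primer["construct number"] == number
    ]

print(matches)
```

Because primers_list is a real list here, every construct gets compared against every primer.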

If you want to keep the memory usage low, you can create the DictReader object from scratch every time inside the loop. However, this will add some (probably minor) overhead in execution time, and you should take care of closing the file by moving the with statement into the loop.
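A sketch of that variant might look like the following. It assumes Python 3 and writes a small throwaway file with tempfile so the example runs on its own; the file contents, the list of construct numbers, and the `found` dictionary are all hypothetical.

```python
import csv
import os
import tempfile

# Write a tiny hypothetical primers file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write("name,construct number\nEMP790,12\nEMP792,15\n")
    primers_path = tmp.name

found = {}
for construct_number in ["12", "15"]:
    # Reopen the file on each outer iteration: the with statement closes it
    # again, and the new DictReader starts from the first data row.
    with open(primers_path, newline="") as primers:
        for primer in csv.DictReader(primers):
            if primer["construct number"] == construct_number:
                found[construct_number] = primer["name"]

os.remove(primers_path)  # clean up the throwaway file
print(found)
```

Only one row is held in memory at a time, at the cost of re-reading the file once per construct.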

Another way would be to do primers.seek(0) at the end of the loop body, so that it starts reading from the beginning of the file on the next iteration, but I'm not sure if it's a good hack.
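One caveat with seek(0), sketched below under the assumption of Python 3 and a made-up in-memory file: the DictReader has already taken its field names from the first line, so after rewinding it hands that header line back as if it were a data row. Skipping one line after the seek avoids that.

```python
import csv
import io

# Hypothetical stand-in for the opened primers file.
primers = io.StringIO("name,construct number\nEMP790,12\nEMP792,15\n")
primers_list = csv.DictReader(primers)

first = [row["name"] for row in primers_list]

# Plain rewind: the header line is now read as an ordinary row.
primers.seek(0)
naive = [row["name"] for row in primers_list]

# Rewind and skip the header line before reusing the reader.
primers.seek(0)
next(primers)
clean = [row["name"] for row in primers_list]

print(first)
print(naive)
print(clean)
```

The middle list contains an extra 'name' entry that comes from the header line, while the last one matches the first pass.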

Upvotes: 4
