Digsby

Reputation: 151

How do I get my inner for loop to iterate every time my outer for loop iterates?

I have two files, and I am trying to append the strings from the last column of the second file (feats.bed) to a nested list that holds the information from the first file (reads.bed). A string should be appended only if the number in the second column of the second file falls between the numbers in the second and third columns of the first file.

Here are my files:

reads.bed:

chromA  10      69      read1
chromA  10      35      read2
chromA  10      55      read3
chromA  15      69      read4
chromA  80      119     read5
chromA  80      111     read6
chromA  90      119     read7
chromA  101     119     read8

feats.bed:

chromA  10      19      feat1
chromA  30      39      feat2
chromA  50      69      feat3
chromA  80      89      feat4
chromA  100     119     feat5

Here is my code:

feat_bed=open("feats.bed","r")
read_bed=open("reads.bed","r")


read_coords=[]
for line in read_bed.readlines():
    line=line.strip()
    line=line.split("\t")
    read_coords.append([int(line[1]),int(line[2]),str(line[3]),[]])


for read in read_coords:
    for feat in feat_bed.readlines():
        feat=feat.strip()
        feat=feat.split("\t")
        if int(read[1]) > int(feat[1]) >= int(read[0]):
            read[3].append(str(feat[3]))
    print read

My expected output would be:

[10, 69, 'read1', ['feat1', 'feat2', 'feat3']]
[10, 35, 'read2', ['feat1', 'feat2']]
[10, 55, 'read3', ['feat1', 'feat2', 'feat3']]
[15, 69, 'read4', ['feat2', 'feat3']]
[80, 119, 'read5', ['feat4', 'feat5']]
[80, 111, 'read6', ['feat4', 'feat5']]
[90, 119, 'read7', ['feat5']]
[101, 119, 'read8', []]

Instead, my inner for loop seems to iterate only the first time, and then it stops, so my actual output is:

[10, 69, 'read1', ['feat1', 'feat2', 'feat3']]
[10, 35, 'read2', []]
[10, 55, 'read3', []]
[15, 69, 'read4', []]
[80, 119, 'read5', []]
[80, 111, 'read6', []]
[90, 119, 'read7', []]
[101, 119, 'read8', []]

I don't understand why my inner loop stops iterating after the first iteration of my outer loop. If someone could point out what I'm doing wrong that would be super helpful. Thanks.

Upvotes: 1

Views: 74

Answers (2)

Tomerikoo

Reputation: 19414

This happens because readlines() reads all lines from the current position in the file. So after the first call to readlines, the file pointer is at the end of the file and all subsequent calls to readlines() will return an empty list.
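
To see the effect in isolation, here is a minimal sketch written for Python 3 (the question's code uses the Python 2 print statement). The helper name read_twice and the two-line temporary file are assumptions made for illustration, not part of the original code. It calls readlines() twice on the same handle and then rewinds with seek(0):

```python
import os
import tempfile

def read_twice(path):
    """Call readlines() repeatedly on one handle and return each result."""
    with open(path, "r") as handle:
        first = handle.readlines()   # consumes everything up to end of file
        second = handle.readlines()  # pointer is already at the end, so this is []
        handle.seek(0)               # rewind to the start of the file
        third = handle.readlines()   # the lines are available again
    return first, second, third

# Hypothetical two-line sample standing in for feats.bed
with tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False) as tmp:
    tmp.write("chromA\t10\t19\tfeat1\nchromA\t30\t39\tfeat2\n")
    sample_path = tmp.name

first, second, third = read_twice(sample_path)
os.remove(sample_path)
print(len(first), len(second), len(third))  # 2 0 2
```

Calling feat_bed.seek(0) before each inner loop would therefore also work, but it re-reads the file from disk for every read.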

Read the lines into a list once, before the loops, with feat_lines = feat_bed.readlines(), and then iterate over that saved list in the inner loop with for feat in feat_lines:.
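
Put together, the approach could look roughly like the sketch below. It is written for Python 3, and the function names parse_bed and annotate_reads are hypothetical. The sample rows are copied from the question and passed in as lists so the example runs without the actual files:

```python
def parse_bed(lines):
    """Split tab-separated BED lines into (start, end, name) tuples."""
    records = []
    for line in lines:
        fields = line.strip().split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        records.append((int(fields[1]), int(fields[2]), fields[3]))
    return records

def annotate_reads(read_lines, feat_lines):
    """Attach to each read the features whose start lies in [read_start, read_end)."""
    feats = parse_bed(feat_lines)  # parsed once, reused for every read
    result = []
    for start, end, name in parse_bed(read_lines):
        hits = [f_name for f_start, _, f_name in feats if start <= f_start < end]
        result.append([start, end, name, hits])
    return result

# A few rows copied from the question's reads.bed and all of feats.bed
read_lines = [
    "chromA\t10\t69\tread1\n",
    "chromA\t10\t35\tread2\n",
    "chromA\t101\t119\tread8\n",
]
feat_lines = [
    "chromA\t10\t19\tfeat1\n",
    "chromA\t30\t39\tfeat2\n",
    "chromA\t50\t69\tfeat3\n",
    "chromA\t80\t89\tfeat4\n",
    "chromA\t100\t119\tfeat5\n",
]

for row in annotate_reads(read_lines, feat_lines):
    print(row)
```

With the real files, read_lines and feat_lines would each come from a single readlines() call on reads.bed and feats.bed, ideally inside a with open(...) block so the handles are closed afterwards.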

Upvotes: 1

Raymond Reddington

Reputation: 1837

Using a nested inner loop with indentation:

feat_bed=open("feats.bed","r")
read_bed=open("reads.bed","r")

# Read the feature lines once; calling readlines() again on the
# same handle inside the loop would return an empty list.
feat_lines=feat_bed.readlines()

for line in read_bed.readlines():
    line=line.strip()
    line=line.split("\t")
    read = [int(line[1]),int(line[2]),str(line[3]),[]]

    # Compare this read against every feature
    for feat in feat_lines:
        feat=feat.strip()
        feat=feat.split("\t")
        if int(read[1]) > int(feat[1]) >= int(read[0]):
            read[3].append(str(feat[3]))
    print read

Upvotes: 0
