Reputation: 13
I'm trying out the GC Content problem over at Rosalind. I figured I could read the contents of the provided FASTA file into a list, then use the list to make a dictionary with the Rosalind ID as the key and the ATGC string as the value. To do that, I read the file into a string variable, remove the newlines, and split at the '>'. My problem is that the returned list has an empty entry at index 0.
with open('rosalind_gc.txt', 'r') as fileA:
    contents = fileA.read()
contents = contents.replace('\n', '')
listA = contents.split('>')
print(listA)
Upvotes: 0
Views: 77
Reputation: 4814
Does the contents string start with >? If it does, contents.split(">") will return ["", ...], because there is nothing before the first separator, which would explain why listA[0] is ''.
Based on your comment, this seems to be the case. If you wish to remove all the '' items, simply add listA = [element for element in listA if element] after the split.
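To illustrate the behaviour described above, here is a minimal sketch. The sample string and its record IDs are made up for the example and stand in for the file contents after the newlines have been removed.

```python
# Hypothetical sample standing in for the file contents with newlines removed.
contents = ">Rosalind_0001ACGT>Rosalind_0002GGCC"

# Nothing precedes the first '>', so the first element of the result is ''.
parts = contents.split(">")

# Keep only the non-empty elements.
filtered = [element for element in parts if element]

print(parts)
print(filtered)
```

Since only the first element can be empty when the string starts with the separator, slicing with parts[1:] would also work in that specific case; the comprehension is the more general option.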
Upvotes: 1
Reputation: 17322
If you want to eliminate the empty strings, you could use:
listA = [e for e in contents.split('>') if e]
Upvotes: 0
Reputation:
If your text file begins with ">", this is expected. For example:
'>test'.split(">")
will return:
['', 'test']
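For the stated goal of a dictionary keyed by Rosalind ID, one possible approach is sketched below. It splits on '>' before removing the newlines, because stripping the newlines first would join each ID to its sequence. The function name parse_fasta and the sample records are hypothetical, and the sketch assumes each record has its ID alone on the first line.

```python
def parse_fasta(text):
    """Return a {record_id: sequence} dictionary for FASTA-formatted text."""
    records = {}
    for chunk in text.split(">"):
        # Skip the empty piece that precedes the first '>'.
        if not chunk.strip():
            continue
        # The first line is the ID; the remaining lines are the sequence.
        header, *sequence_lines = chunk.splitlines()
        records[header.strip()] = "".join(line.strip() for line in sequence_lines)
    return records


# Hypothetical sample data in place of rosalind_gc.txt.
sample = ">Rosalind_0001\nACGT\nTTGA\n>Rosalind_0002\nGGCC\n"
records = parse_fasta(sample)
print(records)
```

With a real file, the text argument would be the result of fileA.read(), passed in without replacing the newlines.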
Upvotes: 0