Reputation: 63
I'm just starting to learn Python in a bioinformatics research lab. My first project is to write a program that generates random DNA sequences, taking the sequence length and the number of copies as parameters. The sequences then need to be output in FASTA format.
For those unfamiliar, a DNA sequence is made up of four "letters": A, G, C, T. Example DNA sequence: ACGTTCCGTACGTACTCT
I am really new to all this and would like some advice on how to approach it and on how to learn Python in general (rely on tutorials, do random projects, etc.).
I am currently using someone else's program for my DNA sequence project and then I will go through line by line to see what's being done.
The first error I encountered when copying over the code was this:
>>> import random
>>> import sys
>>> def simulate_sequence (length) :
    dna = ['A','G','C','T']
    sequence = ''
    for i in range (length) :
        sequence += random.choice (dna)
    return sequence
>>> setsize = int (sys.argv[1])
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    setsize = int (sys.argv[1])
IndexError: list index out of range
>>>
Thank you.
Upvotes: 1
Views: 2209
Reputation: 4233
import random

dna = ['A', 'G', 'C', 'T']

def generateDNA(N):
    # Pick N random bases and join them into one string.
    result = [random.choice(dna) for _ in range(N)]
    return "".join(result)

print(generateDNA(100))
Example output (the sequence is random, so yours will differ):
TTCGCGGACGGTTCATCAGCCCTAGCCGGTTAAGAACTATCGAGCCACCCTAAGAACGGTCCATATTTGGAGTGTTACAACTTTGGATCTTCTACGTTGC
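To get from a single random string to the FASTA output the question asks for, something like the following might work. This is only a sketch: the names random_dna and to_fasta, the seq_1, seq_2, ... record IDs, and the 60-character line width are assumptions made for illustration, not part of the code above.

```python
import random

BASES = "ACGT"

def random_dna(length):
    """Return a random DNA string of the given length."""
    return "".join(random.choice(BASES) for _ in range(length))

def to_fasta(sequences, width=60):
    """Format sequences as FASTA text: a '>' header, then wrapped sequence lines."""
    lines = []
    for number, seq in enumerate(sequences, start=1):
        lines.append(f">seq_{number}")  # hypothetical record ID
        # Wrap long sequences so no line exceeds `width` characters.
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"

# Three random sequences of length 100, printed as FASTA records.
copies = [random_dna(100) for _ in range(3)]
print(to_fasta(copies), end="")
```

Each record starts with a header line beginning with ">", followed by the sequence itself. Writing the returned text to a file instead of printing it would give a FASTA file.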
Upvotes: 0
Reputation: 40982
I use Biopython for this:
import random
from Bio.Seq import Seq
def random_seq(N=180):
    return Seq("".join(random.choice("ATCG") for _ in range(N)))
Upvotes: 1
Reputation: 32429
sys.argv is a list of the arguments passed to your program.
For instance, this program (called amt.py):
import sys
print (sys.argv)
will behave like this:
$ ./amt.py
['./amt.py']
$ ./amt.py 1
['./amt.py', '1']
$ ./amt.py 1 abc
['./amt.py', '1', 'abc']
$ ./amt.py 1 abc 33
['./amt.py', '1', 'abc', '33']
The problem with your code is that it expects sys.argv to have an item at index 1, but you haven't given it any command-line arguments, so it tries to read a position in the list that doesn't exist.
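One way to avoid the IndexError is to check the length of the list before indexing into it. The sketch below assumes a helper called read_setsize and a fallback value of 10; both are made up for illustration and are not in the original script.

```python
def read_setsize(argv, default=10):
    """Return int(argv[1]) if an argument was given, otherwise the default."""
    if len(argv) > 1:
        return int(argv[1])
    return default  # assumed fallback; the original script has none

# Simulated argument lists, shaped like the sys.argv examples above.
print(read_setsize(["./amt.py"]))         # no argument given, prints 10
print(read_setsize(["./amt.py", "100"]))  # prints 100
```

In a real script you would import sys and call read_setsize(sys.argv).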
Upvotes: 0
Reputation: 59
First of all, I would recommend this book.
The error comes from the fact that this program was written to be run from the command line, not typed into the interactive interpreter. sys.argv[1] gets the 1st command-line argument (technically the 2nd item in the list, because the first is the name of the program). In the interactive shell no command-line arguments are passed, so that item doesn't exist. Paste the code into a text editor, save it (for example as DNA.py), and run it from the command line, for example: python DNA.py 100
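If the script ends up taking both a length and a number of copies, the standard-library argparse module is one option for reading them. The sketch below is only an illustration: the positional argument length and the --copies option are assumed names, and the argument list is passed in by hand so the example also runs inside the interpreter.

```python
import argparse

def build_parser():
    """Build a parser with a required length and an optional number of copies."""
    parser = argparse.ArgumentParser(description="Generate random DNA sequences.")
    parser.add_argument("length", type=int, help="length of each sequence")
    parser.add_argument("--copies", type=int, default=1, help="number of sequences")
    return parser

# Equivalent to running: python DNA.py 100 --copies 3
args = build_parser().parse_args(["100", "--copies", "3"])
print(args.length, args.copies)  # prints: 100 3
```

Calling parse_args() with no list makes argparse read sys.argv itself, and a missing length then produces a usage message rather than an IndexError.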
Upvotes: 0