dnic2693

Reputation: 63

Building a DNA sequence generator

I'm just starting to learn Python in a bioinformatics research lab. My first project is to write a program that generates random DNA sequences, with the sequence length and the number of copies given as parameters. The sequences then need to be output in FASTA format.

For those unfamiliar, a DNA sequence is made up of four "letters": A, G, C, T. Example DNA sequence: ACGTTCCGTACGTACTCT

I am really new to all of this, and I would like some advice on how to go about it and on how to learn Python in general (rely on tutorials, do random projects, etc.).

I am currently using someone else's program for my DNA sequence project, and I plan to go through it line by line to see what is being done.

The first error I encountered when copying over the code was this:

    >>> import random
    >>> import sys
    >>> def simulate_sequence (length) :
        dna = ['A','G','C','T']
        sequence = ''
        for i in range (length) :
            sequence += random.choice (dna)
        return sequence

    >>> setsize = int (sys.argv[1])
    Traceback (most recent call last):
      File "<pyshell#10>", line 1, in <module>
        setsize = int (sys.argv[1])
    IndexError: list index out of range
    >>> 

Thank you.

Upvotes: 1

Views: 2209

Answers (4)

    import random

    dna = ['A', 'G', 'C', 'T']

    def generateDNA(N):
        # Pick N random bases and join them into a single string
        result = [random.choice(dna) for _ in range(N)]
        return "".join(result)

    print(generateDNA(100))

Example output (it differs on every run):

TTCGCGGACGGTTCATCAGCCCTAGCCGGTTAAGAACTATCGAGCCACCCTAAGAACGGTCCATATTTGGAGTGTTACAACTTTGGATCTTCTACGTTGC

Upvotes: 0

0x90

Reputation: 40982

I use Biopython for this:

    import random
    from Bio.Seq import Seq

    def random_seq(N=180):
        return Seq("".join(random.choice("ATCG") for _ in range(N)))

Upvotes: 1

Hyperboreus

Reputation: 32429

sys.argv is a list of the arguments passed to your program.

For instance, this program (called amt.py):

    import sys
    print(sys.argv)

will behave like this:

    $ ./amt.py
    ['./amt.py']
    $ ./amt.py 1
    ['./amt.py', '1']
    $ ./amt.py 1 abc
    ['./amt.py', '1', 'abc']
    $ ./amt.py 1 abc 33
    ['./amt.py', '1', 'abc', '33']

The problem with your code is that it expects sys.argv to have an item at index 1, but you haven't given it any command-line arguments. So it tries to read a non-existent position in the list, which raises the IndexError.
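One way to avoid the IndexError is to check the length of the list before indexing it. The following is only a minimal sketch: the helper name `read_count` and the fallback value of 10 are made up for illustration and are not part of the original program.

```python
def read_count(argv, default=10):
    """Return int(argv[1]) if an argument was given, otherwise `default`.

    `read_count` and `default=10` are hypothetical, chosen for this example.
    """
    if len(argv) > 1:
        return int(argv[1])
    return default

# In a real script you would call: setsize = read_count(sys.argv)
# Here the lists are written out by hand to show both cases.
print(read_count(["amt.py"]))          # no argument given, falls back to 10
print(read_count(["amt.py", "100"]))   # argument given, converted to 100
```

Passing the list in as a parameter, instead of reading sys.argv inside the function, also makes it easy to try the function from the interactive interpreter.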

Upvotes: 0

user3184376

Reputation: 59

First of all, I would recommend this book.

The error comes from the fact that this program was written to be run from the command line, not typed into the interactive interpreter. sys.argv[1] gets the first command-line argument (technically the second item in the list, because the first item is the name of the program). When you type the code into the interpreter, no arguments have been passed, so sys.argv[1] does not exist. Paste the code into a text editor, save it as DNA.py, and run it from the command line, for example: python DNA.py 100
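To tie this back to the question, here is a rough sketch of what such a script could look like, including the FASTA output (a header line starting with ">" followed by the sequence). The argument order (number of sequences first, then length), the `to_fasta` helper, and the "seq1", "seq2" header names are assumptions made for this example, not something taken from the original program.

```python
import random

def simulate_sequence(length):
    """Return a random DNA string of the given length."""
    return "".join(random.choice("ACGT") for _ in range(length))

def to_fasta(sequences, prefix="seq"):
    """Format sequences as FASTA text: a '>' header line, then the sequence.

    The header names (seq1, seq2, ...) are an assumption for illustration.
    Sequence lines are not wrapped here.
    """
    lines = []
    for i, seq in enumerate(sequences, start=1):
        lines.append(f">{prefix}{i}")
        lines.append(seq)
    return "\n".join(lines) + "\n"

def main(argv):
    # Assumed argument order: argv[1] = number of sequences, argv[2] = length
    count = int(argv[1])
    length = int(argv[2])
    sequences = [simulate_sequence(length) for _ in range(count)]
    return to_fasta(sequences)

# In a script run as "python DNA.py 3 20" you would use:
#     import sys
#     print(main(sys.argv), end="")
# The argument list is spelled out here so the example also runs when pasted
# into the interpreter.
print(main(["DNA.py", "3", "20"]), end="")
```

This prints three header lines, each followed by a random 20-letter sequence. To save the result to a file from the command line, the output can be redirected, for example: python DNA.py 3 20 > out.fasta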

Upvotes: 0
