Randomly replace N characters in a string with one letter

Question

I have a DNA sequence file, and my goal is to randomly replace 5 of the nucleotides in the sequence with the letter M.

For example, dna1.txt contains the sequence ACTGGCTACATTG.

I want to turn ACTGGCTACATTG into something like ACMMGCMMCATMG.

I know how to replace one letter at a time, but not several.

from random import randint

# Read the whole sequence; strip() drops a trailing newline so it cannot be picked.
with open("dna1.txt", "r") as dna1:
    data1 = dna1.read().strip()

def Mutated_DNA(data1):
    dna_list = list(data1)
    # Pick one random position and overwrite it with 'M'.
    mutation_site = randint(0, len(dna_list) - 1)
    dna_list[mutation_site] = 'M'
    return ''.join(dna_list)

print(Mutated_DNA(data1))

What should I do?

DSM · Accepted Answer

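If you want to replace exactly 5 characters with something new, then I think the simplest way is to sample from the possible positions and then change exactly those. random.sample picks distinct positions, so none is chosen twice and you always get exactly num replacements. For example: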

from random import sample

def mutate(s, num, target):
    change_locs = set(sample(range(len(s)), num))
    changed = (target if i in change_locs else c for i,c in enumerate(s))
    return ''.join(changed)

Sample runs (the output varies between calls):

>>> mutate('ABC', 2, 'M')
'MMC'
>>> mutate('ABC', 2, 'M')
'AMM'
>>> mutate('ABC', 2, 'M')
'MMC'
>>> mutate('ABC', 2, 'M')
'MBM'

Alternatively, convert the string to a list and overwrite the sampled positions:

def mutate(s, num, target):
    change_locs = sample(range(len(s)), num)
    new_s = list(s)
    for change_loc in change_locs:
        new_s[change_loc] = target
    return ''.join(new_s)

etc.
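
Applied to the sequence from the question, this might look like the sketch below. It assumes the file content ends with a newline (hence the strip()) and uses a string literal in place of dna1.txt so that it runs on its own. The explicit length check is an addition for illustration; without it, sample raises its own ValueError when num is larger than the string.

```python
from random import sample

def mutate(s, num, target):
    """Replace exactly `num` randomly chosen positions of `s` with `target`."""
    # Added check (not in the answer above): fail with a clearer message.
    if num > len(s):
        raise ValueError("cannot replace more characters than the string contains")
    # sample() returns distinct indices, so exactly `num` positions change.
    change_locs = set(sample(range(len(s)), num))
    return ''.join(target if i in change_locs else c for i, c in enumerate(s))

# Stand-in for the contents of dna1.txt; strip() removes the trailing newline
# so the newline position can never be selected for replacement.
sequence = "ACTGGCTACATTG\n".strip()
mutated = mutate(sequence, 5, 'M')
print(mutated)
```

Because the positions are drawn at random, the printed string differs from run to run, but it always has the same length as the input and contains exactly five M characters (given that the original sequence has none).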
