Reputation: 161
For instance, I have the following string:
Hello how are you today, [name]?
How would I go about randomly placing characters between a random choice of words but not [name]? I already I have this following piece of code but I was hoping there is a better way of going about it.
string = 'Hello how are you today, [name]?'
characters = 'qwertyuioplkjhgfdsazxcvbnm,. '
arr = string.rsplit(" ")
for i in range(0, len(arr)):
x = arr[i]
if x == '[name]':
continue
if (random.randint(0,2)==1) :
rnd=random.randint(1,len(x)-2)
tmp1 = random.randint(0,len(characters))
rndCharacter = characters[tmp1:tmp1+1]
x = x[0:rnd] + rndCharacter + x[rnd+1:]
arr[i] = x
" ".join(arr)
> Hellio how are yoy todsy, [name]?"
Though this replaces the character with a another random character. What way would I go about having it randomly replace or place a random character after or before a character as well?
Basically I'm just trying to simulate a sort of typo generator.
Thanks
Update on my code so far:
string = 'Hey how are you doing, [name]?'
characters = 'aeiou'
arr = string.rsplit(" ")
for i in range(0, len(arr)):
x = arr[i]
if x == '[name]': continue
if len(x) > 3:
if random.random() > 0.7:
rnd = random.randint(0,len(x)-1)
rndCharacter = random.choice(characters)
if random.random() > 0.7:
x = x[0:rnd] + rndCharacter + x[rnd+1:]
else:
x = x[:rnd] + rndCharacter + x[rnd:]
arr[i] = x
else:
if random.random() > 0.7:
rnd = random.randint(0,len(x)-1)
rndCharacter = random.choice(characters)
x = x[:rnd] + rndCharacter + x[rnd:]
arr[i] = x
print " ".join(arr)
> Hey houw are you doiang, [name]?
UPDATE:
Maybe my final update for the code, hopefully this will help someone out some point in the future
def misspeller(word):
typos = { 'a': 'aqwedcxzs',
'b': 'bgfv nh',
'c': 'cdx vf',
'd': 'desxcfr',
'e': 'e3wsdfr4',
'f': 'fredcvgt',
'g': 'gtrfvbhyt',
'h': 'hytgbnju',
'i': 'i8ujko9',
'j': 'juyhnmki',
'k': 'kiujm,lo',
'l': 'loik,.;p',
'm': 'mkjn ,',
'n': 'nhb mjh',
'o': 'o9ikl;p0',
'p': 'p0ol;[-',
'q': 'q1asw2',
'r': 'r4edft5',
's': 'swazxde',
't': 't5rfgy6',
'u': 'u7yhji8',
'v': 'vfc bg',
'w': 'w2qasde3',
'x': 'xszcd',
'y': 'y6tghu7',
'z': 'zaZxs',
' ': ' bvcnm',
'"': '"{:?}',
'\'': '[;/\']',
':': ':PL>?"{',
'<': '<LKM >',
'>': '>:L<?:',
';': ';pl,.;[',
'[': '[-p;\']=',
']': '=[\'',
'{': '{[_P:"}+',
'}': '}=[\']=',
'|': '|\]\'',
'.': '.l,/;',
',': ',lkm.'
}
index = random.randint(1,len(word)-1)
letter = list(word[:index])[-1].lower()
try:
if random.random() <= 0.5:
return word[:index] + random.choice(list(typos[letter])) + word[index:]
else:
return word[:index-1] + random.choice(list(typos[letter])) + word[index:]
except KeyError:
return word
def generate(self, s, n, safe_name):
misspelled_s = ''
misspelled_list = []
for item in s.split(' '):
if n:
if safe_name in item:
misspelled_list.append(item)
else:
r = random.randint(0,1)
if r == 1 and len(re.sub('[^A-Za-z0-9]+', '', item)) > 3:
misspelled_list.append(misspeller(item))
n -= 1
else:
misspelled_list.append(item)
else:
misspelled_list.append(item)
return ' '.join(misspelled_list)
Upvotes: 2
Views: 2310
Reputation: 11395
You are "hoping there is a better way of going about it". Well, here's some suggestions, and some code demonstrating those suggestions. Some of the suggestions are about making the code more pythonic or easy to read, not just the mechanics of changing strings.
for x in string.rsplit(" ")
is a more pythonic way to loop through the words.x[:rnd] + ... + x[rnd:x]
as recommended by others, for easier string manipulation.x if condition else y
for a concise choice between alternatives, in this case between an index which causes overwriting and an index which causes insertion.len(x) > 3
. I follow your code, but this is easy to change.Hope this helps.
import random
import re
string = 'Hello how are you today, [name]?'
characters = 'qwertyuioplkjhgfdsazxcvbnm,. '
words = []
for x in string.rsplit(" "):
if None == re.search('[^\]]*\[[a-z]+\].*', x) \
and len(x) > 3 and random.random()<=0.5:
# rnd: index of char to overwrite or insert before
rnd = random.randint(2,len(x)-2)
# rnd1: index of 1st char after modification
# random.random <= 0.x is probability of overwriting instead of inserting
rnd1 = rnd + 1 if random.random() <= 0.5 else 0
x = x[:rnd] + random.choice(characters) + x[rnd1:]
words.append(x)
typos = " ".join(words)
print typos
Update: fixed indentation mistake in the code.
Update 2: made the code for choosing overwrite versus insert to be more concise.
Upvotes: 0
Reputation: 2109
import random
def misspeller(word):
characters = 'qwertyuioplkjhgfdsazxcvbnm,. '
rand_word_position = random.randint(-1,len(word))
rand_characters_position = random.randint(0,len(characters)-1)
if rand_word_position == -1:
misspelled_word = characters[rand_characters_position] + word
elif rand_word_position == len(word):
misspelled_word = word + characters[rand_characters_position]
else:
misspelled_word = list(word)
misspelled_word[rand_word_position] = characters[rand_characters_position]
misspelled_word = ''.join(misspelled_word)
return misspelled_word
s = 'Hello how are you today, [name]?'
misspelled_s = ''
misspelled_list = []
for item in s.split(' '):
if '[name]' in item:
misspelled_list.append(item)
else:
misspelled_list.append(misspeller(item))
misspelled_s = ' '.join(misspelled_list)
print misspelled_s
Examples of what I'm getting from misspelled_s
are:
'Hellk howg ars youf poday, [name]?'
'Heylo how arer y,u todab, [name]?'
'Hrllo hfw are zyou totay, [name]?'
Edited to clean up a couple of mistakes and omissions on first copy.
Edit 2 If you don't want every word to be affected you can modify the for loop in the following way:
for item in s.split(' '):
n = random.randint(0,1)
if '[name]' in item:
misspelled_list.append(item)
elif n == 1:
misspelled_list.append(misspeller(item))
else:
misspelled_list.append(item)
You can modify the probability that a word is modified by changing how n
is generated e.g. n = random.randint(0,10)
Upvotes: 2
Reputation: 29302
You can also use split('[name]')
, and work on the sub-string, this way you'll be sure (see note below) of don't change '[name]'
.
You may have problem splitting on every [name]
occurance catching some substring of some longer name, but if you:
Then the following code should work fine:
def typo(string):
index = random.randint(1,len(string)-1) # don't change first or last
return string[:index] + random.choice(characters) + string[index:]
def generate(string, n, safe_name):
sub_strings = string.split(safe_name)
while n:
sub_index = random.randint(0,len(sub_strings) - 1)
sub = sub_strings[sub_index]
if len(sub) <= 2: # if too short don't change
continue
sub_strings[sub_index] = typo(sub)
n -= 1
return safe_name.join(sub_strings)
Example adding 3 new random charachter:
>>> string = 'Hello how are you today, Alice?'
>>> generate(string, 3, 'Alice')
'Hellov howj are yoiu today, Alice?'
With the name occurring more then one time:
>>> string = 'Hello Alice, how are you today, Alice?'
>>> generate(string, 3, 'Alice')
'Hello Alice, hoiw arfe you todayq, Alice?'
Upvotes: 1
Reputation: 86240
I think @sgallen's answer will work, but I have a few tips (for your previous code, and going forward).
for i in range(0, len(arr)):
x = arr[i]
# is the same as
for i,x in enumerate(arr):
else:
if random...:
# to
elif random...:
Using string
as the name of a variable, isn't a good practice. The reason being, there is a string
module. It might even come in handy for this because of the string constants. Alternatives could be inp
, or data
, or sentence
.
# For example
>>> import string
>>> string.lowercase
'abcdefghijklmnopqrstuvwxyz'
By the way, if anyone notices errors in the above, leave a comment. Thanks.
Upvotes: 1
Reputation: 66
For the example you have given, it looks like we can split it on the comma and put the typo(s) in the first part of the string.
If this is correct, you need to do three things randomly before generating the typo:
Would this fit the bill?
(BTW, since you are familiar with random, I didn't give any code.)
Upvotes: 0
Reputation: 47790
If you want to place a letter before or after instead of replacing, just fix the indices in your splicing so that they don't jump over a letter - i.e. use
x = x[:rnd] + rndCharacter + x[rnd:]
That way the new character will be inserted in the middle, instead of replacing an existing one.
Also, you can use rndCharacter = random.choice(characters)
instead of using tmp1
like that.
Upvotes: 2