Reputation: 81
I have a data frame that looks like this:
id1 | id2
----------------------------
ab51c-ee-1a | cga--=%abd21
I am looking to randomize the letters only:
id1 | id2
----------------------------
ge51r-eq-1b | olp--=%cqw21
I think I can do something like this:
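    import random
    import string

    newid1 = []
    for index, row in df.iterrows():
        new_string = ''
        for ch in row['id1']:
            if ch.isalpha():
                # swap each letter for a random ASCII letter
                new_string += random.choice(string.ascii_letters)
            else:
                # keep digits and punctuation as they are
                new_string += ch
        newid1.append(new_string)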
But it doesn't seem very efficient. Is there a better way?
Upvotes: 2
Views: 1264
Reputation: 30605
Let's use apply together with str.replace, which can replace only the letters using a regex, i.e.
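    import string
    import random

    letters = list(string.ascii_lowercase)

    def rand(match):
        # called once per regex match; returns a random lowercase letter
        return random.choice(letters)

    # regex=True is required on recent pandas, where str.replace defaults to literal matching
    df.apply(lambda x: x.str.replace('[a-z]', rand, regex=True))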
Output:
               id1           id2
    0  gp51e-id-1v  jvj--=%glw21
For one specific column use
    df['id1'].str.replace('[a-z]', rand, regex=True)
Added by @antonvbr: For future reference, if we want to change upper and lower case we could do this:
    letters = dict(u=list(string.ascii_uppercase), l=list(string.ascii_lowercase))
    (df['id1'].str.replace('[a-z]', lambda x: random.choice(letters['l']), regex=True)
              .str.replace('[A-Z]', lambda x: random.choice(letters['u']), regex=True))
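The two chained replacements can probably be folded into a single pass by letting the callback look at the case of each matched letter. A minimal sketch using plain re.sub on one string; the helper name randomize_letters and its rng parameter are made up for illustration and are not part of pandas:

```python
import random
import re
import string


def randomize_letters(text, rng=random):
    """Replace each ASCII letter with a random letter of the same case."""
    def swap(match):
        ch = match.group(0)
        # pick from the pool that matches the case of the original letter
        pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
        return rng.choice(pool)

    # digits and punctuation never match the pattern, so they are left alone
    return re.sub(r'[A-Za-z]', swap, text)


print(randomize_letters('Ab51c-EE-1a'))
```

If this fits your data, something like df['id1'].map(randomize_letters) should apply it to one column, assuming every value in that column is a string.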
Upvotes: 4
Reputation: 27899
How about this:
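    import pandas as pd
    from string import ascii_lowercase
    import random

    df = pd.DataFrame({'id1': ['ab51c-ee-1a'],
                       'id2': ['cga--=%abd21']})

    al = list(ascii_lowercase)
    # replace every lowercase letter in every cell; other characters are kept
    # (on pandas >= 2.1, DataFrame.map is the non-deprecated name for applymap)
    df = df.applymap(lambda x: ''.join([random.choice(al) if i in al else i for i in x]))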
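Since the question is about efficiency, it may be worth timing the comprehension against the regex approach before picking one. A rough sketch on plain Python strings rather than a data frame; the function names, the sample string and the repeat count are all arbitrary choices for illustration, and the timings will vary by machine:

```python
import random
import re
import string
import timeit

LOWER = string.ascii_lowercase
LOWER_SET = set(LOWER)  # set membership avoids scanning a 26-item list


def with_regex(text):
    # regex approach: the callback runs once per matched lowercase letter
    return re.sub('[a-z]', lambda m: random.choice(LOWER), text)


def with_comprehension(text):
    # comprehension approach: test every character and rebuild the string
    return ''.join(random.choice(LOWER) if ch in LOWER_SET else ch for ch in text)


sample = 'ab51c-ee-1a' * 100  # hypothetical test input

for func in (with_regex, with_comprehension):
    seconds = timeit.timeit(lambda: func(sample), number=200)
    print(f'{func.__name__}: {seconds:.4f}s')
```

Both helpers only touch lowercase letters, matching the code above.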
Upvotes: 1