Pandas: Randomize letters in a column

Question

I have a data frame that looks like this:

id1           | id2
----------------------------
ab51c-ee-1a   | cga--=%abd21

I am looking to randomize the letters only:

id1           | id2
----------------------------
ge51r-eq-1b   | olp--=%cqw21

I think I can do something like this:

import random
import string

newid1 = []
for index, row in df.iterrows():
    new_value = ''
    for ch in row['id1']:
        if ch.isalpha():
            new_value += random.choice(string.ascii_lowercase)
        else:
            new_value += ch
    newid1.append(new_value)

But it doesn't seem very efficient. Is there a better way?

Bharath M Shetty · Accepted Answer

Let's use apply together with str.replace, with a regex so that only the letters are replaced, i.e.

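import string
import random

letters = list(string.ascii_lowercase)

def rand(match):
    # str.replace calls this once per regex match; the matched letter is ignored
    return random.choice(letters)

# regex=True is needed on pandas 2.0+, where str.replace matches literally by default
df.apply(lambda x: x.str.replace('[a-z]', rand, regex=True))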

Output (random, so it differs on every run):

           id1           id2
0  gp51e-id-1v  jvj--=%glw21

For one specific column, use:

df['id1'].str.replace('[a-z]', rand, regex=True)
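
For anyone who wants to run the approach end to end, here is a minimal sketch. It assumes a one-row frame built from the sample values in the question; the seed is only there to make the demo repeatable and is not part of the original answer.

```python
import random
import string

import pandas as pd

# Sample frame copied from the question (an assumption about the real data).
df = pd.DataFrame({"id1": ["ab51c-ee-1a"], "id2": ["cga--=%abd21"]})

def rand(match):
    # Called once per regex match; the matched letter itself is ignored.
    return random.choice(string.ascii_lowercase)

random.seed(42)  # seeded only so repeated runs of the demo agree

# regex=True makes '[a-z]' a pattern rather than a literal string.
out = df.apply(lambda col: col.str.replace("[a-z]", rand, regex=True))
print(out)
```

Digits, dashes, and the other symbols stay where they were; only the lowercase letters change.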

Added by @antonvbr: For future reference, if we want to randomize both uppercase and lowercase letters while keeping the case of each one, we could do this:

letters = dict(u=list(string.ascii_uppercase),l=list(string.ascii_lowercase))

(df['id1'].str.replace('[a-z]', lambda x: random.choice(letters['l']), regex=True)
          .str.replace('[A-Z]', lambda x: random.choice(letters['u']), regex=True))
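
The two passes could also be folded into one by letting the callback look at the matched letter. This is a sketch of that variation, not something from the answer above; the names LETTER, random_same_case and randomize_letters are made up for illustration, and only ASCII letters are handled.

```python
import random
import re
import string

# Hypothetical helper names; only ASCII letters are matched.
LETTER = re.compile(r"[A-Za-z]")

def random_same_case(match):
    """Return a random ASCII letter with the same case as the matched one."""
    if match.group(0).isupper():
        return random.choice(string.ascii_uppercase)
    return random.choice(string.ascii_lowercase)

def randomize_letters(text):
    """Replace every ASCII letter in text; leave all other characters alone."""
    return LETTER.sub(random_same_case, text)

random.seed(0)  # only to make the demo repeatable
print(randomize_letters("Ab51c-EE-1a"))
```

Assuming the column has no missing values, this could be applied with df['id1'].map(randomize_letters), or random_same_case could be passed as the callable to str.replace with the pattern '[A-Za-z]' and regex=True.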
