Vikram Karthic

Reputation: 478

Replacing a few characters of strings in pandas columns with asterisks

I have a pandas DataFrame with three columns, which looks something like this:

data = { 
    'col1':['ABCD', 'EFGH', 'IJKL', 'MNOP1', 'QRST25'],  
    'col2':['ABCD,1234', 'EFGH,5678', 'IJKL,91011,F1H2I3', 'MNOP,121314', 'MNOP,151617,A1B2C3'],  
    'col3':['ABDC,EFGH', 'IJAL,MNIP', 'QURST,UVWY', 'C4GH,MQUR', 'FHGH,QRQP']}

I want to mask a few characters in a certain pattern, as below:

output.data = { 
    'col1':['*BCD', '*FGH', '*JKL', '*NOP1', '*RST25'],  
    'col2':['*BCD,*234', '*FGH,*678', '*JKL,*1011,*1H2I3', '*NOP,*21314', '*NOP,*51617,*1B2C3'],  
    'col3':['****,EFGH', '****,MNIP', '****,UVWY', '****,MQUR', '****,QRQP']}

That is, in col1 I want the first character replaced with *; in col2, the first character of every comma-separated word replaced with *; and in col3, all the characters in the first of the comma-separated strings replaced with *.

For the first column, I could think of something like this, but it may not be the right way to do it:

data["col1"].str.replace([:1], "*")

For the other two columns I have no idea how to do it. Any help is appreciated.

Upvotes: 0

Views: 165

Answers (2)

Giovanni Frison

Reputation: 688

You can try with:

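# col1: mask only the first character (str.replace would hit every occurrence of it)
df['col1'] = df['col1'].apply(lambda x: "*" + x[1:])
# col2: mask the first character of each comma-separated token
df['col2'] = df['col2'].apply(lambda x: ",".join("*" + t[1:] for t in x.split(",")))
# col3: swap the whole first token for a fixed "****" mask
df['col3'] = df['col3'].apply(lambda x: "****" + x[len(x.split(",")[0]):])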

maybe not very good looking but it works ;D

Upvotes: 0

norie

Reputation: 9857

This works but it might not be the most efficient method.

import pandas as pd

data = { 
    'col1':['ABCD', 'EFGH', 'IJKL', 'MNOP1', 'QRST25'],  
    'col2':['ABCD,1234', 'EFGH,5678', 'IJKL,91011,F1H2I3', 'MNOP,121314', 'MNOP,151617,A1B2C3'],  
    'col3':['ABDC,EFGH', 'IJAL,MNIP', 'QURST,UVWY', 'C4GH,MQUR', 'FHGH,QRQP']}

df = pd.DataFrame(data)

df['col1'] = df['col1'].apply(lambda s: '*'+s[1:])

df['col2'] = df['col2'].apply(lambda s: ','.join(['*'+t[1:] for t in s.split(',')]))

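# one asterisk per character of the first token; count=1 leaves later repeats of that token alone
df['col3'] = df['col3'].apply(lambda s: s.replace(s.split(',')[0], len(s.split(',')[0])*'*', 1))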

print(df)
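
If efficiency is a concern, the same masking can probably be done without apply by using pandas' own .str.replace with regular expressions. The following is only a sketch under that assumption. The patterns are my own and are not from the question, and for col3 it uses a fixed four-asterisk mask because that is what the expected output in the question shows.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['ABCD', 'EFGH', 'IJKL', 'MNOP1', 'QRST25'],
    'col2': ['ABCD,1234', 'EFGH,5678', 'IJKL,91011,F1H2I3', 'MNOP,121314', 'MNOP,151617,A1B2C3'],
    'col3': ['ABDC,EFGH', 'IJAL,MNIP', 'QURST,UVWY', 'C4GH,MQUR', 'FHGH,QRQP'],
})

# col1: replace the single character at the start of the string
df['col1'] = df['col1'].str.replace(r'^.', '*', regex=True)

# col2: replace the first character of every comma-separated token;
# group 1 keeps the preceding comma (or the empty start-of-string match)
df['col2'] = df['col2'].str.replace(r'(^|,)[^,]', r'\1*', regex=True)

# col3: replace everything before the first comma with a fixed "****" mask
df['col3'] = df['col3'].str.replace(r'^[^,]*', '****', regex=True)

print(df)
```

If the col3 mask should instead have one asterisk per masked character, .str.replace also accepts a callable as the replacement, for example lambda m: '*' * len(m.group()).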

Upvotes: 2
