Reputation: 478
I have a pandas DataFrame with three columns, which looks something like this:
data = {
'col1':['ABCD', 'EFGH', 'IJKL', 'MNOP1', 'QRST25'],
'col2':['ABCD,1234', 'EFGH,5678', 'IJKL,91011,F1H2I3', 'MNOP,121314', 'MNOP,151617,A1B2C3'],
'col3':['ABDC,EFGH', 'IJAL,MNIP', 'QURST,UVWY', 'C4GH,MQUR', 'FHGH,QRQP']}
I want to mask a few characters according to a certain pattern, as below:
output_data = {
'col1':['*BCD', '*FGH', '*JKL', '*NOP1', '*RST25'],
'col2':['*BCD,*234', '*FGH,*678', '*JKL,*1011,*1H2I3', '*NOP,*21314', '*NOP,*51617,*1B2C3'],
'col3':['****,EFGH', '****,MNIP', '****,UVWY', '****,MQUR', '****,QRQP']}
That is: in the first column, I want the first character replaced with *; in the second column, the first character of every comma-separated word replaced with *; and in the third column, every character of the first comma-separated string replaced with *.
For the first column, I could think of something like this, but it may not be the right way to do it:
data["col1"].str.replace([:1], "*")
For the other two columns I have no idea how to do it. Any help is appreciated.
Upvotes: 0
Views: 165
Reputation: 688
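You can try this (assuming df = pd.DataFrame(data)):
# col1: replace only the first occurrence of the leading character
df['col1'] = df['col1'].apply(lambda x: x.replace(x[0], "*", 1))
# col2: do the same for every comma-separated word, then rejoin
df['col2'] = df['col2'].apply(lambda x: ",".join(w.replace(w[0], "*", 1) for w in x.split(",")))
# col3: replace the first word with one "*" per character
df['col3'] = df['col3'].apply(lambda x: x.replace(x.split(",")[0], "*" * len(x.split(",")[0]), 1))
Maybe not very pretty, but it works ;D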
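One caveat with replace-based masking: `str.replace(old, new)` substitutes every occurrence of `old` unless a count is passed as the third argument. The short sketch below illustrates the difference on a made-up string (`'EFGE,5678'` is hypothetical and not part of the question's data), so treat it as an illustration rather than a complete solution.

```python
# Hypothetical sample value whose first character also appears later on.
s = 'EFGE,5678'

# Without a count, every 'E' is replaced, not just the leading one.
all_replaced = s.replace(s[0], '*')
print(all_replaced)   # *FG*,5678

# With a count of 1, only the first occurrence (the leading character) changes.
first_only = s.replace(s[0], '*', 1)
print(first_only)     # *FGE,5678

# Slicing avoids searching for the character at all.
sliced = '*' + s[1:]
print(sliced)         # *FGE,5678
```

Slicing is usually the simpler choice when the position to mask is known in advance.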
Upvotes: 0
Reputation: 9857
This works but it might not be the most efficient method.
import pandas as pd
data = {
'col1':['ABCD', 'EFGH', 'IJKL', 'MNOP1', 'QRST25'],
'col2':['ABCD,1234', 'EFGH,5678', 'IJKL,91011,F1H2I3', 'MNOP,121314', 'MNOP,151617,A1B2C3'],
'col3':['ABDC,EFGH', 'IJAL,MNIP', 'QURST,UVWY', 'C4GH,MQUR', 'FHGH,QRQP']}
df = pd.DataFrame(data)
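# col1: mask the first character
df['col1'] = df['col1'].apply(lambda s: '*'+s[1:])
# col2: mask the first character of every comma-separated token
df['col2'] = df['col2'].apply(lambda s: ','.join(['*'+t[1:] for t in s.split(',')]))
# col3: mask the whole first token, one '*' per character, first occurrence only
df['col3'] = df['col3'].apply(lambda s: s.replace(s.split(',')[0], len(s.split(',')[0])*'*', 1))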
print(df)
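As a possible alternative, the same three rules can be written as regular expressions. This is only a sketch: the function names (`mask_first_char`, `mask_each_token`, `mask_first_token`) are made up for illustration, and it assumes the first token in col3 should be masked with one asterisk per character, so 'QURST' becomes five asterisks. If a fixed '****' is wanted instead, the lambda can be swapped for that literal.

```python
import re

def mask_first_char(s):
    # Replace only the first character of the string.
    return re.sub(r'^.', '*', s)

def mask_each_token(s):
    # Replace the first character of every comma-separated token.
    # (^|,) captures the start of the string or a comma; \1 puts it back.
    return re.sub(r'(^|,)[^,]', r'\1*', s)

def mask_first_token(s):
    # Replace every character before the first comma with '*'.
    return re.sub(r'^[^,]+', lambda m: '*' * len(m.group(0)), s)

print(mask_first_char('MNOP1'))              # *NOP1
print(mask_each_token('IJKL,91011,F1H2I3'))  # *JKL,*1011,*1H2I3
print(mask_first_token('QURST,UVWY'))        # *****,UVWY
```

These can be applied per column with `df['col1'].apply(mask_first_char)` and so on. The same patterns and replacements should also work when passed to `Series.str.replace(pat, repl, regex=True)`, which accepts a callable as `repl`.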
Upvotes: 2