SLglider
SLglider

Reputation: 267

Filter Pandas dataframe row and replace value in column

I got a big list of phone numbers in all sorts of formats:

df = pd.DataFrame(
    {'phone': ['0123/12345', '0123-23456', '0123/4455-10', '0123-4455-22'],
     'name': ['A-1', 'B-1', 'C/3', 'D/7']})

  name phone
0 A-1  0123/12345
1 B-1  0123-23456
2 C/3  0123/4455-10
3 D/7  0123-4455-22

The formats I want are in rows #0 and #2.

When I concentrate on #1, I tried the following:

df.loc[(df.phone.str.count('-')==1) &
       (df.phone.str.count('/')==0)].apply(lambda x: x.str.replace('-', '/'))

And this does the trick on the number, but unfortunately also on the name column:

  name phone
1 B/1  0123/23456 

But the name column must not be changed.

So I have two questions:

  1. How can I filter the row and only change the phone column?
  2. How can I go with #3, where I would want to replace the first occurrence of '-' to '/'?

Upvotes: 0

Views: 6474

Answers (2)

akuiper
akuiper

Reputation: 215047

You can use regex replace (str.replace method) on column phone only:

df['phone'] = df.phone.str.replace("^(\d+)-(.*)$", r"\1/\2")
df
#  name        phone
#0  A-1   0123/12345
#1  B-1   0123/23456
#2  C/3 0123/4455-10
#3  D/7 0123/4455-22

Explanation on the regex:

^(\d+)-(.*)$ matches a string that starts with digits and immediately followed by dash which is the case for row #0 and row #2; With back reference, it replaces the first dash with / and for row #1 and row #3, since they don't match the regex, no modification is applied.

Upvotes: 7

Rayhane Mama
Rayhane Mama

Reputation: 2424

Or if you're no fan of regex (like me), you can simply do this:

df['phone'] = df.phone.apply(lambda x: x.replace('-','/',1) if '/' not in x else x)
print(df)

  name         phone
0  A-1    0123/12345
1  B-1    0123/23456
2  C/3  0123/4455-10
3  D/7  0123/4455-22

Probably not the best or fastest way, still I feel more comfortable with it since I don't know regex yet.

Hope that was helpful.

Upvotes: 1

Related Questions