0xsegfault

Reputation: 3161

Searching for a substring in a dataframe and replacing it

I have a condition where spurious data is created and I am trying to clean it.

For example:

[email protected]/!ut/5 #RealLink
[email protected]/ut1/5_RTFDEERERTGFEFD # System adds junk to it
[email protected]/ut1/5_dvkerfddfrejermsdkasmf # System adds junk to it

I am trying to clean this up by dropping everything after !ut.

So far I have tried:

SPA_MX = Mexico['Page URL'].str.startswith("http://[email protected]/ut1")

but this returns a boolean Series rather than the cleaned strings.

I would like advice on the most efficient way to achieve this.

Upvotes: 1

Views: 38

Answers (2)

EdChum

Reputation: 394071

You can do this by calling apply on the column: use find to get the index of the pattern, then slice the string up to the end of the pattern if it is found, and leave the string unchanged otherwise:

In[69]:

df['url'].apply(lambda x: x[:x.find('!ut') + 3] if x.find('!ut') != -1 else x)

Out[69]: 
0                             [email protected]/!ut
1           [email protected]/ut1/5_RTFDEERERTGFEFD
2    [email protected]/ut1/5_dvkerfddfrejermsdkasmf
Name: url, dtype: object
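
A vectorized variant of the same idea, as a rough sketch: assuming every value in the column is a string and the marker is the literal '!ut', str.partition avoids the Python-level lambda. The DataFrame and URLs below are made-up placeholders, not the asker's data.

```python
import pandas as pd

# Placeholder data; only the column name 'url' follows the example above.
df = pd.DataFrame({'url': ['example.com/!ut/5',
                           'example.com/!ut/5_RTFDEERERTGFEFD',
                           'example.com/no-marker']})

# str.partition splits each string at the first '!ut' into three columns:
# 0 = text before, 1 = the separator itself, 2 = text after.
# When the marker is absent, columns 1 and 2 are empty strings.
parts = df['url'].str.partition('!ut')

# Keep the text before the marker plus the marker; rows without it are unchanged.
cleaned = parts[0] + parts[1]
print(cleaned.tolist())
```

Rows that do not contain the marker come back unchanged, matching the behaviour of the apply version.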

Upvotes: 1

Tejas Thakar

Reputation: 583

my_string="[email protected]/!ut/5"
final = my_string.split("!ut")[0]  # keep only the text before the first "!ut"

output:

[email protected]/
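
The same split can probably be applied to the whole column with the .str accessor instead of one string at a time. A minimal sketch, assuming a DataFrame with a 'Page URL' column as in the question; the frame name `mexico` and the URLs are placeholders for illustration.

```python
import pandas as pd

# Placeholder frame; 'Page URL' is the column name used in the question.
mexico = pd.DataFrame({'Page URL': ['example.com/!ut/5',
                                    'example.com/!ut/5_junk',
                                    'example.com/plain']})

# Split each string at most once on '!ut' and keep the leading piece.
# Strings without the marker yield a one-element list, so .str[0] returns them whole.
trimmed = mexico['Page URL'].str.split('!ut', n=1).str[0]
print(trimmed.tolist())
```

Note that, like the single-string version, this drops the '!ut' marker itself; add it back if the marker should be kept.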

Upvotes: 1
