Reputation: 2024
A pandas column contains a series of URLs. I'd like to extract a substring from each URL.
MRE code below.
import pandas as pd
s = pd.Series(['https://url-location/img/xxxyyy_image1.png'])
# Attempt: slice from just after the first "/" up to the first "_"
s.apply(lambda x: x[x.find("/")+1:x.find("_")])
I'd like to extract xxxyyy from each URL and store the result in a new column.
Upvotes: 2
Views: 576
Reputation: 9197
Also possible:
s.str.split('/').str[-1].str.split('_').str[0]
# Out[224]: xxxyyy
This works because .str allows slice notation on each element. So .str[-1] returns the last element of each list produced by the split, for example.
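To store the result in a new column, as the question asks, the chained split can be assigned back to a DataFrame. This is only a minimal sketch: the frame `df` and the column names `url` and `code` are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical frame; the column names "url" and "code" are illustrative only.
df = pd.DataFrame({"url": ["https://url-location/img/xxxyyy_image1.png"]})

# Split on "/" and keep the last piece (the file name) ...
filename = df["url"].str.split("/").str[-1]

# ... then split the file name on "_" and keep the first piece.
df["code"] = filename.str.split("_").str[0]

print(df["code"].tolist())
```

Splitting in two steps keeps the intermediate file name around, which can help when checking rows that do not follow the expected pattern.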
Upvotes: 1
Reputation: 626932
You can use
>>> s.str.extract(r'.*/([^_]+)')
        0
0  xxxyyy
See the regex demo. Details:
.* - zero or more chars other than line break chars, as many as possible
/ - a slash
([^_]+) - Capturing group 1 (the value captured into this group will be the actual return value of Series.str.extract): one or more chars other than a _ char.
Upvotes: 3
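As a rough sketch of how the regex approach could feed a new column: passing expand=False to Series.str.extract returns a Series instead of a one-column DataFrame when the pattern has a single capturing group, so it can be assigned directly. The frame `df`, the column names `url` and `code`, and the second sample row are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical data; the second row is added only to show the no-match case.
df = pd.DataFrame({
    "url": [
        "https://url-location/img/xxxyyy_image1.png",
        "no-slash-here",
    ]
})

# Greedy ".*/" consumes everything up to the last slash,
# then group 1 captures the run of characters before the first "_".
# expand=False yields a Series, which fits a single new column.
df["code"] = df["url"].str.extract(r".*/([^_]+)", expand=False)

print(df)
```

Rows where the pattern does not match get NaN in the new column, so missing values are worth checking for before further processing.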