Claude

Reputation: 391

Extracting string after pattern

I have a series of URLs:

www.domain.com/calendar.php?month=may.2019
www.domain.com/calendar.php?month=april.2019
www.domain.com/calendar.php?month=march.2019
www.domain.com/calendar.php?month=feb.2019
...
...
...
www.domain.com/calendar.php?month=feb.2007

I want to extract the year that follows the month.

What I'm looking for:

2019
2019
...
...
2007

and save them into another column.

Here's what I have:

data["urls"].str.extract('(?<=month=).*$')

Upvotes: 1

Views: 48

Answers (2)

Emma

Reputation: 27723

Here, we can also use a simple expression without look-arounds, such as:

.+month=.+\.([0-9]{4})

or:

month=.+\.([0-9]{4})

Demo 1

or:

.+month=.+\.(.+)

or:

month=.+\.(.+)

Demo 2
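As a rough sketch of how one of these expressions could be applied in pandas: the DataFrame `df`, its sample rows, and the new `year` column are assumptions made for illustration, modeled on the URLs in the question. Since the pattern has a single capture group, `str.extract` with `expand=False` returns a Series holding that group.

```python
import pandas as pd

# Hypothetical sample data mirroring the URLs in the question.
df = pd.DataFrame({
    "urls": [
        "www.domain.com/calendar.php?month=may.2019",
        "www.domain.com/calendar.php?month=april.2019",
        "www.domain.com/calendar.php?month=feb.2007",
    ]
})

# The greedy ".+" runs up to the last dot; the group captures the 4 digits after it.
df["year"] = df["urls"].str.extract(r"month=.+\.([0-9]{4})", expand=False)

print(df["year"].tolist())
```

The extracted values are strings; convert them with `astype(int)` if numeric years are needed.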

Upvotes: 0

piRSquared

Reputation: 294218

Fix your code

df["urls"].str.extract(r'(?<=month=).*\.(\d{4})$')

If you can trust that every URL follows the same pattern, then these should also work.

split

df["urls"].str.rsplit('.', n=1).str[-1]

slice

df["urls"].str[-4:]
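To see the three approaches side by side and save the result into a new column, here is a minimal sketch. The sample rows and the `year` column name are assumptions for illustration. Note that `str.extract` needs at least one capture group, which the pattern in the question lacks, and that recent pandas versions expect `n` in `rsplit` to be passed by keyword.

```python
import pandas as pd

# Hypothetical sample data modeled on the question's URLs.
df = pd.DataFrame({
    "urls": [
        "www.domain.com/calendar.php?month=may.2019",
        "www.domain.com/calendar.php?month=march.2019",
        "www.domain.com/calendar.php?month=feb.2007",
    ]
})

# Regex: capture the four digits after the last dot, anchored at the end.
by_regex = df["urls"].str.extract(r"(?<=month=).*\.(\d{4})$", expand=False)

# Split once from the right on "." and keep the last piece.
by_split = df["urls"].str.rsplit(".", n=1).str[-1]

# Slice the last four characters (only safe if every URL ends in a 4-digit year).
by_slice = df["urls"].str[-4:]

# All three agree on this sample, so store one of them as integers.
assert by_regex.tolist() == by_split.tolist() == by_slice.tolist()
df["year"] = by_regex.astype(int)

print(df["year"].tolist())
```

The regex version yields `NaN` for rows that do not match, whereas the split and slice versions return whatever text happens to be at the end, so the regex is the more defensive choice if the data might be irregular.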

Upvotes: 4
