Ramesh

Reputation: 2337

Extract a number from a URL with a regex in Java

I need to extract the last number from a URL, the one that follows a dash.

Example:

http://www.example.com/p-test-test1-a-12345.html

I need to extract the 12345 using a regex.

I tried -\d(.*?).html, which gives me 2345. I'm not sure why it drops the 1. Any idea?

Upvotes: 0

Views: 892

Answers (4)

BambooleanLogic

Reputation: 8161

Your pattern matches a dash, then a digit outside the capturing group, and only then captures the characters up to ".html". The 1 is consumed by that \d, which is why it was not captured.

Try this instead:

-(\d+)\.html
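
As a minimal sketch of how this pattern could be used from Java, assuming the URL is available as a String: the class name LastNumberExtractor and the method extract are hypothetical names chosen for illustration, and in a Java string literal every backslash has to be doubled.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LastNumberExtractor {
    // A dash, one or more digits (captured as group 1), then a literal ".html".
    private static final Pattern LAST_NUMBER = Pattern.compile("-(\\d+)\\.html");

    // Returns the captured digits, or an empty Optional if the URL does not match.
    static Optional<String> extract(String url) {
        Matcher m = LAST_NUMBER.matcher(url);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/p-test-test1-a-12345.html";
        System.out.println(extract(url).orElse("no match")); // prints 12345
    }
}
```

Returning an Optional is only one way to signal "no match"; returning null or throwing would work as well.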

Upvotes: 1

Nishant Lakhara

Reputation: 2445

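Try this:

// s is assumed to hold the URL from the question
String s = "http://www.example.com/p-test-test1-a-12345.html";
String pattern2 = ".*?(\\d+)\\.html";
System.out.println(s.replaceAll(pattern2, "$1")); // prints 12345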

Upvotes: 1

Tafari

Reputation: 3069

It drops the first digit because your pattern matches one digit outside the capturing group and captures only what comes after that digit:

-\d(.*?).html

-\d - matches a dash followed by a digit

(.*?) - captures any character (except new line) 0 or more times, as few as possible, until the next token is satisfied

. - matches any character (except new line)

html - matches html
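
The breakdown above can be checked with a short Java sketch that runs the original pattern against the URL from the question. The class name OriginalPatternDemo and the helper firstGroup are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class OriginalPatternDemo {
    // Returns group 1 of the first match, or null if the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/p-test-test1-a-12345.html";
        // The \d sits outside the parentheses, so the leading 1 is matched
        // but never becomes part of group 1.
        System.out.println(firstGroup("-\\d(.*?).html", url)); // prints 2345
    }
}
```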


Try this pattern:


(?<=-)\d+(?=\.html)
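
A rough sketch of this lookaround pattern in Java, assuming the URL is held in a String: because the lookbehind and lookahead consume nothing, the whole match is already the number, so group() with no argument is enough. The class name LookaroundExtractor and the method extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class LookaroundExtractor {
    // Digits that are preceded by a dash and followed by a literal ".html".
    private static final Pattern NUMBER = Pattern.compile("(?<=-)\\d+(?=\\.html)");

    // Returns the matched digits, or null if nothing matches.
    static String extract(String url) {
        Matcher m = NUMBER.matcher(url);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/p-test-test1-a-12345.html";
        System.out.println(extract(url)); // prints 12345
    }
}
```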

Upvotes: 4

Michał Niklas

Reputation: 54312

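You must move \d into the group: -(\d.*?).html

If it must be only digits, then -(\d+)\.html is better. It also escapes the dot, so it matches a literal ".html" rather than any character followed by "html".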
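
To illustrate the difference between the two patterns, here is a small Java sketch. The second URL is a made-up input, assumed only for this comparison, where the dash is followed by a digit and then non-digit text. The class name GroupComparisonDemo and the helper firstGroup are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class GroupComparisonDemo {
    // Returns group 1 of the first match, or null if the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/p-test-test1-a-12345.html";
        String mixed = "http://www.example.com/p-1-test.html"; // made-up input

        // Both patterns agree on the URL from the question.
        System.out.println(firstGroup("-(\\d.*?).html", url));   // prints 12345
        System.out.println(firstGroup("-(\\d+)\\.html", url));   // prints 12345

        // They differ once non-digits follow the first digit.
        System.out.println(firstGroup("-(\\d.*?).html", mixed)); // prints 1-test
        System.out.println(firstGroup("-(\\d+)\\.html", mixed)); // prints null
    }
}
```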

Upvotes: 2
