Reputation: 85
I need to find short URLs in a text post in Java. I have the following regex: "(http://(bit\.ly|t\.co|lnkd\.in|tcrn\.ch).*?)\s"
I have two questions.
The problem with the above expression is that it doesn't match the short URL if it is at the end of the line. For example, the text "blah http://lnkd.in/R9Msf3 blah" gives "http://lnkd.in/R9Msf3 "
But "blah blah http://lnkd.in/R9Msf3" does not give "http://lnkd.in/R9Msf3"
Any suggestions on how to match both patterns? Basically, I just need to strip the short URL out of the text.
Also, is there a better way to match all short URL formats? If I hard-code them, I would have to add every new format to the config.
Upvotes: 2
Views: 2818
Reputation: 14554
Instead of .* use \S* to avoid matching whitespace. You don't need the ? and you can use \b instead of \s to match the boundary between the end of the URL and whitespace or the end of the string:
(http://(bit\.ly|t\.co|lnkd\.in|tcrn\.ch)\S*)\b
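As a rough sketch of how this pattern might be used in Java to strip the short URLs from a post: the class name ShortUrlStripper and the method strip are hypothetical, chosen only for illustration, and the host list is the one from the question. Note that each backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption made for this sketch.
final class ShortUrlStripper {

    // The pattern from the answer, with backslashes doubled for a Java string literal.
    private static final Pattern SHORT_URL =
            Pattern.compile("(http://(bit\\.ly|t\\.co|lnkd\\.in|tcrn\\.ch)\\S*)\\b");

    // Removes every match of SHORT_URL; surrounding whitespace is left untouched.
    static String strip(String text) {
        return SHORT_URL.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // URL in the middle of the text
        System.out.println("[" + strip("blah http://lnkd.in/R9Msf3 blah") + "]");
        // URL at the end of the text
        System.out.println("[" + strip("blah blah http://lnkd.in/R9Msf3") + "]");
    }
}
```

Both inputs lose the URL, whether it sits in the middle or at the end of the text. The whitespace around the URL stays, so a trim or whitespace-collapsing step may be wanted afterwards. One thing to watch for: because of the trailing \b, a URL that ends in a non-word character such as "/" is matched only up to its last word character, so that final character is left behind in the text.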
Upvotes: 2