Reputation: 33
I am trying to convert plain text into links, hashtags and @tags. I have managed to partially do this, but can't find any way of differentiating between a hashtag and a link containing a hash.
I'm new to using regex so it may be a bit messy!
//link
$message = preg_replace('/((http(s)?)(\:\/\/)|(www\.))([a-zA-Z0-9_\-\.\/\&\%\?\=\+\#\:\;\~\[\]\!\,\@\$\'\(\)\*]+)/', '<a href="http$3://$5$6">$0</a>', $message );
//handle
$message = preg_replace('/[@]+([A-Za-z0-9-_]+)/', '<a href="#$1">$1</a>', $message );
//hashtag
$message = preg_replace('/[#]+([A-Za-z0-9-_]+)/', '<a href="#$1">$1</a>', $message );
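The plain text converts to a link as desired, but the link then breaks at the hash, because the hashtag replacement also matches the #header fragment inside the link that was just generated.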
Desired text:
www.hello.com/about_us/test%20page/test-page.php#header?this=12345&that=YES
Actual text:
header?this=12345&that=YES">www.hello.com/about_us/test%20page/test-page.php#header?this=12345&that=YES
Is there any way of checking if the hash is part of a URL before converting it to a hashtag?
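For reference, the behaviour described above can be reproduced with a short script. This is only a minimal sketch that reuses the link and hashtag patterns from the question; the variable names $linked and $broken are made up for illustration.

```php
<?php
// Minimal reproduction using the link and hashtag patterns from the question.
$message = 'www.hello.com/about_us/test%20page/test-page.php#header?this=12345&that=YES';

// Step 1: the link pass wraps the whole URL, fragment included.
$linked = preg_replace(
    '/((http(s)?)(\:\/\/)|(www\.))([a-zA-Z0-9_\-\.\/\&\%\?\=\+\#\:\;\~\[\]\!\,\@\$\'\(\)\*]+)/',
    '<a href="http$3://$5$6">$0</a>',
    $message
);

// Step 2: the hashtag pass runs over the generated HTML, so it also rewrites
// the "#header" inside the href attribute and inside the link text.
$broken = preg_replace('/[#]+([A-Za-z0-9-_]+)/', '<a href="#$1">$1</a>', $linked);

echo $linked, PHP_EOL;
echo $broken, PHP_EOL;
```

The second line of output contains a nested anchor in the middle of the href attribute, which is why the browser shows the tail of the URL twice.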
Upvotes: 2
Views: 296
Reputation: 33
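A solution that worked for me (note that, without the m modifier, the ^ anchors restrict it to a hashtag at the very start of the message):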
$message = preg_replace('/^(?<!http)^(?<!www\.)[#]+([A-Za-z0-9-_]+)/', '<a href="#$1">$1</a>', $message );//#hashtag
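A possible variation, offered as a sketch rather than a definitive fix, is to drop the ^ anchors and only require that the # is not preceded by a non-whitespace character, written as the fixed-length look-behind (?<!\S). This assumes hashtags always follow whitespace or the start of the message; the sample string below is invented for illustration.

```php
<?php
// Sketch: only treat "#" as a hashtag when it starts a word, i.e. when it is
// NOT preceded by a non-whitespace character. Inside a URL the "#" always
// follows another URL character, so it is left alone.
// Assumption: hashtags are preceded by whitespace or the start of the message.
$html = 'See <a href="http://www.hello.com/page.php#header">www.hello.com/page.php#header</a> #news and #php-tips';

$result = preg_replace('/(?<!\S)#+([A-Za-z0-9_-]+)/', '<a href="#$1">$1</a>', $html);

echo $result, PHP_EOL;
```

The trade-off is that a hashtag directly after punctuation, such as (#news), is not converted, so the look-behind may need widening for real input.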
Upvotes: 1
Reputation: 766
Your regex for hashtag is this:
/[#]+([A-Za-z0-9-_]+)/
Your stated goal is to make sure it's not part of a URL, which you identify by:
/https?\:\/\//
You can try to use a negative look-behind:
/(?<!https?\:\/\/)[^#]*[#]+([A-Za-z0-9-_]+)/
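This is not enough for all general cases: PCRE has traditionally required look-behinds to be fixed-length, so PHP may reject https? inside one, and the leading [^#]* becomes part of the match and is overwritten by the replacement. But it sounds like you're trying to solve a problem within a scope under your control (a text file you own or something), so hopefully this will help you.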
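If the look-behind proves too fragile, another option is to match URLs, handles and hashtags in a single pass with preg_replace_callback, so that a # inside a URL is consumed as part of the URL and never reaches the hashtag branch. The following is a rough sketch under assumptions: the function name linkify and the simplified URL pattern are illustrative and not taken from the question.

```php
<?php
// Sketch: one pass over the text with three alternatives. The URL alternative
// comes first, so a "#" inside a URL is swallowed by the URL match.
// Assumptions: linkify() and the simplified URL pattern are illustrative only.
function linkify(string $message): string
{
    $pattern = '~(?<url>(?:https?://|www\.)[^\s<>"]+)'
             . '|@(?<handle>[A-Za-z0-9_-]+)'
             . '|#(?<tag>[A-Za-z0-9_-]+)~';

    return preg_replace_callback($pattern, function (array $m): string {
        if (($m['url'] ?? '') !== '') {
            // Add a scheme for bare "www." links, as the original replacement did.
            $href = stripos($m['url'], 'http') === 0 ? $m['url'] : 'http://' . $m['url'];
            return '<a href="' . htmlspecialchars($href) . '">' . htmlspecialchars($m['url']) . '</a>';
        }
        // Handles and hashtags both link to "#name", as in the question.
        $name = ($m['handle'] ?? '') !== '' ? $m['handle'] : $m['tag'];
        return '<a href="#' . $name . '">' . $name . '</a>';
    }, $message);
}

echo linkify('www.hello.com/about_us/test%20page/test-page.php#header?this=12345&that=YES #news @bob'), PHP_EOL;
```

Because each piece of text is matched at most once, the order of the separate preg_replace calls no longer matters. Note that the sketch also HTML-escapes the URL, so & appears as &amp; in the output.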
Upvotes: 2