Reputation: 16055
I've been struggling with this problem for quite some time now and I just can't seem to find a solution. I have the following regular expression for matching URLs which appears to work flawlessly until I post a bunch of links on new lines without spaces between them.
(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+
I tried this in a couple of regex testers and it seems to pick up URLs correctly, unlike the code in my application. That made me think there must be something wrong with the code, so I started debugging. What I found when I echo'ed the string I'm applying the regular expression to is this:
http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/
I have never seen new lines \r\n appear as text in the browser. This makes me think that there's something else getting its hands on this string. I followed my logic and it turned out that this string comes right from a textarea element into $_POST and is not being manipulated anywhere.
What may be causing those \r\n sequences to appear as text, and how would I go about matching URLs that users may enter separated by new lines?
I'm getting fairly desperate here and would really appreciate your help, guys.
Upvotes: 0
Views: 162
Reputation: 10151
If you are seeing
http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/
when you echo the string, that means that the actual string you are echoing is:
http://www.google.com/\\r\\nhttp://www.google.com/\\r\\nhttp://www.google.com/
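i.e. the backslashes have been escaped, so each \r\n is stored as literal backslash-and-letter text rather than as carriage-return and line-feed characters. Because the string then contains no whitespace at all, the (\S)+ at the end of your pattern runs straight through all three URLs, which means that you are only getting a single match in your regex.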
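To see the difference for yourself, here is a minimal sketch that runs the pattern from the question against both forms of the string. The ~ delimiters and the variable names are my own assumptions, since the question does not show how the pattern is wrapped or called.

```php
<?php
// Pattern copied from the question; the ~ delimiters are an assumption.
$pattern = '~(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+~';

// Single quotes: \r and \n stay as literal backslash + letter (the escaped form).
$escaped = 'http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/';

// Double quotes: \r\n becomes a real carriage return + line feed.
$real = "http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/";

$escapedCount = preg_match_all($pattern, $escaped, $escapedMatches);
$realCount    = preg_match_all($pattern, $real, $realMatches);

echo $escapedCount, "\n"; // 1: (\S)+ swallows the whole string
echo $realCount, "\n";    // 3: real line breaks stop (\S)+ after each URL
```

If your application behaves like the first case, the pattern itself is fine and the input is what needs fixing.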
Check out this question: Why are $_POST variables getting escaped in PHP? for reasons why your requests may be getting escaped.
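If you can't remove the escaping at its source right away, one possible workaround is to turn the literal sequences back into real line breaks before matching. This is only a sketch: normalizeEscapedNewlines and extractUrls are hypothetical helper names, the ~ delimiters are assumed, and the type hints need PHP 7 or later.

```php
<?php
// Hypothetical helper: converts literal "\r\n", "\n" and "\r" text
// (backslash + letter) into a real line feed.
function normalizeEscapedNewlines(string $text): string
{
    // Single-quoted search strings are literal; the longest one goes first
    // so that "\r\n" collapses into one line break instead of two.
    return str_replace(['\r\n', '\n', '\r'], "\n", $text);
}

// Hypothetical helper: applies the pattern from the question to the cleaned text.
function extractUrls(string $text): array
{
    $pattern = '~(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+~';
    preg_match_all($pattern, normalizeEscapedNewlines($text), $matches);
    return $matches[0];
}

// Sample input in the escaped form shown in the question.
$input = 'http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/';
$urls  = extractUrls($input);

print_r($urls);
```

Keep in mind this would also rewrite a URL that really does contain a backslash followed by n or r, so finding and removing whatever escapes the input is still the cleaner fix.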
Upvotes: 2