php_nub_qq

Reputation: 16055

Nasty regex and strange string behavior

I've been struggling with this problem for quite some time now and I just can't seem to find a solution. I have the following regular expression for matching URLs, which appears to work flawlessly until I post several links on separate lines with no spaces between them.

(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+

I tried this in a couple of regex testers and it seems to pick out URLs correctly, unlike the code in my application. That made me think something must be wrong with the code, so I started debugging. This is what I found when I echoed the string I'm applying the regular expression to:

http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/

I have never seen the newline sequence \r\n appear as text in the browser. This makes me think that something else is getting its hands on this string. I followed that line of reasoning, and it turned out that the string comes straight from a textarea element into $_POST and is not manipulated anywhere.

What could be causing the \r\n sequences to appear as text, and how would I go about matching URLs that users enter on separate lines?

I'm getting pretty desperate here and would really appreciate your help, guys.

Upvotes: 0

Views: 162

Answers (1)

DaveJohnston

Reputation: 10151

If you are seeing

http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/

when you echo the string, that means that the actual string you are echoing is:

http://www.google.com/\\r\\nhttp://www.google.com/\\r\\nhttp://www.google.com/

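i.e. the backslashes have been escaped, so the string contains the literal characters \r\n (backslash, r, backslash, n) rather than actual carriage return and line feed characters. Because none of those characters count as whitespace, the trailing (\S)+ in your pattern runs on to the end of the string, which means that you are only getting a single match in your regex.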
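
A minimal sketch of the difference, assuming the pattern from the question is wrapped in `~` delimiters (the delimiter choice and the variable names here are illustrative, not taken from the original code). The double-quoted string holds real CR and LF characters, while the single-quoted one holds the literal text `\r\n`:

```php
<?php
// The asker's pattern; the '~' delimiters are an assumption for this sketch.
$pattern = '~(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+~';

// Double quotes: PHP converts \r\n into real CR and LF characters (whitespace).
$realNewlines = "http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/";

// Single quotes: \r\n stays as four literal characters (backslash, r, backslash, n).
$literalText = 'http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/';

$realCount = preg_match_all($pattern, $realNewlines, $realMatches);
$literalCount = preg_match_all($pattern, $literalText, $literalMatches);

echo $realCount, "\n";    // 3: (\S)+ stops at each real line break
echo $literalCount, "\n"; // 1: (\S)+ swallows the whole string
```

In the second case the single match is the entire input, which matches the behaviour described in the question.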

Check out the question "Why are $_POST variables getting escaped in PHP?" for reasons why your request data may be getting escaped.
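
Finding and removing whatever escapes the input is the cleaner fix. If that is not possible right away, one possible workaround is to turn the literal sequences back into real line breaks before matching. This is only a sketch: `restoreLineBreaks` is a hypothetical helper name, and it assumes the only escaped sequences that matter are `\r\n`, `\n` and `\r`.

```php
<?php
// Hypothetical helper (not from the original post): convert the literal text
// sequences \r\n, \n and \r back into real newline characters.
function restoreLineBreaks(string $text): string
{
    // Single-quoted keys are literal backslash sequences.
    // strtr() with an array tries the longest key first and never rescans its output.
    return strtr($text, ['\r\n' => "\n", '\n' => "\n", '\r' => "\n"]);
}

// The asker's pattern; the '~' delimiters are an assumption for this sketch.
$pattern = '~(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+~';

// Stand-in for the escaped value that arrives in $_POST.
$posted = 'http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/';

$count = preg_match_all($pattern, restoreLineBreaks($posted), $matches);
$urls = $matches[0]; // one entry per URL

echo $count, "\n"; // 3
```

Note that this would also rewrite a URL that legitimately contains the text `\n` or `\r`, so it is a stopgap rather than a substitute for fixing the escaping at its source.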

Upvotes: 2
