Erik Allen

Reputation: 1883

Regular expression to extract cookies from HTTP headers without matching past the end of a line

I was following the answer to a Stack Overflow question on how to get the cookies from a PHP cURL response into a variable. The generally accepted answer uses a regular expression to extract the cookies from every Set-Cookie line in the header.

preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $result, $matches);

Since there can be more than one Set-Cookie header, this will match any and all of them.

However, I’ve found that it assumes every cookie ends with a semicolon. I’ve found no evidence that this is a requirement. Indeed, the web service I'm using returns only one cookie, with no trailing semicolon. So, when I get these headers back:

 HTTP/1.1 200 OK
 Content-Length: 27
 Content-Type: application/json; charset=utf-8
 Server: Microsoft-IIS/7.5
 Access-Control-Allow-Origin: http://localhost
 Set-Cookie: sessionToken=22A2...DB87
 X-Powered-By: ASP.NET
 Date: Tue, 16 Feb 2016 16:28:12 GMT

And I use the parsing code to look at the cookie sessionToken, I get this value:

22A2...DB87
X-Powered-By: ASP.NET
Date: Tue, 16 Feb 2016 16:28:12 GMT

It is basically taking the rest of the headers as part of the cookie. This is not what I am looking for.

I’m not that proud of my regular expression skills, and the changes I have tried to make have not worked. Adding a $ inside the bracketed part didn't help, and adding it at the end made the pattern match nothing.

What should my regular expression look like to prevent it from going past the EOL?


For completeness, here is the PHP I have been using:

preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $header, $matches);

$cookies = array();
foreach($matches[1] as $item)
{
    parse_str($item, $cookie);
    $cookies = array_merge($cookies, $cookie);
}

$sessionToken = $cookies["sessionToken"];

Upvotes: 1

Views: 2090

Answers (1)

anubhava

Reputation: 785058

You should add the newline characters to your negated character class:

/^Set-Cookie:\s*([^;\r\n]*)/mi

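([^;\r\n]*) captures zero or more characters that are not a ;, a \r, or a \n, so the match stops at the end of the line instead of running across the following lines looking for a semicolon. The original [^;]* ran on because a negated character class also matches newline characters unless they are excluded explicitly.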

With this change, capture group #1 will contain sessionToken=22A2...DB87
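
As a minimal sketch of the pattern in context, the snippet below runs it against an abbreviated copy of the headers from the question and then applies the same parsing loop. The variable names ($header, $matches, $cookies) come from the question's code. The \r\n line endings are an assumption about how the raw response arrives, and the token value is left elided exactly as it appears in the question.

```php
<?php
// Abbreviated headers from the question; CRLF line endings are assumed here.
$header = "HTTP/1.1 200 OK\r\n"
    . "Content-Length: 27\r\n"
    . "Set-Cookie: sessionToken=22A2...DB87\r\n"
    . "X-Powered-By: ASP.NET\r\n"
    . "Date: Tue, 16 Feb 2016 16:28:12 GMT\r\n";

// Excluding \r and \n from the class keeps the capture on the Set-Cookie line.
preg_match_all('/^Set-Cookie:\s*([^;\r\n]*)/mi', $header, $matches);

// Same parsing loop as in the question.
$cookies = array();
foreach ($matches[1] as $item) {
    parse_str($item, $cookie);
    $cookies = array_merge($cookies, $cookie);
}

echo $cookies['sessionToken'], "\n"; // prints 22A2...DB87
```

Because both \r and \n are excluded, the same pattern should behave the same way whether the header block uses CRLF or bare LF line endings.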


Upvotes: 2
