Reputation: 16796
I'm dealing with this python bug while writing my own reverse proxy. The server is sending my proxy this Set-Cookie
response header:
workgroup_session_id=ilDJtR0rE1AG28C9ZxKLHj8TBtcT89sw; Path=/; Expires=Sun, 02-Dec-2012 5:57:25 GMT; HttpOnly
I am loading this string into a SimpleCookie
instance from the Cookie
module. Unfortunately, because of the bug that I referenced above, when I later pull expires
out of the morsel dictionary it returns Sun,
. I have found that I can overcome this bug by adding quotes around the Expires
component of the Set-Cookie
header (or adding quotes around any key / value pair that contains spaces in the value).
So this:
workgroup_session_id=ilDJtR0rE1AG28C9ZxKLHj8TBtcT89sw; Path=/; Expires=Sun, 02-Dec-2012 5:57:25 GMT; HttpOnly
Would become:
workgroup_session_id=ilDJtR0rE1AG28C9ZxKLHj8TBtcT89sw; Path=/; Expires="Sun, 02-Dec-2012 5:57:25 GMT"; HttpOnly
And this:
test=a b c; Path=/; Expires=a b c; HttpOnly
Would become:
test="a b c"; Path=/; Expires="a b c"; HttpOnly
I know that I could break the string into tokens and loop through them looking for spaces, then reconstruct the string, but I am curious what the best performing solution would be. As I mentioned, this is a reverse proxy that could potentially handle a few hundred requests a second, so I'd like this substitution to be as fast as possible.
Would a regular expression substitution (pre-compiled of course) be efficient? I've heard that regular expressions are pretty heavy....
Upvotes: 0
Views: 128
Reputation: 336128
How about this regex:
import re
header = re.sub("(?<==)[^;]* [^;]*", r'"\g<0>"', header)
This inserts quotes around whatever follows after a =
until the next ;
(or end of string), but only if there is at least one space in-between.
>>> header = 'test=a b c; Path=/; Expires=a b c; HttpOnly'
>>> re.sub("(?<==)[^;]* [^;]*", r'"\g<0>"', header)
'test="a b c"; Path=/; Expires="a b c"; HttpOnly'
>>> header = "workgroup_session_id=ilDJtR0rE1AG28C9ZxKLHj8TBtcT89sw; Path=/; Expires=Sun, 02-Dec-2012 5:57:25 GMT; HttpOnly"
>>> re.sub("(?<==)[^;]* [^;]*", r'"\g<0>"', header)
'workgroup_session_id=ilDJtR0rE1AG28C9ZxKLHj8TBtcT89sw; Path=/; Expires="Sun, 02-Dec-2012 5:57:25 GMT"; HttpOnly'
Upvotes: 1
Reputation: 4056
Do you need to put quotes just around the date following Expires, or any arbitrary date that appears anywhere in the header? If it's the former, try this:
header = "workgroup_session_id=ilDJtR0rE1AG28C9ZxKLHj8TBtcT89sw; Path=/; Expires=Sun, 02-Dec-2012 5:57:25 GMT; HttpOnly"
print(header.replace('Expires=', 'Expires="').replace('GMT', 'GMT"'))
Upvotes: 1