Levi

Reputation: 12482

Backslashes being added into my cookie in Python

I am working with Python's SimpleCookie and ran into a problem, and I am not sure whether it is something in my syntax. This is classwork for my Python class, meant to teach Python, so it is far from how I would do this in the real world.

I am storing information entered into a form in a cookie, and I am trying to append each new entry to the previous cookie value. But on the third entry of data the cookie suddenly gets backslashes in it, and I am not sure where they are coming from.

This is the type of output I am getting:

"\"\\"\\\\"test:more\\\\":rttre\\":more\":and more"

#!/usr/local/bin/python

import cgi,os,time,Cookie
#error checking
import cgitb
cgitb.enable()

if 'HTTP_COOKIE' in os.environ:
    cookies = os.environ['HTTP_COOKIE']
    cookies = cookies.split('; ')
    for myCookie in cookies:
        myCookie = myCookie.split('=')
        name = myCookie[0]
        value = myCookie[1]
        if name == 'critter' :
            hideMe = value

#import critterClass

#get info from form
form = cgi.FieldStorage()
critterName = form.getvalue('input')
input2 = form.getvalue('input2')
hiddenCookie = form.getvalue('hiddenCookie')
hiddenVar = form.getvalue('hiddenVar')

#make cookie
cookie = Cookie.SimpleCookie()

#set critter Cookie
if critterName is not None:
    cookie['critter'] = critterName
#If already named
else:
    #if action asked, append cookie
    if input2 is not None:
        cookie['critter'] = hideMe+":"+input2
    else:
        cookie['critter'] = "default"

print cookie


print "Content-type: text/html\n\n"



if ((critterName is None) and (input2 is None)):
    print """
    <form name="critter" id="critter" method="post" action="critter.py">
    <label for="name">Name your pet: <input type="text" name="input" id="input" /></label>
    <input type="submit" name="submit" id="submit" value="Submit" />
    </form>
    """
else:
    formTwo ="""
    <form name="critter2" id="critter2" method="post" action="critter.py">
    <label for="name">%s wants to: <input type="text" name="input2" id="input2" /></label>
    <input type="hidden" name="hiddenVar" id="hiddenVar" value="%s" />
    <input type="submit" name="submit" id="submit" value="Submit" />
    </form>
    [name,play,feed,mood,levels,end]
    """
    print formTwo % (critterName,critterName)

if 'HTTP_COOKIE' in os.environ:
    cookies = os.environ['HTTP_COOKIE']
    cookies = cookies.split('; ')
    for myCookie in cookies:
        myCookie = myCookie.split('=')
        name = myCookie[0]
        value = myCookie[1]
        if name == 'critter' :
            print "name"+name
            print "value"+value

Upvotes: 7

Views: 2512

Answers (6)

gimel

Reputation: 86482

As explained by others, the backslashes are escaping double quote characters you insert into the cookie value. The (hidden) mechanism in action here is the SimpleCookie class. The BaseCookie.output() method returns a string representation suitable to be sent as HTTP headers. It will insert escape characters (backslashes) before double quote characters and before backslash characters.

The

print cookie

statement activates BaseCookie.output().

On each trip your string makes through the cookie's output() method, backslashes are multiplied (starting with the 1st pair of quotes).

>>> c1=Cookie.SimpleCookie()
>>> c1['name']='A:0'
>>> print c1
Set-Cookie: name="A:0"
>>> c1['name']=r'"A:0"'
>>> print c1
Set-Cookie: name="\"A:0\""
>>> c1['name']=r'"\"A:0\""'
>>> print c1
Set-Cookie: name="\"\\\"A:0\\\"\""
>>> 
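To see the raw and escaped forms side by side, here is a minimal sketch. It assumes Python 3, where the Cookie module used above is named http.cookies; the cookie name and value are only illustrations. Each Morsel keeps the string you stored in .value and the header form in .coded_value, so the backslashes exist only in the output.

```python
from http.cookies import SimpleCookie  # Python 3 name for Python 2's Cookie module

c = SimpleCookie()
c["name"] = 'say "hi"'  # illustrative value containing double quotes

raw = c["name"].value         # the string exactly as it was stored
wire = c["name"].coded_value  # quoted, escaped form placed in the header
header = c.output()           # the text that printing the cookie emits

print(raw)
print(wire)
print(header)
```

Here raw stays say "hi", while wire and header carry the escaped form "say \"hi\"". If the escaped form is fed back in as a new value, it gets escaped again.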

Upvotes: 3

Jeff Shannon

Reputation: 10153

Others have already pointed out that this is a result of backslash-escapes of quotes and backslashes. I just wanted to point out that if you look carefully at the structure of the output you cite, you can see how the structure is being built here.

The cookie value that you're getting from SimpleCookie is wrapped in quotes -- the (unprocessed) cookie has, e.g.,

`'[...]; critter="value1"; [...]'`

After you split on '; ' and '=', you have a string that contains "value1", quotes included. You then append a new value to that string, so that the string contains "value1":value2.

The next time through, you get that string back, but with another set of quotes wrapping it -- conceptually, ""value1":value2". But in order to make it so that a web browser will not see two quote characters at the beginning and think that's all there is, the inner set of quotes is being escaped, so it's actually returned as "\"value1\":value2".

You then append yet another chunk, make another pass back and forth between server and client, and the next time (because those backslashes need to be escaped now too) you get "\"\\"value1\\":value2\":value3". And so on.

The correct solution, as has already been pointed out, is to let SimpleCookie do the parsing instead of chopping up the strings yourself.
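The build-up can be reproduced without a browser. The following is a rough simulation, assuming Python 3 (where the Cookie module is http.cookies); the helper names, the pet name, and the actions are all made up for illustration. The initial value contains a space so that SimpleCookie is certain to quote it.

```python
from http.cookies import SimpleCookie  # Python 3 name for Python 2's Cookie module

def coded(value):
    """Return the text the server would send after 'critter='."""
    c = SimpleCookie()
    c["critter"] = value
    return c["critter"].coded_value

def read_by_hand(cookie_header):
    # Mimics the question's parsing: split on '=' and keep the quotes.
    return cookie_header.split("=", 1)[1]

def read_with_simplecookie(cookie_header):
    # Let SimpleCookie strip the quotes and undo the escaping.
    return SimpleCookie(cookie_header)["critter"].value

def simulate(reader, actions):
    """Round-trip the cookie once per action, appending ':action' each time."""
    header = "critter=" + coded("my pet")
    for action in actions:
        previous = reader(header)
        header = "critter=" + coded(previous + ":" + action)
    return header

by_hand = simulate(read_by_hand, ["play", "feed"])
parsed = simulate(read_with_simplecookie, ["play", "feed"])

print(by_hand)
print(parsed)
```

With manual splitting the header ends up full of backslashes after two appends, while the SimpleCookie reader yields critter="my pet:play:feed" with none.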

Upvotes: 1

Mike Boers

Reputation: 6745

As others have already said, you are experiencing string escaping issues as soon as you add "and more" onto the end of the cookie.

Until that point, the cookie header is being returned from the SimpleCookie without enclosing quotes. (If there are no spaces in the cookie value, then enclosing quotes are not needed.)

# HTTP cookie header with no spaces in value
Set-Cookie: cookie=value

# HTTP cookie header with spaces in value
Set-Cookie: cookie="value with spaces"

I would suggest using the same SimpleCookie class to parse the incoming cookie header. That saves you from doing it by hand, and it also unescapes the strings properly.

cookies = Cookie.SimpleCookie(os.environ.get('HTTP_COOKIE', ''))
print cookies['critter'].value

edit: This whole deal with the spaces does not apply to this question (although it can in certain circumstances come and bite you when you are not expecting it.) But my suggestion to use the SimpleCookie to parse still stands.
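Applied to the cookie-setting part of the script in the question, that suggestion could look roughly like the sketch below. It assumes Python 3 (http.cookies instead of Cookie). The function name build_critter_cookie and the "default" fallback for a missing cookie are my own choices, not something from the original script.

```python
from http.cookies import SimpleCookie  # Python 3 name for Python 2's Cookie module

def build_critter_cookie(http_cookie, critter_name=None, action=None):
    """Return the Set-Cookie header line for the next response.

    http_cookie is the raw Cookie header, e.g. os.environ.get('HTTP_COOKIE', '').
    """
    incoming = SimpleCookie(http_cookie)
    # .value is already unquoted and unescaped.
    # Falling back to "default" when the cookie is absent is an assumption.
    previous = incoming["critter"].value if "critter" in incoming else "default"

    outgoing = SimpleCookie()
    if critter_name is not None:
        outgoing["critter"] = critter_name
    elif action is not None:
        outgoing["critter"] = previous + ":" + action
    else:
        outgoing["critter"] = "default"
    return outgoing.output()

first = build_critter_cookie("", critter_name="Rex")
second = build_critter_cookie("critter=Rex", action="play fetch")
third = build_critter_cookie('critter="Rex:play fetch"; other=1', action="nap")

print(first)
print(second)
print(third)
```

Because the previous value is read through .value, the appended string never contains the wire-format quotes, so nothing accumulates between requests.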

Upvotes: 2

MarkusQ

Reputation: 21950

Backslashes are used for "escaping" characters in strings that would otherwise have special meaning, in effect depriving them of their special meaning. The classic case is the way you can include quotes in quoted strings, such as:

Bob said "Hey!"

which can be written as a string this way:

"Bob said \"Hey!\""

Of course, you may want to have a regular backslash in there, so "\\" just means a single backslash.

EDIT: In response to your comment on another answer (about using a regexp to remove the slashes) I think you're picking up the wrong end of the stick. The slashes aren't the problem, they are a symptom. The problem is that you're doing round trips treating strings representing quoted strings as if they were plain old strings. Imagine two friends, Bob and Sam, having a conversation:

Bob:  Hey!
Sam:  Did you say "Hey!"?
Bob:  Did you say "Did you say \"Hey!\"?"?

That's why they don't show up until the third time.
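The same conversation can be played out in a few lines of Python. This is only a sketch of the general quoting idea, with a hand-written quote() helper that is not part of any library: it escapes backslashes and double quotes, then wraps the result in quotes, and is applied to its own output each round.

```python
def quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes inside."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'

layers = ["Hey!"]
for _ in range(3):
    layers.append(quote(layers[-1]))  # quote the already-quoted string again

backslash_counts = [layer.count("\\") for layer in layers]

for layer in layers:
    print(layer)
print(backslash_counts)
```

The counts come out as [0, 0, 2, 8]: the first layer of quoting adds no backslashes, the second adds two, and from then on every existing backslash is doubled while every quote gains one more.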

Upvotes: 1

tpdi

Reputation: 35171

The slashes result from escaping the double quotes. Apparently, the first time through, your code is seeing the double quote, and escaping it by adding a back-slash. Then it reads the escaped backslash, and escapes the backslash by prepending it with -- a backslash. Then it reads....

The problem happens when you append the new input to the value read back from the cookie.

Upvotes: 2

unwind

Reputation: 400039

I'm not sure, but it looks like regular Python string escaping. If you have a string containing a backslash or a double quote, for instance, Python will often print it in escaped form, to make the printed string a valid string literal.

The following snippet illustrates:

>>> a='hell\'s bells, \"my\" \\'
>>> a
'hell\'s bells, "my" \\'
>>> print a
hell's bells, "my" \

Not sure if this is relevant, perhaps someone with more domain knowledge can chime in.

Upvotes: 2
