Stripping a character only once in Python

Question

I am parsing values from a file, some of which can be string literals, enclosed in double quotes. To get the actual value I have to strip the double quotes:

>>> raw_value = r'"I am a string"'
>>> processed_value = raw_value.strip('"')
>>> print(processed_value)
I am a string

However, some values contain escaped double quotes, which can be at the end:

>>> raw_value = r'"Simon said: \"Jump!\""'
>>> processed_value = raw_value.strip('"')
>>> print(processed_value)
Simon said: \"Jump!\

You see my problem here: the double quote of the trailing escaped quote is stripped away too, which leaves an orphaned backslash and makes the file unreadable when I write it back. I could do:

def unique_strip(some_str):

    beginning = 1 if some_str.startswith('"') else 0
    end = -1 if some_str.endswith('"') and some_str[-2] != "\\" else None
    return some_str[beginning:end]

Using previous example:

>>> unique_strip(raw_value)
'Simon said: \\"Jump!\\"'
>>> raw_value = r'"Simon said: \"Jump!\"'
>>> unique_strip(raw_value)
'Simon said: \\"Jump!\\"'

So now it even works if the trailing double quote is missing. Is there a more Pythonic way to do this, for example using the built-in strip? If not, is there anything wrong with my method, or any loophole in it?

Update

I guess my function raises IndexError for an input like some_str = '"'. So maybe:

def unique_strip(some_str):

    beginning = 1 if some_str.startswith('"') else 0
    end = -1 if len(some_str) > 1 and some_str.endswith('"') and some_str[-2] != "\\" else None
    return some_str[beginning:end]
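
One way to look for loopholes is to run the updated function on a few edge cases. The sketch below repeats the function so that it runs on its own; the inputs are hypothetical examples chosen for illustration, not values from the original file. The last one is a case the check does not cover: when the backslash before the closing quote is itself escaped, the closing quote is left in place.

```python
def unique_strip(some_str):
    # Same logic as the updated function above.
    beginning = 1 if some_str.startswith('"') else 0
    end = -1 if len(some_str) > 1 and some_str.endswith('"') and some_str[-2] != "\\" else None
    return some_str[beginning:end]


# Hypothetical edge cases, chosen for illustration.
print(repr(unique_strip('"')))        # lone quote: no IndexError any more
print(repr(unique_strip('""')))       # empty string literal
print(repr(unique_strip(r'"C:\\"')))  # escaped backslash, then a real closing quote
```

Whether the last case matters depends on whether the file format allows escaped backslashes at all.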

jf328 · Accepted Answer

The easiest, but not the safest, way is to replace each \" with some string that does not occur anywhere else. Then strip, and replace it back.

raw_value = r'"Simon said: \"Jump!\""'

# Placeholder for each escaped quote; assumed never to occur in the data.
IMPOSSIBLE_STR = '"3'
raw_value.replace(r'\"', IMPOSSIBLE_STR).strip('"').replace(IMPOSSIBLE_STR, r'\"')
Out[102]: 'Simon said: \\"Jump!\\"'

I suppose it is very unlikely for the data to contain a double quote followed by a digit.

A regex will probably solve the problem better, provided you write the correct regex!
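
As a rough sketch of what such a regex could look like (one possible pattern, not the only one): remove a double quote at the very start, and a double quote at the very end only if it is not preceded by a backslash. The name strip_outer_quotes is made up for this example.

```python
import re

# A leading quote, or a trailing quote that is not preceded by a backslash.
_OUTER_QUOTES = re.compile(r'^"|(?<!\\)"\Z')


def strip_outer_quotes(value):
    """Remove at most one unescaped double quote from each end of value."""
    return _OUTER_QUOTES.sub('', value)


print(strip_outer_quotes(r'"Simon said: \"Jump!\""'))
print(strip_outer_quotes(r'"Simon said: \"Jump!\"'))
```

Both calls should print the text with its escaped quotes intact. The negative lookbehind only inspects a single character, so a value ending in an escaped backslash followed by a real closing quote keeps that quote; handling that case would require counting the backslashes.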
