newBike

Reputation: 15002

How to escape special characters

I have the following code to handle Chinese characters and other special characters in a PowerPoint file. I want to use the content of the ppt as the filename when saving, but if the content contains certain special characters an exception is thrown, so I use the code below to strip them.

It works fine under Python 2.7, but when I run it with Python 3.0 I get the following error:

    if not (char in '<>:"/\|?*'):
TypeError: 'in <string>' requires string as left operand, not int

I Googled the error message, but I don't understand how to resolve it. As I understand it, the line if not (char in '<>:"/\|?*'): converts the character to its ASCII code number, right?

Is there an example that shows how to fix this in Python 3?

def rm_invalid_char(self,str):

    final=""
    dosnames=['CON', 'PRN', 'AUX', 'NUL', 'COM1', 'COM2', 'COM3', 'COM4', 'COM5', 'COM6', 'COM7', 'COM8', 'COM9', 'LPT1', 'LPT2', 'LPT3', 'LPT4', 'LPT5', 'LPT6', 'LPT7', 'LPT8', 'LPT9']
    for char in str:
        if not (char in '<>:"/\|?*'):
            if ord(char)>31:
                final+=char
        if final in dosnames:
            #oh dear...
            raise SystemError('final string is a DOS name!')
        elif final.replace('.', '')=='':
            print ('final string is all periods!')
            pass
    return final

Upvotes: 0

Views: 1628

Answers (3)

Eric O. Lebigot

Reputation: 94565

You are passing an iterable whose first element is an integer (232) to rm_invalid_char(). The problem does not lie with this function, but with the caller.

Some debugging is in order: right at the beginning of rm_invalid_char(), add print(repr(str)). You will see that the argument is not a string, contrary to what rm_invalid_char() expects. Adjust the code that runs before rm_invalid_char() is called until the printed value is the string you were expecting.

The problem is likely due to how Python 2 and Python 3 handle strings: in Python 2, str objects are strings of bytes, while in Python 3 they are strings of characters. Iterating over a bytes object in Python 3 yields integers rather than one-character strings.
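
A minimal sketch of that difference, assuming the caller hands over a bytes object (the sample text is made up for illustration; the real input in the question is unknown). Iterating over the bytes gives integers, which reproduces the TypeError, and decoding first gives characters again:

```python
# Hypothetical input: UTF-8 encoded bytes instead of a str.
raw = "café".encode("utf-8")

# In Python 3, iterating over bytes yields integers, not 1-character strings.
first = next(iter(raw))
print(type(first))  # <class 'int'>

# Testing an int for membership in a str raises the error from the question.
raised = False
try:
    first in '<>:"/\\|?*'
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Decoding turns the bytes back into text; iteration then yields characters.
text = raw.decode("utf-8")
print([ch for ch in text])
```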

Upvotes: 0

bbayles

Reputation: 4527

I'm curious why there is something in "str" that is acting like an integer - something strange is going on with the input.

However, I suspect that if you:

  • Change the name of your str value to something else, e.g. char_string, so that it no longer shadows the built-in str type
  • Right after for char in char_string, coerce whatever your input is to a string

then the problem you describe will be solved.
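
One possible reading of these two suggestions, as a sketch rather than a definitive fix. It assumes the unexpected input is a bytes object and that it is UTF-8 encoded (the encoding parameter is an assumption, not something stated in the question). The function is written as a plain function instead of a method, it coerces the whole input up front rather than inside the loop, and the DOS-name check is left out for brevity:

```python
def rm_invalid_char(char_string, encoding="utf-8"):
    """Return char_string without characters that Windows forbids in file names."""
    # Coerce the input to text: bytes are decoded, anything else goes through str().
    if isinstance(char_string, bytes):
        char_string = char_string.decode(encoding)
    else:
        char_string = str(char_string)

    invalid = '<>:"/\\|?*'
    # Keep only permitted characters whose code point is above 31.
    return "".join(
        ch for ch in char_string if ch not in invalid and ord(ch) > 31
    )


print(rm_invalid_char("a<b>c"))
print(rm_invalid_char("中文:标题?".encode("utf-8")))
```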

You might also consider adding a random bit to the end of your generated file name so you don't have to worry about colliding with the DOS reserved names.
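
A small sketch of the random-suffix idea, using the standard uuid module. The helper name with_random_suffix and the 8-digit suffix length are made up for illustration:

```python
import uuid


def with_random_suffix(name):
    # Append 8 random hex digits so the result is never exactly a
    # reserved DOS name such as "CON".
    return "{}_{}".format(name, uuid.uuid4().hex[:8])


print(with_random_suffix("CON"))
```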

Upvotes: 0

KennyV

Reputation: 832

Simple: use this

re.escape(YourStringHere)

From the docs:

Return string with all non-alphanumerics backslashed; this is useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it.
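
For reference, a small sketch of what re.escape() does (the sample string is made up). It backslash-escapes characters so the text can be used inside a regular expression pattern; it does not remove anything from the string:

```python
import re

title = "a.b*c"  # made-up sample string

escaped = re.escape(title)
print(escaped)  # a\.b\*c

# The escaped form is a regex pattern that matches the original text literally.
print(re.fullmatch(escaped, title) is not None)  # True
```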

Upvotes: 1
