Lucas Pereira

Reputation: 569

Python string encoding issue when dealing with accented characters

I have a Python script that prints a string value with an incorrect encoding. I tried assigning the same string to a variable s, and it prints fine. I also printed their types, and both are strings. This is what the code looks like:

s = "\xC3\xBA"
print s
print type(s)

print value
print type(value)

And this is the output:

ú
<type 'str'>
\xC3\xAD
<type 'str'>

The output for value should be ú instead of \xC3\xAD. How come it prints correctly when I assign the same escape sequence to s directly? Does anyone have an idea?

The value is set this way:

apps = data.split('-')
for app in apps:
    app_data = app.split('\n')
    app_new = {}
    for app_field in app_data:
        key_value = app_field.split(':')
        if len(key_value) == 2:
            key = key_value[0].strip().lower()
            value = key_value[1].strip()

Upvotes: 1

Views: 115

Answers (1)

Izkata

Reputation: 9323

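I would guess that your backslashes somehow got escaped as well, so value holds the literal characters \, x, C, 3 and so on rather than the two bytes they stand for: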

In [1]: value = "\\xC3\\xBA"

In [2]: print value
\xC3\xBA

In [3]: type(value)
Out[3]: <type 'str'>
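
If that guess is right, comparing lengths or looking at repr(value) should confirm it, and the literal escapes can be turned back into the intended character. The following is only a sketch under two assumptions: that value contains nothing but ASCII characters, and that the escaped bytes are UTF-8. The helper name unescape_hex_bytes is made up for illustration. On Python 2, which the question uses, value.decode('string_escape') performs the first step directly; the version below is written so that it also runs on Python 3.

```python
from __future__ import print_function


def unescape_hex_bytes(text):
    """Convert literal backslash-x escapes in `text` into the character they encode.

    Hypothetical helper. Assumes `text` is pure ASCII and the escaped
    bytes form valid UTF-8.
    """
    # Step 1: interpret each literal \xNN sequence as the code point U+00NN.
    code_points = text.encode("ascii").decode("unicode_escape")
    # Step 2: map those code points back to raw bytes, then decode as UTF-8.
    return code_points.encode("latin-1").decode("utf-8")


s = "\xC3\xBA"        # the escapes are interpreted by the parser: length 2
value = "\\xC3\\xBA"  # literal backslashes: eight separate characters

print(len(s), len(value))    # 2 8
print(repr(value))           # the doubled backslashes show up here
fixed = unescape_hex_bytes(value)
print(fixed == u"\u00fa")    # True: U+00FA is the accented u
```

It is probably cleaner to find out where the escaping happens (for example, whatever produces data) and fix it there, rather than unescaping afterwards.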

Upvotes: 1
