Reputation: 3
I am trying to run the following code in Python 2:
# -*- coding: utf-8 -*-
import re

data = "&city=Zayas de Báscones;Zayas de Báscones;"
arr = re.findall(ur'[&]{1}\w{4}=[a-zA-ZA-Za-z£€ßçÇáàâäæãåèéêëîïíìôöòóøõûüùúÿñÁÀÂÄÆÃÅÈÉÊËÎÏÍÌÔÖÒÓØÕÛÜÙÚŸÑðÐ]+(?:[\s-][a-zA-ZA-Za-z£€ßçÇáàâäæãåèéêëîïíìôöòóøõûüùúÿñÁÀÂÄÆÃÅÈÉÊËÎÏÍÌÔÖÒÓØÕÛÜÙÚŸÑðÐ]+)*',data)
x = "".join(arr)
x = x.split('&city=')
print x
The result:
['', 'Zayas de B?scones']
How can I get the Unicode character instead of the question mark? I have tried prefixing the regex pattern with 'u' (e.g. u'pattern') and also with 'ur'.
Upvotes: 0
Views: 111
Reputation:
If you try to print x[1]:
print x[1]
#output: Zayas de B?
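Your data is a byte string (str), not unicode, so the non-ASCII á is handled as raw bytes rather than as a character. Now treat your data string as unicode: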
data = u"&city=Zayas de Báscones;Zayas de Báscones;" # set it as unicode
If you print x[1] again:
print x[1]
#output: Zayas de Báscones
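If the data does not come from a literal in your source file, you cannot just add the u prefix, so here is a rough sketch of the same idea with an explicit decode. It assumes the input arrives as UTF-8 bytes (the name raw is made up for illustration), and it swaps your long explicit letter list for [^\W\d_], which matches any Unicode letter but, unlike your class, not symbols such as £ or €. It is written for Python 3 and should also run on Python 2.7.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical input: UTF-8 bytes, e.g. as read from a file or a socket.
raw = u"&city=Zayas de Báscones;Zayas de Báscones;".encode("utf-8")

# Decode to text first, so that "á" is one character instead of two bytes.
data = raw.decode("utf-8")

# With re.UNICODE, \w matches Unicode word characters, so [^\W\d_]
# (a word character that is neither a digit nor an underscore) matches
# any letter, accented or not.
pattern = re.compile(r"&\w{4}=([^\W\d_]+(?:[\s-][^\W\d_]+)*)", re.UNICODE)

# The capture group returns the value directly, so no join/split is needed.
cities = pattern.findall(data)
print(cities[0])
```

Because the pattern captures only the value after the equals sign, cities holds the city name itself rather than the whole "&city=..." match.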
Upvotes: 1