ʞɔıu
ʞɔıu

Reputation: 48446

python re match unicode character

Having trouble matching unicode chars with a regex in python

# -*- coding: utf8 -*-

import re

locations = [
    "15°47'S 47°55'W",
    "21º 18' N, 157º 51' W",
    "32°46′58″N 96°48′14″W",
]

rx = re.compile(ur"""
    ^\d+[°º]
    |
    ^\d+[\xb0\xba]
    """, re.X)

for loc in locations:
    if not rx.match(loc):
        print loc

Result:

15°47'S 47°55'W
21º 18' N, 157º 51' W
32°46′58″N 96°48′14″W

Can't seem to match the unicode chars!

Upvotes: 2

Views: 2126

Answers (1)

kennytm
kennytm

Reputation: 523724

Because the locations aren't unicode strings.

locations = [
    u"15°47'S 47°55'W",
    u"21º 18' N, 157º 51' W",
    u"32°46′58″N 96°48′14″W",
]

Upvotes: 5

Related Questions