Bram Vanroy

Reputation: 28505

Posix classes in regex module Python

I installed the module regex (not re!) for Python 3.4.3 solely to be able to use POSIX classes such as [:graph:]. However, these don't seem to work.

import regex

sentence = "I like math, I divided ÷ the power ³ by ¾"

sentence = regex.sub(r"[^[:graph:]\s]", "", sentence)

print(sentence)

Output: I like math, I divided ÷ the power ³ by ¾

Expected output: I like math, I divided the power by

It does work in PCRE though. So what am I missing here?

Upvotes: 3

Views: 1100

Answers (2)

Joseph Stover

Reputation: 427

I'm not sure about the regex module, but you can get the expected result with the standard re module by spelling out the ASCII range yourself:

import re

sentence = "I like math, I divided ÷ the power ³ by ¾"

sentence = re.sub(r"[^\x21-\x7E\s]", "", sentence)

print(sentence)

There is a nice table at http://www.regular-expressions.info/posixbrackets.html that lists the ASCII equivalent of each POSIX class, which the re module understands. For [:graph:] that equivalent is \x21-\x7E, the range used above.
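If you have several patterns to convert, the substitution can be automated. The following is only a minimal sketch: `POSIX_ASCII` and `expand_posix` are hypothetical names made up for this illustration (not part of re or regex), and the mapping covers just a handful of classes using their usual ASCII equivalents.

```python
import re

# ASCII equivalents of a few POSIX bracket classes, written as the
# inside of a character set (no surrounding brackets).
# Hypothetical lookup table for illustration; extend as needed.
POSIX_ASCII = {
    "graph": r"\x21-\x7E",
    "print": r"\x20-\x7E",
    "alpha": r"a-zA-Z",
    "digit": r"0-9",
    "alnum": r"a-zA-Z0-9",
}


def expand_posix(pattern):
    """Replace each [:name:] in a pattern with its ASCII range.

    Hypothetical helper; raises KeyError for a class missing from the table.
    """
    return re.sub(
        r"\[:(\w+):\]",
        lambda m: POSIX_ASCII[m.group(1)],
        pattern,
    )


sentence = "I like math, I divided ÷ the power ³ by ¾"

# [^[:graph:]\s] becomes [^\x21-\x7E\s], which plain re can compile.
pattern = expand_posix(r"[^[:graph:]\s]")
cleaned = re.sub(pattern, "", sentence)
print(cleaned)
```

Note that this treats [:graph:] as ASCII-only, so every non-ASCII character is removed, including accented letters.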

Upvotes: 1

vks

Reputation: 67978

You need to add the regex.VERSION1 flag. Try:

sentence = regex.sub(r"[^[:graph:]\s]", "", sentence, flags=regex.VERSION1)

Upvotes: 1
