Arindam Roychowdhury

Reputation: 6511

Regular Expression for tackling special symbols

Here is my example:

>>> txt1
'fdf\\.\\..dgg'

I want a regex that returns the special symbols (the backslashes and dots).

So I tried this:

>>> ans=re.search("\w+[\|.]*\w+",txt1)
>>> ans.group()
'fdf'

The first \w+ should match the leading word characters. The [\|.] was supposed to match either a backslash or a dot, and the * was supposed to repeat that for every following symbol. The final \w+ should then match the trailing word characters.

As you can see, this does not work: only 'fdf' is returned. What is wrong here? Or is the concept not what I think it is? Thanks in advance.

Upvotes: 0

Views: 202

Answers (4)

gsbabil

Reputation: 7703

Since you want to find special symbols, re.findall(r"[a-z]*([.\\] ?)[a-z]*", txt1) will return your symbols as a list. You can always join() them as needed (example shown below):

>>> import re
>>> txt1
'fdf\\.\\..dgg'
>>> ans = re.findall(r"[a-z]*([.\\] ?)[a-z]*", txt1)
>>> ans
['\\', '.', '\\', '.', '.']
>>> "".join(ans)
'\\.\\..'

Upvotes: 0

Borodin

Reputation: 126742

You can't use the alternation operator | in a character class. Inside [ ] a pipe stands for a literal pipe character. Your backslash escapes the pipe (unnecessarily), so the class matches pipes or dots, not backslashes. What you want is:

ans=re.search(r"\w+[\\.]*\w+", txt1)
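
A minimal sketch of the difference, assuming the txt1 from the question; the variable names (broken, literal_hits, fixed) are illustrative and not from the original post:

```python
import re

# The question's string; its actual characters are fdf\.\..dgg
txt1 = 'fdf\\.\\..dgg'

# Original pattern: the class [\|.] holds a literal pipe and a dot, no backslash,
# so the match cannot get past the first backslash in txt1.
broken = re.search(r"\w+[\|.]*\w+", txt1)

# The class on its own, run against txt1 plus an appended pipe:
# it picks up the dots and the pipe but none of the backslashes.
literal_hits = re.findall(r"[\|.]", txt1 + "|")

# Corrected pattern: \\ inside the class matches one literal backslash.
fixed = re.search(r"\w+[\\.]*\w+", txt1)

print(broken.group())   # fdf
print(literal_hits)     # ['.', '.', '.', '|']
print(fixed.group())    # fdf\.\..dgg
```

Note that the corrected pattern matches the whole string, symbols and letters together; pulling out only the symbols needs a capture group or findall.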

Upvotes: 1

Aram Kocharyan

Reputation: 20431

If you want to find every run of characters that isn't alphanumeric or an underscore (spaces included), use:

[^\w]+
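
As a rough illustration, here is that pattern applied with re.findall to the question's txt1. The choice of findall and the names runs, shorthand and sample are assumptions for this sketch, not part of the answer:

```python
import re

# The question's string; its actual characters are fdf\.\..dgg
txt1 = 'fdf\\.\\..dgg'

# Each maximal run of non-word characters becomes one list item.
runs = re.findall(r"[^\w]+", txt1)

# \W is the built-in shorthand for the same negated class.
shorthand = re.findall(r"\W+", txt1)

# Underscore is a word character, so only the space is reported here.
sample = re.findall(r"[^\w]+", "a_b c")

print(runs)       # ['\\.\\..']
print(shorthand)  # ['\\.\\..']
print(sample)     # [' ']
```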

Upvotes: 1

eumiro

Reputation: 213005

"I intend to find a regex that will return me the special symbols."

re.search(r"\w+([\\\.]*)\w+", txt1)

captures what you need in ans.group(1):

ans = re.search(r"\w+([\\\.]*)\w+", txt1)
ans.group(1)

# '\\.\\..'

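The [ ] defines a character class (no | "or" is needed inside it). The backslash has to be escaped with another backslash (\\) to match it literally. The dot does not strictly need escaping inside a class, although writing \. there is harmless.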
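
A quick way to check the escaping rules inside a class is to run the pattern with and without the backslash before the dot. This is only a sketch; the names escaped_dot, bare_dot, wildcard and in_class are made up for the comparison:

```python
import re

# The question's string; its actual characters are fdf\.\..dgg
txt1 = 'fdf\\.\\..dgg'

# Class written with an escaped dot, as in the answer.
escaped_dot = re.search(r"\w+([\\\.]*)\w+", txt1).group(1)

# Same class with a bare dot: inside [ ] the dot is already literal.
bare_dot = re.search(r"\w+([\\.]*)\w+", txt1).group(1)

# Outside a class the bare dot is a wildcard; inside it is not.
wildcard = re.findall(r".", "a.b")
in_class = re.findall(r"[.]", "a.b")

print(escaped_dot == bare_dot)  # True
print(wildcard)                 # ['a', '.', 'b']
print(in_class)                 # ['.']
```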

Upvotes: 1
