Reputation: 1474
I've written a simple function that strips a string of all non-alpha characters keeping spaces in place.
Currently it relies on two regular expressions. However, in the interest of brevity, I'd like to reduce those two regexes to one. Is this possible?
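import re

def junk_to_alpha(s):
    # First pass: replace every non-letter character with a space
    reg = r"[^A-Za-z]"
    p = re.compile(reg)
    s = re.sub(p, " ", s)
    # Second pass: collapse runs of whitespace into a single space
    p = re.compile(r"\s+")
    s = re.sub(p, " ", s)
    return s

print(junk_to_alpha("Spoons! 12? \/@# ,.1 12 Yeah? {[]}"))
# Spoons Yeah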
Upvotes: 3
Views: 1745
Reputation: 626870
You may enclose the [^A-Za-z]+ with \s* on both sides:
import re

def junk_to_alpha(s):
    s = re.sub(r"\s*[^A-Za-z]+\s*", " ", s)
    return s

print(junk_to_alpha("Spoons! 12? \/@# ,.1 12 Yeah? {[]}"))
The pattern details:
\s* - zero or more whitespace characters
[^A-Za-z]+ - one or more characters other than ASCII letters
\s* - see above.
Upvotes: 4
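As a quick check of the pattern breakdown above, here is a minimal sketch that runs the single-regex version on the sample string from the question. It also tries a shorter variant, junk_to_alpha_short, which is a hypothetical name and not part of the original answer: since whitespace is itself a non-letter, [^A-Za-z]+ on its own should already absorb the surrounding spaces. The final strip() call is likewise an optional assumption, for cases where the trailing space is unwanted.

```python
import re

# Sample input from the question (backslash escaped explicitly)
SAMPLE = "Spoons! 12? \\/@# ,.1 12 Yeah? {[]}"

def junk_to_alpha(s):
    # Single-regex version from the answer
    return re.sub(r"\s*[^A-Za-z]+\s*", " ", s)

def junk_to_alpha_short(s):
    # Hypothetical shorter variant: whitespace is not a letter,
    # so [^A-Za-z]+ matches it along with the other junk
    return re.sub(r"[^A-Za-z]+", " ", s)

print(repr(junk_to_alpha(SAMPLE)))                # 'Spoons Yeah '
print(repr(junk_to_alpha_short(SAMPLE)))          # 'Spoons Yeah '
print(repr(junk_to_alpha_short(SAMPLE).strip()))  # 'Spoons Yeah'
```

Both versions leave a single trailing space here, because the junk at the end of the input is replaced just like the junk in the middle.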