ebbishop

Reputation: 1983

Regex: match a character except at the beginning of a string

I'm trying to strip a character from a string, unless that character is at the beginning of a string.

So far, my code looks like this:

import re

def strip_string(value):
    return re.sub(r"[^0-9.]", "", value)

# strip_string('1-23') => '123'

I want to remove only the dashes that aren't the first character though:

# strip_string('-1-23') => '-123'

I know how to target dashes that are the first character (r"^-"), but not the inverse.

Is it possible to do this, or do I need to go about it differently?

Upvotes: 2

Views: 1161

Answers (1)

Wiktor Stribiżew

Reputation: 626738

The simplest way to remove a character that is not at the beginning of a string is to use a (?!^) / (?!\A) negative lookahead. However, you can't just use re.sub(r"(?!^)[^0-9.]",'',value): it would keep any non-digit, non-dot character at the start of the string, not only a hyphen, while your scenario implies you only want to keep a hyphen there.
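To illustrate the problem with the lookahead-only pattern, here is a minimal sketch. The helper name naive_strip and the sample inputs are made up for this example and are not part of the question.

```python
import re

def naive_strip(value):
    # Hypothetical helper: remove every character that is neither a digit
    # nor a dot, unless it is the very first character of the string.
    return re.sub(r"(?!^)[^0-9.]", "", value)

print(naive_strip("-1-23"))  # -123
print(naive_strip("a1-23"))  # a123 (the leading "a" is kept as well)
```

The first call looks right, but the second shows that any leading character is kept, not just a hyphen.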

Thus, in Python 3.5 and newer, where a backreference to a group that did not participate in the match is replaced with an empty string, you may use:

re.sub(r"^(-)|[^0-9.]+", r"\1", value)

Or, you may fall back to

re.sub(r"(?!^)-|[^0-9.-]+", "", value)   # This one is somewhat easier to understand
re.sub(r"-(?<!^-)|[^0-9.-]+", "", value) # This one is a bit more efficient

Both -(?<!^-) and (?!^)- match a - that is not at the start of a string.
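As a quick check, the three patterns can be compared side by side. This is only a sketch: the function names and the sample strings are chosen here for illustration.

```python
import re

def strip_with_group(value):
    # A leading hyphen is captured in group 1 and put back by \1. Any other
    # run of characters that are not digits or dots is replaced by the
    # unmatched (empty) group 1. Needs Python 3.5+.
    return re.sub(r"^(-)|[^0-9.]+", r"\1", value)

def strip_with_lookahead(value):
    # Remove a hyphen only when the current position is not the string
    # start, plus any run of characters other than digits, dots, hyphens.
    return re.sub(r"(?!^)-|[^0-9.-]+", "", value)

def strip_with_lookbehind(value):
    # Match a hyphen first, then check that it is not preceded by the
    # string start.
    return re.sub(r"-(?<!^-)|[^0-9.-]+", "", value)

for sample in ("1-23", "-1-23", "-1.5-x2"):
    print(sample, "=>",
          strip_with_group(sample),
          strip_with_lookahead(sample),
          strip_with_lookbehind(sample))
```

For these inputs all three functions return the same string: '123', '-123' and '-1.52' respectively.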

Upvotes: 3
