PolyGeo

Reputation: 1400

Removing leading digits from string using Python?

I see many questions asking how to remove leading zeroes from a string, but I have not seen any that ask how to remove any and all leading digits from a string.

I have been experimenting with combinations of lstrip, type function, isdigit, slice notation and regex without yet finding a method.

Is there a simple way to do this?

For example:

Upvotes: 4

Views: 7962

Answers (3)

spectras

Reputation: 13552

I want to point out that although both Mitch's and RebelWithoutAPulse's answers are correct, they do not do the same thing.

Mitch's answer left-strips any character in the set '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', that is, the ten ASCII digits.

>>> from string import digits
>>> digits
'0123456789'
>>> '123dog12'.lstrip(digits)
'dog12'

RebelWithoutAPulse's answer, on the other hand, left-strips any character known to be a digit, which in Python 3 means any Unicode decimal digit.

>>> import re
>>> re.sub(r'^\d+', '', '123dog12')
'dog12'

So what's the difference? Well, there are two differences:

  • There are many digit characters other than the Indo-Arabic numerals 0 to 9.
  • lstrip is ambiguous for RTL languages. It actually removes leading matching characters, that is, those stored first in the string, which may be displayed on the right side. The regex ^ anchor is more straightforward about this.

Here are a few examples:

>>> '١٩٨٤فوبار٤٢'.lstrip(digits)
'١٩٨٤فوبار٤٢'
>>> re.sub(r'^\d+', '', '١٩٨٤فوبار٤٢')
'فوبار٤٢'

>>> '𝟏𝟗𝟖𝟒foobar𝟒𝟐'.lstrip(digits)
'𝟏𝟗𝟖𝟒foobar𝟒𝟐'
>>> re.sub(r'^\d+', '', '𝟏𝟗𝟖𝟒foobar𝟒𝟐')
'foobar𝟒𝟐'

(Note for the Arabic example: because Arabic is read from right to left, it is correct for the number displayed on the right to be removed.)

So the conclusion is: be sure to pick the right solution for what you are trying to do.
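
To make that choice explicit, here is a minimal sketch that puts both behaviours behind one function. The name strip_leading_digits and the ascii_only parameter are hypothetical and do not come from either answer; the Arabic-Indic digits are written as escapes so the storage order is unambiguous.

```python
import re
from string import digits


def strip_leading_digits(text, ascii_only=False):
    """Remove leading digits from text (hypothetical helper)."""
    if ascii_only:
        # Strips only the ten characters in string.digits ('0123456789').
        return text.lstrip(digits)
    # For str patterns in Python 3, \d matches any Unicode decimal digit.
    return re.sub(r'^\d+', '', text)


# '\u0661\u0669\u0668\u0664' is 1984 written in Arabic-Indic digits.
sample = '\u0661\u0669\u0668\u0664foobar'

print(strip_leading_digits('123dog12'))                  # dog12
print(strip_leading_digits(sample))                      # foobar
print(strip_leading_digits(sample, ascii_only=True) == sample)  # True
```

If you prefer to stay with the regex but still want ASCII-only matching, passing flags=re.ASCII to re.sub restricts \d to 0-9.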

Upvotes: 7

miradulo

Reputation: 29730

A simple way is to use string.digits, which is simply the string '0123456789', as the set of characters to remove with str.lstrip.

>>> from string import digits
>>> s = '123dog12'
>>> s.lstrip(digits)
'dog12'
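
It may help to see that lstrip treats its argument as a set of characters rather than a literal prefix. The inputs below are made up for illustration:

```python
from string import digits

# The argument is a set of characters, so leading digits in any
# order and any quantity are removed.
print('9021dog12'.lstrip(digits))        # dog12

# Stripping stops at the first character not in the set; here that is
# the space, so everything from the space onward is kept.
print(repr('12 34dog'.lstrip(digits)))   # ' 34dog'

# A string with no leading digits comes back unchanged.
print('dog12'.lstrip(digits))            # dog12
```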

Upvotes: 15

RebelWithoutAPulse

Reputation: 442

Using regexes from re:

import re
re.sub(r'^\d+', '', '1234AB456')

This returns:

'AB456'

The pattern replaces one or more digits at the beginning of the string with an empty string.
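
A few edge cases are worth checking when using this pattern. The following is only a sketch; the name LEADING_DIGITS is hypothetical, and compiling the pattern is optional.

```python
import re

# Compile once if the pattern is reused; the raw string keeps \d intact.
LEADING_DIGITS = re.compile(r'^\d+')

# Leading digits are removed; digits later in the string are kept.
print(LEADING_DIGITS.sub('', '1234AB456'))         # AB456

# No leading digits: nothing matches, so the string is unchanged.
print(LEADING_DIGITS.sub('', 'AB456'))             # AB456

# A string made only of digits becomes the empty string.
print(repr(LEADING_DIGITS.sub('', '2024')))        # ''

# Without re.MULTILINE, ^ anchors only at the very start of the
# string, so digits at the start of later lines are left alone.
print(repr(LEADING_DIGITS.sub('', '12ab\n34cd')))  # 'ab\n34cd'
```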

Upvotes: 1
