Mihika

Reputation: 535

Regular expression in Python for detecting numbers that are integers, floats, or in scientific notation

How can I write a regular expression in Python that accepts numbers? The numbers can be integers, floats, or in scientific notation such as 3e+3 or 3e-3.

I want to match only at the beginning of the string. If a number in any of the formats above is present there, I want to get back that number and the rest of the string.

Edit:

For example,

Input>> 290.07abcd Output>> [290.07, abcd]

Input>> abc123 Output>> None

Also, only the first occurrence should be matched.

For example,

Input>> -390-400abc

Output>>[-390, -400abc]

How can I do this using Python? I have tried the following, but it is not giving me the expected output:

import re
r = input()
x = re.search('^[+-]?\d*(\.\d+)?([+-][eE]\d+)?', r)
if x:
    print("x present: ", x.group())
else:
    print(None)

For example,

Input>> 100abc

Output>> x present: 100


Input>> abc100

Output>> x present:

Expected Output>> None

Upvotes: 2

Views: 2064

Answers (3)

Warren Weckesser

Reputation: 114811

Here's one possibility. The pattern for a number is

number_pattern = r"[+-]?((\d+\.\d*)|(\.\d+)|(\d+))([eE][+-]?\d+)?"

The pattern consists of:

  • optional sign;
  • three alternatives for the main part of the number:
    • one or more digits, followed by a decimal point, followed by zero or more digits;
    • a decimal point, followed by one or more digits;
    • one or more digits (no decimal point);
  • optional exponential part, consisting of:
    • e or E;
    • optional sign;
    • one or more digits.
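
The breakdown above maps directly onto a commented pattern. The following is only a minimal sketch, assuming Python's re.VERBOSE flag is acceptable here; the name number_re is illustrative, and non-capturing groups (?:...) stand in for the capturing groups of the pattern above:

```python
import re

# Illustrative name; each line mirrors one item of the description above.
number_re = re.compile(r"""
    [+-]?                # optional sign
    (?:
        \d+\.\d*         # digits, decimal point, zero or more digits
      | \.\d+            # decimal point, one or more digits
      | \d+              # digits only (no decimal point)
    )
    (?:[eE][+-]?\d+)?    # optional exponent: e or E, optional sign, digits
""", re.VERBOSE)

# match() only tries at the start of the string.
print(number_re.match("290.07abcd").group())  # 290.07
print(number_re.match(".5e-2x").group())      # .5e-2
print(number_re.match("abc123"))              # None
```

In verbose mode, whitespace and # comments inside the pattern are ignored, so the layout does not change what is matched.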

The first and third alternatives for the main part of the number can be combined to consist of one or more digits, optionally followed by a decimal point followed by zero or more digits. The number pattern is then

number_pattern = r"[+-]?((\d+(\.\d*)?)|(\.\d+))([eE][+-]?\d+)?"

You can use this to create a function that does what you asked:

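import re

# Group 1 wraps the whole number; the final group captures the rest of the string.
pattern = "(" + number_pattern + ")(.*)"
compiled = re.compile(pattern)

def number_split(s):
    # match() anchors at the start of s, so "abc123" yields None.
    match = compiled.match(s)
    if match is None:
        return None
    groups = match.groups()
    # First group: the number text; last group: everything after it.
    return groups[0], groups[-1]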

Some examples:

In [4]: print(number_split("290.07abcd"))
('290.07', 'abcd')

In [5]: print(number_split("abc123"))
None

In [6]: print(number_split("-390-400abc"))
('-390', '-400abc')

In [7]: print(number_split("0.e-3"))
('0.e-3', '')

In [8]: print(number_split("0x"))
('0', 'x')

In [9]: print(number_split(".123e2"))
('.123e2', '')
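
The question's expected output shows numeric values rather than strings. One way to get there is sketched below, under the assumption that a token without a decimal point or exponent should become an int and anything else a float; parse_leading_number is a hypothetical helper, not part of the function above:

```python
import re

# Same number pattern as above, written as a raw string.
NUMBER = re.compile(r"[+-]?((\d+(\.\d*)?)|(\.\d+))([eE][+-]?\d+)?")

def parse_leading_number(s):
    """Hypothetical helper: return [number, rest], or None if s has no leading number."""
    match = NUMBER.match(s)  # match() only tries at the start of s
    if match is None:
        return None
    text = match.group(0)
    # Assumption: treat the token as an int only if it has no '.' or exponent.
    if any(c in text for c in ".eE"):
        value = float(text)
    else:
        value = int(text)
    # Slice the original string so the remainder is returned unchanged.
    return [value, s[match.end():]]

print(parse_leading_number("290.07abcd"))   # [290.07, 'abcd']
print(parse_leading_number("-390-400abc"))  # [-390, '-400abc']
print(parse_leading_number("abc123"))       # None
```

Slicing at match.end() keeps the whole remainder, including any newlines, whereas a trailing (.*) group stops at the first newline.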

Upvotes: 2

Code Maniac

Reputation: 37755

You can use this pattern:

^[+-]?\d*(\.\d+)?([eE][+-]\d+)?$
  • ^ - Start of string.
  • [+-]? - Optionally matches + or -.
  • \d* - Matches zero or more digits.
  • (\.\d+)? - Optionally matches . followed by one or more digits.
  • ([eE][+-]\d+)? - Optionally matches e or E, followed by + or -, followed by one or more digits (as in 3e+3).
  • $ - End of string, so nothing may follow the number.
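
To see what the ^ and $ anchors imply, the pattern can be tried on a few inputs. This is only a sketch: the variable name anchored is illustrative, and the exponent is assumed to be written as e followed by a sign, as in the question's 3e+3:

```python
import re

# Assumed form of the anchored pattern, with the exponent written as in "3e+3".
anchored = re.compile(r"^[+-]?\d*(\.\d+)?([eE][+-]\d+)?$")

for s in ["100", "290.07", "3e+3", "290.07abcd", "abc100", ""]:
    # bool(...) is True when the whole string fits the pattern.
    print(repr(s), bool(anchored.match(s)))
```

Because every part of the pattern is optional, the empty string matches as well, and the $ anchor rejects inputs such as 290.07abcd where text follows the number.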


Upvotes: 1

Tim Biegeleisen

Reputation: 521194

Try this pattern:

\d+(\.\d+)?(e[+-]\d+)?

This matches:

100
100.123
100e+3
100.123e-3
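
As a rough illustration of how this pattern could be applied to the start of a string, the sketch below uses re.match, which only tries at position 0; split_number is a hypothetical helper name, not something from the answer:

```python
import re

pattern = re.compile(r"\d+(\.\d+)?(e[+-]\d+)?")

def split_number(s):
    """Hypothetical helper: return (number_text, rest), or None if s does not start with a number."""
    m = pattern.match(s)  # match() anchors at the beginning of s
    if m is None:
        return None
    return m.group(0), s[m.end():]

print(split_number("100.123e-3xyz"))  # ('100.123e-3', 'xyz')
print(split_number("100abc"))         # ('100', 'abc')
print(split_number("abc100"))         # None
print(split_number("-390-400abc"))    # None
```

The pattern has no sign part, so an input such as -390-400abc returns None here; prepending [+-]? to the pattern would accept a leading sign.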


Upvotes: 1
