Haytham

Reputation: 854

Match a float number inside a string with regex in Java

I am trying to find a float number that follows a specific word, using regex in Java. My pattern only matches when there is nothing between the word and the number, but I want it to match even when whitespace, other characters, or newlines come in between.

Here is the regex that I wrote:

(?<=TOTAL)([+-]?([0-9]*[.])?[0-9]+)

Example:

69003 LYON 03 ejuodnid 04 72.84.75.20 affm groa TICKET FACTURE 361203- SEPHORA EYE PALET LIG PALE 29991 14.99 Sephora Collection -Prix 392729 SEPHORA CINESCOPE 16.501 328451- SEPHORA THE MASC BIG MASC goe( 6.99193.49 P Sephora Co11ection Prix 347597SEPHORA LING GRENADE NG 25 i5.99 1) 2.99 Sephora Collect1o0 PriX adoy (30 00 1o)o 6.00 oniop20% achats Black Mars 2019 451087 OFFRE 20%ACHATS 15.00 MASC 16.50 3.50 3.00 N'2 24.00 VPBLA 0.00 tnoe 0001* 1eepom TOTAL EUR 62.00
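
The behavior described above can be reproduced with a short sketch. The class name LookbehindRepro and the two test strings are made up for illustration; only the pattern comes from the question. Because the lookbehind requires the number to start immediately after TOTAL, the second input finds nothing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindRepro {
    // Pattern from the question: the number must start right after "TOTAL".
    static final Pattern ORIGINAL =
            Pattern.compile("(?<=TOTAL)([+-]?([0-9]*[.])?[0-9]+)");

    // Returns the matched number, or null when the pattern finds nothing.
    static String match(String input) {
        Matcher m = ORIGINAL.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Number directly after the word: the pattern matches.
        System.out.println(match("TOTAL62.00"));
        // Text between the word and the number: no match, so null is printed.
        System.out.println(match("TOTAL EUR 62.00"));
    }
}
```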

Upvotes: 1

Views: 116

Answers (2)

Zzyzx

Reputation: 531

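I read the question as asking for the first float number after a given word, whatever lies in between. A non-greedy (lazy) wildcard placed before the number does that, and the number ends up in capture group 1. Note that in Java the dot does not match line terminators by default, so compile the pattern with Pattern.DOTALL (or prefix it with (?s)) if newlines can appear between the word and the number.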

(?<=TOTAL).*?([+-]?([0-9]*[.])?[0-9]+)
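
As a rough sketch of how this pattern might be used from Java: the class and method names below are hypothetical, and the Pattern.DOTALL flag is an addition that is not part of the regex above. It is there because the question mentions newlines, which the dot does not match by default.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyWildcardDemo {
    // The answer's pattern; DOTALL (an assumption here) lets ".*?" cross newlines.
    static final Pattern AFTER_TOTAL = Pattern.compile(
            "(?<=TOTAL).*?([+-]?([0-9]*[.])?[0-9]+)", Pattern.DOTALL);

    // Returns the first number after "TOTAL" as text, or null if there is none.
    static String firstNumberAfterTotal(String input) {
        Matcher m = AFTER_TOTAL.matcher(input);
        // Group 1 is the whole number; group 2 is only its optional "digits." prefix.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumberAfterTotal("TOTAL EUR 62.00"));
        System.out.println(firstNumberAfterTotal("TOTAL\nEUR\n62.00"));
    }
}
```

Without the flag, the second call would find nothing, because the lazy wildcard stops at the first line break.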

Upvotes: 1

Ryszard Czech

Reputation: 18621

Use

\bTOTAL\b[\s\S]*?([+-]?\d*\.?\d+)

See proof

Explanation

--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  TOTAL                    'TOTAL'
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  [\s\S]*?                 any character of: whitespace (\n, \r, \t,
                           \f, and " "), non-whitespace (all but \n,
                           \r, \t, \f, and " ") (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [+-]?                    any character of: '+', '-' (optional
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \d*                      digits (0-9) (0 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
    \.?                      '.' (optional (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1

Java code:

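import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Capture the first number after the whole word TOTAL; [\s\S] also matches newlines.
String regex = "\\bTOTAL\\b[\\s\\S]*?([+-]?\\d*\\.?\\d+)";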
String string = "69003 LYON 03 ejuodnid 04 72.84.75.20 affm groa TICKET FACTURE 361203- SEPHORA EYE PALET LIG PALE 29991 14.99 Sephora Collection -Prix 392729 SEPHORA CINESCOPE 16.501 328451- SEPHORA THE MASC BIG MASC goe( 6.99193.49 P Sephora Co11ection Prix 347597SEPHORA LING GRENADE NG 25 i5.99 1) 2.99 Sephora Collect1o0 PriX adoy (30 00 1o)o 6.00 oniop20% achats Black Mars 2019 451087 OFFRE 20%ACHATS 15.00 MASC 16.50 3.50 3.00 N'2 24.00 VPBLA 0.00 tnoe 0001* 1eepom TOTAL EUR 62.00";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}

Result: 62.00
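
If the value is needed as a number rather than as text, the same pattern could be wrapped in a small helper. This is only a sketch: the class NumberAfterWord and the method numberAfter are hypothetical names, and passing the keyword as a parameter (escaped with Pattern.quote) is an assumption that goes beyond the fixed word TOTAL used above.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberAfterWord {
    // Hypothetical helper: first number after the whole word `word`, or empty if none.
    // Assumes `word` starts and ends with a word character so that \b applies.
    static OptionalDouble numberAfter(String text, String word) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b[\\s\\S]*?([+-]?\\d*\\.?\\d+)");
        Matcher m = p.matcher(text);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(1)))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // Same line, text in between.
        System.out.println(numberAfter("TOTAL EUR 62.00", "TOTAL"));
        // Newlines in between and a signed value.
        System.out.println(numberAfter("TOTAL\nEUR\n-3.5", "TOTAL"));
        // SUBTOTAL does not contain TOTAL as a whole word, so the result is empty.
        System.out.println(numberAfter("SUBTOTAL 5", "TOTAL"));
    }
}
```

Returning OptionalDouble instead of printing keeps the "no match" case explicit for the caller.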

Upvotes: 2
