user1989504

Reputation: 143

Same regex search results differ in C++ and Java

The same regex produces different output when I run it with C++ std::regex and with Java's Matcher.

In C++, I use regex the following way:

#include <iostream>
#include <string>
#include <regex>

int main()
{
    std::string lines[] = {"https12e345d"};

    std::regex color_regex("^(?:http|https)([A-Fa-f0-9].*)$");

    for (const auto &line : lines) {
        std::cout << line << ": " 
                  << std::regex_search(line, color_regex) << '\n';
    }   

    std::smatch color_match;
    for (const auto &line : lines) {
        std::regex_search(line, color_match, color_regex);
        std::cout << "matches for '" << line << "'\n";
        for (size_t i = 0; i < color_match.size(); ++i)
            std::cout << i << ": " << color_match[i] << '\n';
    }   
}

Using Java:

import java.util.*;
import java.lang.*;
import java.io.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
    public static final String EXAMPLE_TEST = "https12e345d";
    public static void main (String[] args) throws java.lang.Exception
    {
        Pattern pattern = Pattern.compile("^(?:http|https)([A-Fa-f0-9].*)$");

        Matcher matcher = pattern.matcher(EXAMPLE_TEST);
        // check all occurrences
        while (matcher.find()) {
            System.out.print("Start index: " + matcher.start());
            System.out.print(" End index: " + matcher.end() + " ");
            System.out.println(matcher.group());
        }
    }
}

C++ output is:

https12e345d
12e345d

Java output is:

https12e345d

Is there an issue with the regex?

Upvotes: 1

Views: 1679

Answers (1)

Wiktor Stribiżew

Reputation: 627327

The outputs differ because in the C++ code you iterate over the captured groups of a single match with

for (size_t i = 0; i < color_match.size(); ++i)
        std::cout << i << ": " << color_match[i] << '\n';

Since there are 2 groups, the 0th group (the whole matched text) and the 1st group (the one captured with (...); the (?:...) part is non-capturing and does not count), you get two strings in the output.

With Java code,

while (matcher.find()) {
  System.out.println(matcher.group());
}

you iterate over matches (there is only 1 match, so you get only 1 line of output) and you print the whole matched text, because matcher.group() is the same as matcher.group(0) (in C++, that was color_match[0]). If you want the same output as in C++, add this inside the while loop in the Java code:

System.out.println(matcher.group(1));
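
If you would rather mirror the C++ loop over color_match exactly, you can also loop over the group indices of each match. Below is a minimal sketch of that idea; the class name GroupDemo and the helper collectGroups are made up for illustration and are not part of your code. Note that matcher.groupCount() does not include group 0, so the loop condition uses <=.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical helper: for every match, collects group 0 (the whole match)
    // followed by each capturing group, in index order.
    static List<String> collectGroups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            // groupCount() excludes group 0, hence "<=" rather than "<"
            for (int i = 0; i <= matcher.groupCount(); i++) {
                out.add(matcher.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> groups =
            collectGroups("^(?:http|https)([A-Fa-f0-9].*)$", "https12e345d");
        for (int i = 0; i < groups.size(); i++) {
            System.out.println(i + ": " + groups.get(i));
        }
    }
}
```

With the input from the question this prints "0: https12e345d" and "1: 12e345d", i.e. the same two strings the C++ loop produces.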

Upvotes: 1
