Andriy Drozdyuk

Reputation: 61061

Convert a regex expression to erlang's re syntax?

I am having a hard time converting the following regular expression into Erlang syntax.

What I have is a test string like this:

1,2 ==> 3 #SUP: 1 #CONF: 1.0

And the regex that I created with regex101 is this (see below):

([\d,]+).*==>\s*(\d+)\s*#SUP:\s*(\d)\s*#CONF:\s*(\d+.\d+)

But I get strange match results when I convert it to Erlang. Here is my attempt:

{ok, M} = re:compile("([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+.\\d+)").
re:run("1,2 ==> 3 #SUP: 1 #CONF: 1.0", M).

Also, I get more than four matches. What am I doing wrong?

Here is the regex101 version: https://regex101.com/r/xJ9fP2/1

Upvotes: 3

Views: 555

Answers (2)

rock321987

Reputation: 11032

I don't know much about Erlang, but I will try to explain. With your regex:

> {ok, M} = re:compile("([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+.\\d+)").
> re:run("1,2 ==> 3 #SUP: 1 #CONF: 1.0", M).
{match,[{0, 28},{0,3},{8,1},{16,1},{25,3}]}
         ^^ ^^
         || ||
         || Total number of matched characters from starting index
   Starting index of match
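
To make the {Offset, Length} pairs easier to read, they can be turned back into substrings. This is only a sketch for the shell: the variable names Subject, Pattern, Pairs and Parts are mine, and it assumes the subject is a plain list string as in the question.

```erlang
Subject = "1,2 ==> 3 #SUP: 1 #CONF: 1.0".
Pattern = "([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+.\\d+)".
%% With no capture option, re:run/2 returns a list of {Offset, Length} pairs.
{match, Pairs} = re:run(Subject, Pattern).
%% Offsets are zero-based while lists:sublist/3 is one-based, hence the + 1.
Parts = [lists:sublist(Subject, Offset + 1, Length) || {Offset, Length} <- Pairs].
```

The first element of Parts is the whole matched text, followed by the four captured groups.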

Reason for more than four groups

The first entry always describes the whole substring matched by the complete regex; the remaining four are the captured groups you want. So the result has five entries in total.

([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+.\\d+)
<------->         <---->             <--->              <--------->
First group    Second group       Third group           Fourth group
<----------------------------------------------------------------->
This regex matches entire string and is first match you are getting
                      (Zero'th group)
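
As a small illustration of the numbering, the capture option of re:run/3 also accepts first (group 0 only) or an explicit list of group indices. The variable names below are my own and the snippet is meant for the shell:

```erlang
Subject = "1,2 ==> 3 #SUP: 1 #CONF: 1.0".
Pattern = "([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+.\\d+)".
%% Group 0 only: the text matched by the whole regex.
{match, [Whole]} = re:run(Subject, Pattern, [{capture, first, list}]).
%% Pick individual groups by index: here the first and the fourth.
{match, [Lhs, Conf]} = re:run(Subject, Pattern, [{capture, [1, 4], list}]).
```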

How to find desired answer

Here we want everything except the first entry (the whole match), so we can use the all_but_first capture option to drop it:

> re:run("1,2 ==> 3 #SUP: 1 #CONF: 1.0", M, [{capture, all_but_first, list}]).
{match,["1,2","3","1","1.0"]}
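
If the captured strings are needed as numbers, something like the following shell sketch might do. Two things here are my own assumptions rather than part of the question: the dot in the last group is escaped (\\.) so that it only matches a literal ".", and the result tuple layout and variable names are just for illustration.

```erlang
Line = "1,2 ==> 3 #SUP: 1 #CONF: 1.0".
%% Same pattern as in the question, except "." is escaped as "\\." (my change).
{ok, MP} = re:compile("([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+\\.\\d+)").
{match, [Lhs, Rhs, Sup, Conf]} = re:run(Line, MP, [{capture, all_but_first, list}]).
%% Split "1,2" on commas and convert each piece to an integer.
Items = [list_to_integer(I) || I <- string:tokens(Lhs, ",")].
Rule = {Items, list_to_integer(Rhs), list_to_integer(Sup), list_to_float(Conf)}.
```

In real code the re:run/3 call would normally sit in a case expression so that a nomatch result is handled instead of crashing on the pattern match.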

More info can be found here

Upvotes: 4

Hynek -Pichi- Vychodil

Reputation: 26121

If you are in doubt about what the string actually contains, you can print it and check:

1> RE = "([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+.\\d+)".
"([\\d,]+).*==>\\s*(\\d+)\\s*#SUP:\\s*(\\d)\\s*#CONF:\\s*(\\d+.\\d+)"
2> io:format("RE: /~s/~n", [RE]).
RE: /([\d,]+).*==>\s*(\d+)\s*#SUP:\s*(\d)\s*#CONF:\s*(\d+.\d+)/
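
The doubled backslashes matter because Erlang string literals have their own escape sequences. A few shell checks, added here only as an illustration, show what ends up in the string:

```erlang
%% "\\d" is two characters: a backslash followed by the letter d.
2 = length("\\d").
[$\\, $d] = "\\d".
%% With a single backslash the Erlang parser consumes the escape itself:
%% "\d" is the DEL character (127) and "\s" is a space, so the regex
%% engine would never see \d or \s.
[127] = "\d".
" " = "\s".
```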

For the rest of the issue, there is a great answer by rock321987.

Upvotes: 0
