Reputation: 1747
With the Text.Regex.Posix module, I can check whether a string matches a regular expression, but I don't know how to capture elements of the string.
For example, in F# with FSharpx I can capture three elements this way:
Match @"(?i:MAIL\s+FROM:\s*<([a-zA-Z0-9]+)@([a-zA-Z0-9]+(\.[a-zA-Z0-9]+)+)>\s*(SIZE=([0-9]+))*)" mailMatch ->
I can capture
([a-zA-Z0-9]+) by mailMatch.Groups.[0].ToString()
([a-zA-Z0-9]+(\.[a-zA-Z0-9]+)+) by mailMatch.Groups.[1].ToString()
([0-9]+) by mailMatch.Groups.[2].ToString()
but I don't know how to do this in Haskell.
I need an example, thanks!
Upvotes: 2
Views: 956
Reputation: 476594
First of all, the regex you show is, as far as I know, not a POSIX regex; the (?i:...) group, for instance, is Perl-style syntax. So you should import Text.Regex.PCRE instead of Text.Regex.Posix, since PCRE supports a more extended regex syntax.
Secondly, the backslashes in the regex must be escaped inside a Haskell string literal, so you should rewrite:
regex = "(?i:MAIL\s+FROM:\s*<([a-zA-Z0-9]+)@([a-zA-Z0-9]+(\.[a-zA-Z0-9]+)+)>\s*(SIZE=([0-9]+))*)"
into:
regex = "(?i:MAIL\\s+FROM:\\s*<([a-zA-Z0-9]+)@([a-zA-Z0-9]+(\\.[a-zA-Z0-9]+)+)>\\s*(SIZE=([0-9]+))*)"
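As a small check of the escaping rule, using only base (the names regexFragment and spelledOut are made up for illustration): inside a Haskell string literal, \\ denotes a single backslash, whereas \s and \. are not valid escape sequences and are rejected by the lexer. The sketch below shows that the escaped literal holds exactly the characters the regex engine should see.

```haskell
-- A fragment of the regex, written as a Haskell string literal.
-- "\\s" in the source is the two characters '\' and 's' at runtime.
regexFragment :: String
regexFragment = "MAIL\\s+FROM:"

-- The same string spelled character by character, for comparison.
spelledOut :: String
spelledOut = ['M','A','I','L','\\','s','+','F','R','O','M',':']
```

In GHCi, putStrLn regexFragment prints MAIL\s+FROM: and regexFragment == spelledOut evaluates to True.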
Now we can use the (=~) operator:
Prelude Text.Regex.PCRE> "MAIL FROM: <foo@bar.com> SIZE=1" =~ regex :: [[String]]
[["MAIL FROM: <foo@bar.com> SIZE=1","foo","bar.com",".com","SIZE=1","1"]]
Here we specify that the result type is a list of lists of strings, [[String]]. Every sublist is one match of the regex, so if the text contains three matches, we get three sublists. Each sublist holds the captures for that match: the first element is the full match, the second element is capture group 1, and so on.
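To make that shape concrete, here is a minimal sketch that works on a hand-written value of type [[String]] rather than calling the regex library. The second row of sampleMatches and the helper names fullMatches and captureGroups are assumptions made up for this illustration.

```haskell
-- Hand-written sample in the shape that (=~) returns at type [[String]]:
-- one inner list per match, full match first, then the capture groups.
-- The second row is hypothetical data, not output from a real run.
sampleMatches :: [[String]]
sampleMatches =
  [ ["MAIL FROM: <foo@bar.com> SIZE=1", "foo", "bar.com", ".com", "SIZE=1", "1"]
  , ["MAIL FROM: <baz@qux.org> SIZE=22", "baz", "qux.org", ".org", "SIZE=22", "22"]
  ]

-- The full text of every match (element 0 of each inner list).
fullMatches :: [[String]] -> [String]
fullMatches ms = [full | (full : _) <- ms]

-- Only the capture groups of every match (everything after element 0).
captureGroups :: [[String]] -> [[String]]
captureGroups = map (drop 1)
```

Here length sampleMatches is the number of matches, and each list in captureGroups sampleMatches has one entry per parenthesised group in the regex.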
If you know for sure that there will only be one match, you can for instance use:
[[_,user,domain,topdomain,_,size]] = "MAIL FROM: <foo@bar.com> SIZE=1" =~ regex :: [[String]]
Then the result is:
Prelude Text.Regex.PCRE> user
"foo"
Prelude Text.Regex.PCRE> domain
"bar.com"
Prelude Text.Regex.PCRE> topdomain
".com"
Prelude Text.Regex.PCRE> size
"1"
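Mind that this kind of pattern matching is partial: it raises a runtime error when one of the bound names is evaluated and the string did not match exactly once, or the number of groups differs. It is therefore better to use a safer, total solution in your program, for instance a case expression that also handles the non-matching case.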
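One possible total variant is sketched below. It is only an illustration: the MailFrom record and the function mailFromMatches are hypothetical names, and the function takes the [[String]] value produced by (=~) as its input instead of calling the regex library itself. It returns Nothing unless there is exactly one match with at least the user and domain groups, and it treats a missing or non-numeric size as Nothing.

```haskell
import Text.Read (readMaybe)

-- Hypothetical record for the pieces we care about.
data MailFrom = MailFrom
  { mailUser   :: String
  , mailDomain :: String
  , mailSize   :: Maybe Int
  } deriving (Eq, Show)

-- Total replacement for the irrefutable [[_,user,domain,...]] binding.
-- Exactly one match is accepted; anything else yields Nothing.
mailFromMatches :: [[String]] -> Maybe MailFrom
mailFromMatches [(_full : u : d : rest)] = Just (MailFrom u d (sizeFrom rest))
  where
    -- rest holds the top-level-domain group, the SIZE=... group,
    -- and finally the digits of the size, if those groups are present.
    sizeFrom (_tld : _sizeClause : s : _) = readMaybe s
    sizeFrom _                            = Nothing
mailFromMatches _ = Nothing
```

With regex-pcre in scope, the call site would presumably look like mailFromMatches (input =~ regex), where the argument type fixes the result of (=~) to [[String]]; the caller then has to handle the Nothing case explicitly.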
Upvotes: 5