Reputation: 1027
This is the string that I want to parse: 2 Sep 27 Sep 28 SOME TEXT HERE 35.00
I want to parse it into a list so that the values look like:
list[0] = 'Sep 28'
list[1] = 'SOME TEXT HERE'
list[2] = '35.00'
The RegEx that I've been working on:
^\d{1}\s{1}[a-zA-Z]{3}\s{1}\d{2}\s{1}([a-zA-Z]{3}\s{1}\d{2})\s{1}([a-zA-Z0-9]*\s{1})+(\d+.\d+)
My values are:
list[0] = 'Sep 28'
list[1] = 'HERE'
list[2] = '35.00'
The list[1] value is off. I'm also probably not parsing the spaces right, but I couldn't find any guidance in the "Pickaxe" book or online.
Upvotes: 0
Views: 470
Reputation: 7291
Your problem is in your second capture group:
([a-zA-Z0-9]*\s{1})+
The parenthesized group is repeated, matching each of the words 'SOME', 'TEXT', and 'HERE' individually, leaving your second capture group with only the final match, 'HERE'.
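To see this in action, here is a small sketch that runs the pattern from the question against the sample line. The variable names (line, repeated_capture) are just illustrative, not anything from the original post.

```ruby
line = '2 Sep 27 Sep 28 SOME TEXT HERE 35.00'

# The question's pattern: the second capture group itself is repeated with +.
repeated_capture = /^\d{1}\s{1}[a-zA-Z]{3}\s{1}\d{2}\s{1}([a-zA-Z]{3}\s{1}\d{2})\s{1}([a-zA-Z0-9]*\s{1})+(\d+.\d+)/

match = line.match(repeated_capture)

# Each repetition overwrites group 2, so only the last word survives
# (along with the trailing space that \s{1} consumed).
p match[1] # => "Sep 28"
p match[2] # => "HERE "
p match[3] # => "35.00"
```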
You need to put the + inside the capturing parentheses, and use non-capturing parentheses (?:...) to enclose your existing group. Non-capturing parentheses, which use (?: to start the group and ) to end the group, are a way in a regular expression to group parts of your match together without capturing the group. You can use repetition operators (+, *, {n}, or {n,m}) on a non-capturing group and then capture the entire expression:
((?:[a-zA-Z0-9]*\s{1})+)
In total:
/^\d{1}\s{1}[a-zA-Z]{3}\s{1}\d{2}\s{1}([a-zA-Z]{3}\s{1}\d{2})\s{1}((?:[a-zA-Z0-9]*\s{1})+)(\d+.\d+)/
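As a quick check, something like the following should produce the list the question asks for. Note that the second capture picks up a trailing space (the last \s{1} sits inside the captured group), so this sketch strips each field; the strip step and the variable names are my additions, not part of the regex itself.

```ruby
line = '2 Sep 27 Sep 28 SOME TEXT HERE 35.00'

# Same pattern as above: the repetition now lives inside a non-capturing
# group, and the outer parentheses capture the whole run of words.
fixed = /^\d{1}\s{1}[a-zA-Z]{3}\s{1}\d{2}\s{1}([a-zA-Z]{3}\s{1}\d{2})\s{1}((?:[a-zA-Z0-9]*\s{1})+)(\d+.\d+)/

raw_fields = line.match(fixed).captures
p raw_fields # => ["Sep 28", "SOME TEXT HERE ", "35.00"]

# Strip the trailing space that the second group captured.
fields = raw_fields.map(&:strip)
p fields # => ["Sep 28", "SOME TEXT HERE", "35.00"]
```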
As a side note, this is a pretty clunky regex. You never need to specify {1} in a regex, as a single match is the default. Similarly, \d\d is one character less typing than \d{2}. Also, you probably just want \w instead of [a-zA-Z0-9] (keep in mind that \w also matches the underscore). Since you don't seem to care about case, you probably just want to use the /i option and simplify the letter character classes. Note too that an unescaped . matches any character, so the decimal point should be written \. to match only a literal dot. Something like this is a more idiomatic regular expression:
/^\d [a-z]{3} \d\d ([a-z]{3} \d\d) ((?:\w* )+)(\d+\.\d+)/i
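Here is a rough sketch of the shorter pattern in use. It compares two variants of the final group: one with a bare . and one with the decimal point escaped as \. (an unescaped . matches any character). The malformed sample line and the names loose, strict, and bad_line are made up for illustration.

```ruby
line     = '2 Sep 27 Sep 28 SOME TEXT HERE 35.00'
bad_line = '2 Sep 27 Sep 28 SOME TEXT HERE 35x00' # hypothetical malformed amount

# Bare dot: accepts any character between the digit runs.
loose  = /^\d [a-z]{3} \d\d ([a-z]{3} \d\d) ((?:\w* )+)(\d+.\d+)/i
# Escaped dot: accepts only a literal decimal point.
strict = /^\d [a-z]{3} \d\d ([a-z]{3} \d\d) ((?:\w* )+)(\d+\.\d+)/i

# Both variants parse the well-formed line the same way.
parsed = line.match(strict).captures.map(&:strip)
p parsed # => ["Sep 28", "SOME TEXT HERE", "35.00"]

# Only the loose variant lets the malformed amount through.
p loose.match?(bad_line)  # => true
p strict.match?(bad_line) # => false
```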
Finally, though the Ruby documentation for regular expressions is a little thin, Ruby uses somewhat standard Perl-compatible regular expressions, and you can find more information about regular expressions generally at regular-expressions.info.
Upvotes: 4
Reputation: 381
You may have already tried this tool, but I would highly recommend Rubular. It lets you quickly test a regular expression against sample strings.
It looks like you already got the specific answer to your question, so I just wanted to drop this in for other people coming by so they can know where to go test their regex or just practice.
Upvotes: 1