Reputation: 633
I'm currently building a code generator that produces boilerplate for me once I write the templates and/or translations, in whatever language I have to work with.
I have a problem with a regex in Ruby. The regex aims to select whatever is between {{{ and }}}, so I can generate functions according to my needs.
My regex is currently:
/\{\{\{(([a-zA-Z]|\s)+)\}\}\}/m
My test data set is:
{{{Demande aaa}}} => {{{tagadatsouintsouin tutu}}}
The results are:
[["Demande aaa", "a"], ["tagadatsouintsouin tutu", "u"]]
Each time, the regex returns the last character a second time, as an extra capture. That's not exactly what I want; I need something more like this:
/\{\{\{((\w|\W)+)\}\}\}/m
But this has a flaw too, the results are:
[["Demande aaa}}} => {{{tagadatsouintsouin tutu", "u"]]
Whereas, I wish to get:
[["Demande aaa"],["tagadatsouintsouin tutu"]]
How do I correct these regexes? I could use two sets of delimiters, but it won't teach me anything.
Edit:
All your regexes work against my data sample, so you all have a point.
Regexes may be, and probably are, overkill for my purpose. So I have two questions.
First, do the regexes preserve the exact indentation? This should be able to handle whole functions.
Second, is there something better suited to that task?
Detailed explanation of the purpose of this tool: I'm tired of writing boilerplate code in PHP/Symfony, so I want to generate it from templates.
My intent is to build some views, some controllers, and even parts of the model this way.
Practical example: in my model, I want to generate some functions according to the type of an object's attribute. For example, I have functions that display money correctly. So I need to build the correct function, according to my attribute, and then put it inside my output file.
So there are some translations which themselves need translations.
So, a fictitious example:
{{{euro}}} => {{{ function getMyAttributeEuro()
{
return formating($this->get[[MyAttribute]]);
} }}}
In order to store my translations, should I use regex, like
I want to build something a bit clever, so it can generate most of the basic code without bugs, and I can work on the interesting code.
Upvotes: 0
Views: 162
Reputation: 5452
Just a shot
/\{\{\{([\w\W]+?)\}\}\}/
Added non-greediness to your regex
Here this seems to work
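To make the difference concrete, here is a small sketch run against the sample data from the question. The variable names `greedy` and `lazy` are just illustrative; the only change between the two patterns is the `?` after `+`.

```ruby
data = '{{{Demande aaa}}} => {{{tagadatsouintsouin tutu}}}'

# Greedy: + grabs as much as possible, so the match runs to the LAST }}}.
greedy = data.scan(/\{\{\{([\w\W]+)\}\}\}/)

# Lazy: +? stops at the FIRST }}} that lets the match succeed.
lazy = data.scan(/\{\{\{([\w\W]+?)\}\}\}/)

p greedy  # [["Demande aaa}}} => {{{tagadatsouintsouin tutu"]]
p lazy    # [["Demande aaa"], ["tagadatsouintsouin tutu"]]
```

Since `[\w\W]` matches any character including newlines, this pattern does not need the /m modifier to span several lines.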
Upvotes: 1
Reputation: 160551
I'm partial to:
data = '{{{Demande aaa}}} => {{{tagadatsouintsouin tutu}}}'
data.scan(/\{{3}(.+?)}{3}/).flatten.map{ |r| r.squeeze(' ') }
=> ["Demande aaa", "tagadatsouintsouin tutu"]
or:
data.scan(/\{{3}(.+?)}{3}/).flatten.map{ |r| [ r.squeeze(' ') ] }
=> [["Demande aaa"], ["tagadatsouintsouin tutu"]]
or:
data.scan(/\{{3}(.+?)}{3}/).map{ |r| [ r[0].squeeze(' ') ] }
=> [["Demande aaa"], ["tagadatsouintsouin tutu"]]
if you need the sub-arrays.
I'm not big on trying to do everything possible inside the regex. I prefer to keep it short and sweet, then polish the output once I've found what I was looking for. It's a maintenance issue, because regexes make my head hurt, and I stopped thinking of them as a macho thing years ago. Regexes are a very useful tool, but too often they are seen as the answer to every problem, which they're not.
Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.
Upvotes: 2
Reputation: 168081
You want non-capturing groups (?:...), but here is another way.
/\{\{\{(.*?)\}\}\}/m
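A rough sketch of both suggestions follows. The multi-line `template` string is a made-up stand-in modelled on the fictitious example in the question, and `key` and `body` are illustrative names. It also suggests that the captured text comes back with its newlines and indentation untouched.

```ruby
data = '{{{Demande aaa}}} => {{{tagadatsouintsouin tutu}}}'

# (?:...) groups the alternation without adding a second capture to each result.
single = data.scan(/\{\{\{((?:[a-zA-Z]|\s)+)\}\}\}/)
p single  # [["Demande aaa"], ["tagadatsouintsouin tutu"]]

# Hypothetical multi-line template, modelled on the question's example.
template = "{{{euro}}} => {{{ function getEuro()\n{\n    return 1;\n} }}}"

# With /m the dot also matches "\n", so the lazy capture can span lines.
key, body = template.scan(/\{\{\{(.*?)\}\}\}/m).flatten

p key   # "euro"
p body  # " function getEuro()\n{\n    return 1;\n} "
```

One caveat with this approach: the lazy match ends at the first }}} it meets, so a function body that itself contains three consecutive closing braces would be cut short.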
Upvotes: 1
Reputation: 336108
You have one set of capturing parentheses too many.
/\{\{\{([a-zA-Z\s]+)\}\}\}/
Also, you don't need the /m modifier because there is no dot (.) in your regex whose behaviour would be affected by it.
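As a quick check of both points, the sketch below compares the original pattern with the character-class version, with and without /m. The variable names are only illustrative.

```ruby
data = '{{{Demande aaa}}} => {{{tagadatsouintsouin tutu}}}'

# Original pattern: the inner group is repeated, so it keeps only the text
# of its last iteration, which is the stray "a" / "u" in each result.
original = data.scan(/\{\{\{(([a-zA-Z]|\s)+)\}\}\}/m)
p original  # [["Demande aaa", "a"], ["tagadatsouintsouin tutu", "u"]]

# One capture group, with the alternation folded into a character class.
with_m    = data.scan(/\{\{\{([a-zA-Z\s]+)\}\}\}/m)
without_m = data.scan(/\{\{\{([a-zA-Z\s]+)\}\}\}/)

p without_m            # [["Demande aaa"], ["tagadatsouintsouin tutu"]]
p with_m == without_m  # true

# \s already matches newlines, so a capture can span lines without /m.
p "{{{a\nb}}}".scan(/\{\{\{([a-zA-Z\s]+)\}\}\}/)  # [["a\nb"]]
```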
Upvotes: 4