Reputation: 633

Ruby, regex, sentences

I'm currently building a code generator that produces boilerplate for me from the templates and/or translations I write, in whatever language I have to work with.

I have a problem with a regex in Ruby. The regex aims to select whatever is between {{{ and }}}, so I can generate functions according to my needs.

My regex is currently:

/\{\{\{(([a-zA-Z]|\s)+)\}\}\}/m

My test data set is:

{{{Demande    aaa}}} => {{{tagadatsouintsouin    tutu}}}

The results are:

[["Demande aaa", "a"], ["tagadatsouintsouin tutu", "u"]]

Each time, the regex also returns the last character as a second capture. That's not exactly what I want; I need something more like this:

/\{\{\{((\w|\W)+)\}\}\}/m

But this has a flaw too, the results are:

[["Demande aaa}}} => {{{tagadatsouintsouin tutu", "u"]]

Whereas, I wish to get:

[["Demande aaa"],["tagadatsouintsouin tutu"]]

How do I correct these regexes? I could use two sets of delimiters, but it won't teach me anything.

Edit:

All of your regexes work against my data sample, so you all have a point.

Regexes are probably overkill for my purpose, so I have two questions.

First, do the regexes preserve the exact indentation? This should be able to handle whole functions.

Second, is there something better suited to that task?

Detailed explanation of the purpose of this tool: I'm tired of writing boilerplate code in PHP with Symfony, so I want to generate it from templates.

My intent is to build some views, some controllers, and even parts of the model this way.

Practical example: in my model, I want to generate some functions according to the type of an object's attribute. For example, I have functions that display money correctly. So I need to build the correct function for my attribute, and then put it inside my output file.

So there are some translations which themselves need translations.

So, a fictitious example:

{{{euro}}} => {{{ function getMyAttributeEuro()
 {
   return formating($this->get[[MyAttribute]]);
 } }}}

In order to store my translations, should I use regex, like

I want to build something a bit clever, so it can generate most of the basic code with no bugs. That way I can work on the interesting code.

Upvotes: 0

Answers (4)

Gabber

Reputation: 5452

Just a shot

/\{\{\{([\w\W]+?)\}\}\}/

Added non-greediness to your regex.

Here this seems to work
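To make the difference concrete, here is a minimal sketch that runs both patterns over the sample data from the question. The variable names `nested` and `lazy` are only illustrative. It assumes the extra "a" and "u" in the original results come from the repeated inner group, which reports only its last iteration, and that a single character class with a lazy quantifier avoids both that and the over-matching.

```ruby
data = '{{{Demande    aaa}}} => {{{tagadatsouintsouin    tutu}}}'

# Pattern from the question: the inner group ([a-zA-Z]|\s) is repeated,
# so scan reports its last iteration (the final character) as a second capture.
nested = data.scan(/\{\{\{(([a-zA-Z]|\s)+)\}\}\}/m)

# One character class needs no inner group, and +? stops at the first }}}.
lazy = data.scan(/\{\{\{([\w\W]+?)\}\}\}/)

p nested.map(&:size) # two captures per match
p lazy.map(&:size)   # one capture per match
p lazy.flatten
```

If the alternation is really needed, a non-capturing group such as `(?:[a-zA-Z]|\s)+` should also drop the second capture.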

Upvotes: 1

the Tin Man

Reputation: 160551

I'm partial to:

data = '{{{Demande    aaa}}} => {{{tagadatsouintsouin    tutu}}}'
data.scan(/\{{3}(.+?)}{3}/).flatten.map{ |r| r.squeeze(' ') }
=> ["Demande aaa", "tagadatsouintsouin tutu"]

or:

data.scan(/\{{3}(.+?)}{3}/).flatten.map{ |r| [ r.squeeze(' ') ] }
=> [["Demande aaa"], ["tagadatsouintsouin tutu"]]

or:

data.scan(/\{{3}(.+?)}{3}/).map{ |r| [ r[0].squeeze(' ') ] }
=> [["Demande aaa"], ["tagadatsouintsouin tutu"]]

if you need the sub-arrays.

I'm not big on trying to do everything possible inside the regex. I prefer to keep it short and sweet, then polish the output once I've found what I was looking for. It's a maintenance issue, because regexes make my head hurt, and I stopped thinking of them as a macho thing years ago. Regexes are a very useful tool, but too often they are seen as the answer to every problem, which they're not.
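On the indentation question from the edit, here is a rough sketch, assuming the translation file uses the `{{{key}}} => {{{body}}}` layout from the fictive example. The names `PAIR`, `template` and `translations` are made up for illustration. In Ruby, `.` does not match newlines unless the `/m` flag is set, so the flag is needed for multi-line bodies. The capture is returned exactly as it appears in the input, so indentation survives as long as nothing like `squeeze` is applied afterwards.

```ruby
# Sample input copied from the question's fictive example.
template = <<~'TEXT'
  {{{euro}}} => {{{ function getMyAttributeEuro()
   {
     return formating($this->get[[MyAttribute]]);
   } }}}
TEXT

# /m lets . match newlines; .+? still stops at the first }}} it meets.
PAIR = /\{{3}(.+?)\}{3}\s*=>\s*\{{3}(.+?)\}{3}/m

# Each match yields [key, body]; to_h turns the pairs into a lookup table.
translations = template.scan(PAIR).to_h

puts translations.keys.inspect
puts translations['euro']
```

One caveat with this approach: the lazy match ends at the first `}}}`, so a body whose own closing brace sits directly against the delimiter (`}}}}`) would lose that brace. Keeping a space before the closing `}}}`, as the example does, sidesteps it.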