Reputation: 2992
I apologize for the very specific issue I'm posting here, but I hope it will help others who run across it. I have a string that is being formatted like this:
[[,action1,,],[action2],[]]
I would like to translate this to valid YAML so that it can be parsed. The result would look like this:
[['','action1','',''],['action2'],['']]
I've tried a bunch of regular expressions to accomplish this but I'm afraid that I'm at a complete loss. I'm ok with running multiple expressions if needed. For example (ruby):
puts s.gsub!(/,/,"','") # => [[','action1','',']','[action2]','[]]
puts s.gsub!(/\[',/, "['',") # => [['','action1','',']','[action2]','[]]
That's getting there, but I have a feeling I'm starting to go down a rat-hole with this approach. Is there a better way to accomplish this?
Thanks for the help!
Upvotes: 3
Views: 2388
Reputation: 34120
Since I don't know Ruby, here is an example in Perl.
Since you only want a subset of YAML that appears to be similar to JSON, I used the JSON module.
I've been wanting an excuse to use Regexp::Grammars, so I used it to parse the data.
I guarantee it will work no matter how deeply the arrays are nested, because the Array rule in the grammar is recursive.
#! /usr/bin/env perl
use strict;
#use warnings;
use 5.010;
#use YAML;
use JSON;
use Regexp::Grammars;
my $str = '[[,action1,,],[action2],[],[,],[,[],]]';
my $parser = qr{
  <match=Array>
  <token: Text>
    [^,\[\]]*
  <token: Element>
    (?:
      <.Text>
    |
      <MATCH=Array>
    )
  <token: Array>
    \[
      (?:
        (?{ $MATCH = [qw'']; })
      |
        <[MATCH=Element]> ** (,)
      )
    \]
}x;
if( $str =~ $parser ){
  say to_json $/{match};
}else{
  die $@ if $@;
}
This outputs:
[["","action1","",""],["action2"],[],["",""],["",[],""]]
If you really want YAML, just uncomment "use YAML;" and replace to_json() with Dump(), which gives:
---
-
  - ''
  - action1
  - ''
  - ''
-
  - action2
- []
-
  - ''
  - ''
-
  - ''
  - []
  - ''
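For readers following along in Ruby, the same recursive grammar can be sketched with StringScanner from the standard library. This is a rough equivalent rather than a port of the Perl module: parse, parse_array and parse_element are hypothetical names chosen for illustration, and the sketch assumes fields never contain commas or brackets.

```ruby
require 'strscan'
require 'json'

# Array rule: "[" followed by either nothing or comma-separated elements, then "]".
def parse_array(scanner)
  scanner.scan(/\[/) or raise ArgumentError, "expected '[' at #{scanner.pos}"
  # "[]" is treated as an empty array, like the first alternative in the grammar
  return [] if scanner.scan(/\]/)
  elements = [parse_element(scanner)]
  elements << parse_element(scanner) while scanner.scan(/,/)
  scanner.scan(/\]/) or raise ArgumentError, "expected ']' at #{scanner.pos}"
  elements
end

# Element rule: a nested array, or a (possibly empty) run of plain text.
def parse_element(scanner)
  return parse_array(scanner) if scanner.check(/\[/)
  scanner.scan(/[^,\[\]]*/)
end

# Entry point: the whole string must be exactly one array.
def parse(str)
  scanner = StringScanner.new(str)
  result = parse_array(scanner)
  raise ArgumentError, "trailing input at #{scanner.pos}" unless scanner.eos?
  result
end

puts JSON.generate(parse('[[,action1,,],[action2],[],[,],[,[],]]'))
```

After require 'yaml', calling to_yaml on the parsed result instead of JSON.generate should give the YAML form.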
Upvotes: 3
Reputation: 75232
Try this:
s.gsub(/([\[,])(?=[,\]])/, "\\1''")
.gsub(/([\[,])(?=[^'\[])|([^\]'])(?=[,\]])/, "\\+'");
EDIT: I'm not sure about the replacement syntax. That's supposed to be group #1 in the first gsub, and the highest-numbered participating group -- $+ -- in the second.
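One way to check the two substitutions against the question's input is the sketch below. It sidesteps the uncertainty about the replacement syntax by using the block form of gsub and picking whichever group took part in the match ($1 || $2). That swap is an assumption made for this illustration, and quote_fields is a hypothetical helper name.

```ruby
# Hypothetical helper wrapping the two passes described above.
def quote_fields(s)
  # Pass 1: insert '' wherever "[" or "," is directly followed by "," or "]"
  filled = s.gsub(/([\[,])(?=[,\]])/, "\\1''")
  # Pass 2: add a quote after a "[" or "," that starts a bare field, or after
  # the last character of a bare field. Only one of the two groups matches
  # at a time, so $1 || $2 is the group that participated.
  filled.gsub(/([\[,])(?=[^'\[])|([^\]'])(?=[,\]])/) { "#{$1 || $2}'" }
end

puts quote_fields("[[,action1,,],[action2],[]]")
```

Because the first pass already fills every empty slot, the second pass only ever sees bare fields such as action1 and the quoted '' placeholders, which its character classes skip.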
Upvotes: 1
Reputation: 4814
This does the job for the empty fields (Ruby 1.9):
s.gsub(/(?<=[\[,])(?=[,\]])/, "''")
Or for Ruby 1.8, which doesn't support zero-width look-behind:
s.gsub(/([\[,])(?=[,\]])/, "\\1''")
Quoting non-empty fields can be done with one of these:
s.gsub(/(?<=[\[,])\b|\b(?=[,\]])/, "'")
s.gsub(/(\w+)/, "'\\1'")
In the above I'm making use of zero-width positive look-behind and zero-width positive look-ahead assertions (the '(?<=' and '(?=').
I've looked for some Ruby-specific documentation but could not find anything that explains these features in particular. Instead, please let me refer you to perlre.
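Putting the two steps together on the string from the question looks roughly like this. It is a minimal sketch assuming Ruby 1.9 or later and fields made up only of word characters; fill_empty and quote_words are hypothetical helper names.

```ruby
require 'yaml'

# Step 1: both assertions are zero-width, so the match is the empty gap
# between "[" or "," and a following "," or "]"; '' is inserted there.
def fill_empty(s)
  s.gsub(/(?<=[\[,])(?=[,\]])/, "''")
end

# Step 2: wrap every run of word characters in single quotes.
def quote_words(s)
  s.gsub(/(\w+)/, "'\\1'")
end

s = "[[,action1,,],[action2],[]]"
quoted = quote_words(fill_empty(s))
puts quoted
p YAML.load(quoted)
```

The order matters: filling the empty fields first means the second pass only has to deal with non-empty ones.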
Upvotes: 4