Reputation: 19
I have been struggling with this all day. I am trying to turn words into variables (by prefixing them with $), but only in the sections of a line enclosed in square brackets.
Lines look like this:
blah blah [ae b c] blah [zv y] blah
I need to make this:
blah blah [$ae $b $c] blah [$zv $y] blah
There must be an easy way to do this. However, whenever I try
$ echo "blah blah [ae b c] blah [zv y] blah" | sed 's/\[\(\b.*\b\)\]/$\1/g'
I get greedy matching and just one variable:
blah blah $ae b c] blah [zv y blah
Is there something better? Thanks,
Upvotes: 1
Views: 85
Reputation: 58351
This might work for you (GNU sed):
sed -r 'h;s/\</$/g;T;G;s/^/\n/;:a;s/\n[^[]*(\[[^]]*\])(.*\n)([^[]*)[^]]*\]/\3\1\n\2/;ta;s/\n(.*)\n(.*)/\2/' file
Make a copy of the current line in the hold space. Insert $ in front of every start-of-word boundary. If nothing was substituted, print the current line and bail out. Otherwise append the unmodified copy and insert a newline at the start of the modified line. Then, in a loop, use substitution and pattern matching to build the result from the bracketed [...] parts of the modified line and the matching unbracketed parts of the original copy, using the newline as a marker that moves forwards through the line. When all matches have been made, keep the remaining tail of the original line and remove the newlines.
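To see why the unmodified copy is needed, it can help to run only the "insert $ at every word start" step on its own. This is a minimal sketch, assuming GNU sed (\< is a GNU extension); the function name mark_all_words is made up for illustration and is not part of the answer's command.

```shell
# Hypothetical helper: run only the "insert $ at every word start" step.
# Assumes GNU sed, because \< (start of word) is a GNU extension.
mark_all_words() {
    sed 's/\</$/g'
}

# Words outside the brackets get a $ as well, which is why the rest of
# the script restores those parts from the untouched copy.
echo 'blah [ae b] blah' | mark_all_words
# prints: $blah [$ae $b] $blah
```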
Upvotes: 0
Reputation: 10039
sed 's/\[\([^]]*\)\]/[ \1]/g
:loop
s/\(\(\[[^]$]*\)\([[:blank:]]\)\)\([^][:blank:]$][^]]*\]\)/\1\$\4/g
t loop
s/\[ \([^]]*\)\]/[\1]/g' YourFile
[a b[c] d ]
Adds $ in front of the last word between brackets that does not already have one (that is, not already starting with $). This is done for each bracket group in the line, but only one $ is added per group on each pass of the loop.
Upvotes: 0
Reputation: 113814
$ echo "blah blah [ae b c] blah [zv y] blah" | sed -r ':b; s/([[][^]$]* )([[:alnum:]]+)/\1$\2/g; t b; s/[[]([[:alnum:]])/[$\1/g'
blah blah [$ae $b $c] blah [$zv $y] blah
-r
This turns on extended regex.
:b
This creates a label b.
s/([[][^]$]* )([[:alnum:]]+)/\1$\2/g
This looks for [, followed by any characters except ] or $, followed by a space, followed by one or more alphanumeric characters. It puts a $ in front of the alphanumeric characters.
Note the awk-style convention that makes [[] match [, while [^]$] matches anything except ] and $. This is more portable than attempting to escape these characters with backslashes.
t b
If the command above resulted in a substitution, this branches back to label b so that the substitution is attempted again.
s/[[]([[:alnum:]])/[$\1/g
The last step is to look for [ followed by an alphanumeric character and put a $ between them.
Because [[:alnum:]] is used, this code is Unicode-safe in a UTF-8 locale.
BSD sed (OSX) limits the ability to combine statements with semicolons. Try this instead:
sed -E -e ':b' -e 's/([[][^]$]* )([[:alnum:]]+)/\1$\2/g' -e 't b' -e 's/[[]([[:alnum:]])/[$\1/g'
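For repeated use, the -e form above can be wrapped in a shell function. This is only a sketch: the name add_sigils is hypothetical, and it assumes a sed that accepts -E, which current GNU and BSD versions do.

```shell
# Hypothetical wrapper around the -e form above; reads stdin, writes stdout.
add_sigils() {
    sed -E -e ':b' \
        -e 's/([[][^]$]* )([[:alnum:]]+)/\1$\2/g' \
        -e 't b' \
        -e 's/[[]([[:alnum:]])/[$\1/g'
}

echo 'blah blah [ae b c] blah [zv y] blah' | add_sigils
# prints: blah blah [$ae $b $c] blah [$zv $y] blah

# A line without brackets passes through unchanged.
echo 'no brackets here' | add_sigils
# prints: no brackets here
```

Note that the loop relies on the first word in a group not yet having a $, so a group that is already partly prefixed, such as [$a b], is left as it is.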
Upvotes: 2
Reputation: 36252
It's difficult to solve this using sed. As an alternative, you can use perl with the help of the Text::Balanced module, which extracts text between balanced delimiters, such as square brackets. In list context, each call returns the bracketed text, the remainder of the string after it and the prefix skipped before it, so you can apply the regex that inserts the $ sign to the relevant part of the string only.
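perl -MText::Balanced=extract_bracketed -lne '
    my $result = q||;
    while (1) {
        # Returns (bracketed text, remainder, skipped prefix).
        my @part = extract_bracketed($_, q{[]}, q{[^[]*});
        if (! defined $part[0]) {
            # No more bracket groups: keep the rest unchanged.
            $result .= $part[1];
            last;
        }
        # Put a $ after the opening bracket and after each run of whitespace.
        $part[0] =~ s/(\[|\s+)/$1\$/g;
        $result .= $part[2] . $part[0];
        $_ = $part[1];
    }
    # Print every input line (a "last" inside do { } while would leave the -n loop).
    print $result;
' infile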
It yields:
blah blah [$ae $b $c] blah [$zv $y] blah
Upvotes: 0
Reputation: 9609
To stop the match from being greedy, match any character except a closing bracket instead of any character:
sed 's/\[\(\b[^]]*\b\)\]/$\1/g'
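Note that this only fixes the greediness: the replacement still drops the brackets and adds a single $ per group. A single substitution cannot do the whole task, because context-sensitive matching cannot be described by a regular grammar; sed needs a loop for it, or you need another tool.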
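A small sketch of the negated-class idea, written without \b so that it does not rely on GNU extensions, and with the brackets kept in the replacement. The function name mark_first_word is made up for illustration. Each group is now matched separately, but one substitution still prefixes only the first word of each group:

```shell
# Hypothetical helper: non-greedy matching via a negated class,
# with the brackets put back in the replacement.
mark_first_word() {
    sed 's/\[\([^]]*\)\]/[$\1]/g'
}

echo 'blah blah [ae b c] blah [zv y] blah' | mark_first_word
# prints: blah blah [$ae b c] blah [$zv y] blah
```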
Upvotes: 0