Dave Jarvis

Reputation: 31191

Combine regex patterns to match groups inside delimited strings

Background

I want to replace periods with dollar signs, but only within text that is delimited by dollar signs (a delimited span never crosses a line break). For example:

Names: $annie.bettie.cindy.dannie.ellie$. Only $a$ names. $a.b.c.d.e.f$.

Problem

The following regex almost works, but is too simple:

/([[:alnum:]])\.([[:alnum:]])/g

It also matches periods outside the delimiters ($), so too much is replaced.
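To see the over-replacement concretely, here is a minimal sketch. The function name naive_replace and the sample line are made up for illustration, and `\1$\2` is assumed as the replacement for the pattern above.

```shell
# Hypothetical helper: apply the simple pattern to every line of stdin.
naive_replace() {
    sed -E 's/([[:alnum:]])\.([[:alnum:]])/\1$\2/g'
}

# The period in "file.txt" lies outside the $ delimiters but is replaced too.
printf '%s\n' 'See file.txt for $annie.bettie$.' | naive_replace
# prints: See file$txt for $annie$bettie$.
```

The same pattern also skips every other period between single-character names such as a.b.c, because each match consumes the character that the next match would need on its left.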

The following regex:

/\$.*?\$/g

matches each delimited string, delimiters included:

Names: $annie.bettie.cindy.dannie.ellie$. Only $a$ names. $a.b.c.d.e.f$.

Question

How do I combine the two regular expressions so that the periods can be replaced with another string? For example:

Names: $annie.bettie.cindy.dannie.ellie$. Only $a$ names. $a.b.c.d.e.f$.

should ultimately become:

Names: `r v$annie$bettie$cindy$dannie$ellie`. Only `r v$a` names. `r v$a$b$c$d$e$f`.

The trouble I'm having is matching the delimited dots.

The regular expression will be piped into sed from a terminal running bash.

Upvotes: 1

Views: 206

Answers (2)

Sundeep

Reputation: 23677

$ cat ip.txt 
Names: $annie.bettie.cindy.dannie.ellie$. Only $a$ names. $a.b.c.d.e.f$.

$ perl -pe '
BEGIN
{
    sub f
    {
        my $s = $_[0] =~ tr/./$/r;
        $s =~ s/^/`r v/;
        $s =~ s/.$/`/;
        return $s;
    }
}
s/\$.*?\$/f($&)/ge
' ip.txt
Names: `r v$annie$bettie$cindy$dannie$ellie`. Only `r v$a` names. `r v$a$b$c$d$e$f`.
  • The subroutine f performs the transformation for each $sometext$ string: it first transliterates . to $, then prepends `r v, and finally replaces the last character (the closing $) with a backtick
    • The subroutine is defined in a BEGIN block, which runs once before the input file is processed line by line
  • s/\$.*?\$/f($&)/ge matches each $sometext$ span and passes it to f. The e flag makes Perl evaluate the replacement as code, so f is actually called; the g flag applies this to every match on the line
  • The -p switch prints each input line after all commands have run
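The three steps that f performs can also be traced on a single delimited string in plain shell. This is only an illustrative sketch, not part of the Perl solution: the function name transform_group is hypothetical, and it handles one $sometext$ string rather than a whole line.

```shell
# Hypothetical helper mirroring the three steps of f for one $sometext$ string.
transform_group() {
    group=$1
    group=$(printf '%s' "$group" | tr '.' '$')   # 1. transliterate . to $
    group='`r v'"$group"                         # 2. prepend the prefix
    group=${group%?}'`'                          # 3. swap the last character for a backtick
    printf '%s\n' "$group"
}

transform_group '$a.b.c$'
# prints: `r v$a$b$c`
```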

Upvotes: 1

potong

Reputation: 58473

This might work for you (GNU sed):

sed -r ':a;s/^(([^$]*\$[^$.]*\$)*[^$]*\$[^$.]*)\./\1\n/;ta;s/(\$[^$]*)\$/`r v\1`/g;y/\n/$/' file

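Replace every period within a group with a newline, one per pass of the :a loop. Then wrap each group in its prefix and suffix literals, and finally translate the newlines to dollar signs. A newline is a safe placeholder because sed reads input line by line, so the pattern space does not otherwise contain one.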
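Spread over several lines, the same script can be read step by step. This is a sketch under a few assumptions: the wrapper name convert is hypothetical, -E is used as the equivalent spelling of -r, and GNU sed is assumed (for \n in the replacement and for comments inside the script).

```shell
# Hypothetical wrapper around the script above; assumes GNU sed.
convert() {
    sed -E '
        # Loop: find the first period inside a $...$ group and turn it
        # into a newline; "ta" repeats while a substitution was made.
        :a
        s/^(([^$]*\$[^$.]*\$)*[^$]*\$[^$.]*)\./\1\n/
        ta
        # Wrap each group: prefix before the opening $, backtick in
        # place of the closing $.
        s/(\$[^$]*)\$/`r v\1`/g
        # Turn the placeholder newlines into dollar signs.
        y/\n/$/
    '
}

printf '%s\n' 'Only $a$ names. $a.b.c$.' | convert
# prints: Only `r v$a` names. `r v$a$b$c`.
```

The leading ([^$]*\$[^$.]*\$)* consumes complete groups counted from the start of the line, so the $ that follows is always an opening delimiter and periods between groups are never touched.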

Upvotes: 1
