Craig Francis

Reputation: 1935

RegEx, Substituting a variable number of replacements

Hopefully I'm missing something obvious.

I've got a file that contains some lines like:

| A | B | C |
|-----------|
Ignore this line
| And | Ignore | This |
| D | E | F | G |
|---------------|

I want to find the |----| lines, remove those... and replace all of the | characters with a ^ in the preceding line. e.g.

^ A ^ B ^ C ^
Ignore this line
| And | Ignore | This |
^ D ^ E ^ F ^ G ^

So far I've got:

perl -0pe 's/^(\|.*\|)\n\|-+\|/$1/mg'

This takes input from stdin (some other modifications have already happened with sed)... and it's using -0 and /m to support multiline replacements.

The match seems to be correct, and it removes the |----| lines, but I can't see how I can do the | to ^ substitution with the $1 (or \1) backreference.

I can't remember where I did it before, but another language allowed me to use ${1/A/B} to substitute A to B, but that's upsetting perl.

And I've been wondering if this is where /e or /ee could be used, but I'm not familiar enough with Perl to know how to do that.

Upvotes: 5

Views: 142

Answers (3)

MonkeyZeus

Reputation: 20737

I could see this being done using 2 substitutions. First, replace each | with ^ on any line that is directly followed by a |----| line:

\|(?=.*[\r\n]+\|-+\|$)

https://regex101.com/r/x7d15d/1/

And then remove the |----| lines themselves by replacing this with an empty string:

^\|-+\|(?:[\r\n]+|$)

https://regex101.com/r/ZdEzuM/1/

Upvotes: 2

Casimir et Hippolyte

Reputation: 89547

With one pattern that checks the next line in a lookahead assertion:

perl -0pe 's/\|(?=.*\R\|-+\|$)(?:\R.*)?/^/gm' file

If you absolutely want to use an evaluation, you can put a transliteration in the replacement part with this pattern:

perl -0pe 's#^(.*)\R\|-+\|$#$1=~y/|/^/r#gme' file

Upvotes: 1

Wiktor Stribiżew

Reputation: 626738

You can use

perl -0pe 's{^(.*\R)\|-+\|$\R?}{$1 =~ s,\|,^,gr}gme' t

Details:

  • ^(.*\R)\|-+\|$\R? - matches all occurrences (see the g flag at the end)
    • ^ - start of a line (note the m flag that makes ^ match start of a line and $ match end of a line)
    • (.*\R) - Group 1: the whole line together with its line break sequence (capturing the line break keeps it in the output, so the following line is not joined onto this one)
    • \| - | char
    • -+ - one or more - chars
    • \| - a | char
    • $ - end of line
    • \R? - an optional line break sequence.

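Once a match is found, $1 =~ s,\|,^,gr replaces every | with ^ in the Group 1 value. The e flag makes Perl evaluate the replacement as code rather than treat it as a string, and the r flag on the inner substitution (Perl 5.14+) returns the modified copy instead of trying to change the read-only $1 in place.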

Upvotes: 7
