tquid

Reputation: 321

Regex to capture a group of delimited words that must end with a specific word

I'm normalizing a bunch of Ansible group names, which have to change to use underscores instead of hyphens (thanks, Ansible). However, there's tons of other stuff in the file that is hyphenated, so I want to leave those lines alone. The ones I want to change always end with -servers. So, with a small sample, we might have:

foo-bar
foo-bar-servers
foo-bar-baz-servers

(\w)-(\w?)? very nicely captures things so I can just sub to $1_$2 to change the hyphens to underscores. However, as soon as I add -servers or ervers on the end, it grabs only the very last pair around the hyphen. I have tried many variations, read up a little on lookaheads, and I am thoroughly stumped. It seems like it ought to be simple. What is the magic incantation to match all the groups around the hyphens, for lines ending in -servers? Many thanks in advance.

Edit: desired results, with apologies:

foo-bar
foo_bar_servers
foo_bar_baz_servers

Upvotes: 0

Views: 51

Answers (2)

user3408541

Reputation: 71

Here is a solution in Perl. First check if -servers is at the end of the line, and if so do a search and replace to change all hyphens to underscores.

Here is the code...

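#!/usr/bin/perl
use strict;
use warnings;

# Read each line from the files named on the command line (or from STDIN).
while (<>) {
    # Only rewrite lines that end in "-servers"; leave all others untouched.
    s/-/_/g if /-servers$/;
    print;
}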

Output looks like this...

$ perl replace.hyphens.with.underscores.pl replace.hyphens.with.underscores.txt 
foo-bar
foo_bar_servers
foo_bar_baz_servers

Golfed into a one-liner

$ perl -pe 's/-/_/g if(/-servers$/)' replace.hyphens.with.underscores.txt

Upvotes: 0

Cary Swoveland

Reputation: 110735

As long as your regex engine supports positive lookaheads and fixed-length positive lookbehinds (most do, including PCRE, as used by PHP, and Python), you can use the following regular expression to match the hyphens you want, which can then be replaced with underscores.

(?<=\w)-(?=(?:\w+-)*servers$)

The regex engine performs the following operations.

(?<=\w)       assert that the preceding char is a word char (positive lookbehind)
-             match a hyphen
(?=           begin a positive lookahead
  (?:\w+-)    match 1+ word chars then '-', in a non-capture group
  *           execute non-capture group 0+ times
  servers     match the literal string 'servers'
  $           match end of line
)             end positive lookahead
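
For anyone wanting to try the pattern above, here is a minimal sketch in Python (one of the engines mentioned). The function name normalize_groups and the sample text are my own, for illustration only. It assumes the whole file is processed as one string, which is why re.MULTILINE is set: without it, $ would only match at the very end of the input.

```python
import re

# Pattern from the answer above. re.MULTILINE makes `$` match at the end
# of every line, not just at the end of the whole string.
HYPHEN_BEFORE_SERVERS = re.compile(r"(?<=\w)-(?=(?:\w+-)*servers$)", re.MULTILINE)


def normalize_groups(text: str) -> str:
    """Replace hyphens with underscores, but only in names ending in -servers.

    (Hypothetical helper name, not part of the original answer.)
    """
    return HYPHEN_BEFORE_SERVERS.sub("_", text)


if __name__ == "__main__":
    sample = "foo-bar\nfoo-bar-servers\nfoo-bar-baz-servers\n"
    print(normalize_groups(sample), end="")
```

Because the lookahead is checked against the original text for each hyphen, every hyphen in a qualifying name is matched in a single pass, and no capture groups are needed.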

Upvotes: 2
