The Humble Rat

Reputation: 4696

Regex remove middle portion of the string with look ahead

I am trying to use sed to replace a few thousand strings I have.

I have strings like ('app','model.whatever.id') or ('app','model.whatever.whateveragain.status') or ('app','model.whatever.type').

I need to rewrite every instance of these strings like so:

('app','model.id')
('app','model.status')
('app','model.type')

A few notes: I only need to match strings that start with model. or whatevermodel.; the middle can consist of one or more chunks; and I need to retain the final piece of information, i.e. id, status, etc.

The code I currently have is:

find /var/www/html/test2 -type f -print0 | xargs -0 sed -i '/.*model\..*\./{s//model./g}' 

This seems to work for most examples, but in a case like ('app','model.whatever.type'). the final full stop outside the parentheses causes a problem: the closing parenthesis is removed. (I have instances where that full stop occurs 350 characters later, so large chunks of the lines are being removed.)

Forgive me, as regex is not my strong suit. I have attempted the following, but I am not getting the desired result. It was meant to match the last occurrence of a full stop before a closing parenthesis.

find /var/www/html/test2 -type f -print0 | xargs -0 sed -i '/model\..*(?:(?!^.:[ ])[\s\S])*\\)/{s//model./g}'

Can anyone point me in the right direction? I feel I'm only a few tweaks away from what I need.

Upvotes: 0

Views: 1320

Answers (1)

choroba

Reputation: 241978

I don't know of any sed implementation that supports look-around assertions.

But it seems you don't need them. I'm getting the expected output with a much simpler regex:

sed -e 's/model\.[^'\'']*\./model./'

or

sed -e "s/model\.[^']*\./model./"
sed -E 's/(model\.)[^'\'']*\./\1/'
sed -E "s/(model\.)[^']*\./\1/"
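
As a quick sanity check, here is a minimal sketch that pipes sample strings from the question through the first (single-quoted) variant. The function name strip_middle and the trailing sentences in the last sample are made up for illustration; they are not from the question.

```shell
# strip_middle is a hypothetical wrapper name, not part of sed.
# It applies the single-quoted command from above to standard input.
strip_middle() {
  sed -e 's/model\.[^'\'']*\./model./'
}

# Two samples from the question, plus a whatevermodel. case with
# invented text after the closing parenthesis.
printf '%s\n' \
  "('app','model.whatever.id')" \
  "('app','model.whatever.whateveragain.status')" \
  "('app','whatevermodel.whatever.type'). Some later sentence. And another." |
  strip_middle
```

This should print ('app','model.id'), ('app','model.status') and ('app','whatevermodel.type'). Some later sentence. And another. Because [^']* cannot cross the closing quote, the text after the parenthesis is left alone. Note that without the g flag only the first match on each line is rewritten.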

Explanation of the tricky part:

  • [ starts a character class.
  • ^ negates the class.
  • ' ends the single quoted string.
  • \' the literal quote. The shell will remove the backslash.
  • ' starts the quoted string again.
  • ] closes the class.
  • * Zero or more times.

So, it's there only to solve shell quoting. What sed gets is the same as with the double-quoted command.
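
To see the quoting point for yourself, a small sketch like the following stores both spellings in shell variables and compares them. The variable names single and double are just illustrative.

```shell
# The same sed script written with the two quoting styles.
single='s/model\.[^'\'']*\./model./'
double="s/model\.[^']*\./model./"

# Print what the shell actually hands over in each case.
printf '%s\n' "$single" "$double"

# Compare the two strings byte for byte.
if [ "$single" = "$double" ]; then
  echo "identical"
else
  echo "different"
fi
```

Both printf lines should show s/model\.[^']*\./model./ followed by "identical": inside double quotes the shell keeps a backslash that precedes a dot, so sed receives the same script either way.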

Upvotes: 2
