Arild Noven

Reputation: 15

How to match a string not followed by a word using sed

I need to delete every occurrence of a hyphen followed by a space, but only when the space is not followed by the word "og". Example file:

Kultur- og idrettsavdelinga skapar nyska- pande kunst og utvik- lar samfunnet

I tried a negative lookahead:

sed -e 's/- (?!og)//g'

but it doesn't work. What I want is something like this:

Kultur- og idrettsavdelinga skapar nyskapande kunst og utviklar samfunnet

Any ideas?

Upvotes: 1

Views: 1997

Answers (4)

potong

Reputation: 58430

This might work for you (GNU sed):

sed -r 's/(- (og|eller))|- /\1/g' file

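This relies on alternation: when "- og" or "- eller" matches, group 1 captures it and \1 puts it back unchanged. When only the bare "- " matches, group 1 is empty, so \1 expands to nothing and the hyphen and space are deleted.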
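A quick way to check this against the sample line from the question. This is only a sketch: it assumes GNU sed, uses -E as the equivalent spelling of -r, and the variable names are just for illustration.

```shell
# Sample line from the question.
input='Kultur- og idrettsavdelinga skapar nyska- pande kunst og utvik- lar samfunnet'

# "- og" and "- eller" are captured and written back via \1;
# a bare "- " leaves group 1 empty, so it is replaced by nothing.
result=$(printf '%s\n' "$input" | sed -E 's/(- (og|eller))|- /\1/g')

printf '%s\n' "$result"
```

If the output still contains "nyska- pande", the sed in use probably does not treat an unmatched group as an empty backreference.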

Upvotes: 1

Ed Morton

Reputation: 203665

Given this input file (I added a few "- eller" occurrences, since you said in a comment that you need to handle those too):

$ cat file
Kultur- og idrettsavdelinga skapar- eller nyska- pande kunst og utvik- lar- eller samfunnet

here's the common, idiomatic sed approach:

$ sed 's/a/aA/g; s/- og/aB/g; s/- eller/aC/g; s/- //g; s/aC/- eller/g; s/aB/- og/g; s/aA/a/g' file
Kultur- og idrettsavdelinga skapar- eller nyskapande kunst og utviklar- eller samfunnet

The above works by first turning every a (or whatever other character you like that's not in your target strings) into aA. After that, every a in the line is followed by an A, so we can safely encode the strings we want to keep, - og and - eller, as a<some other character>, e.g. aB and aC: the only occurrences of aB and aC in the line are the ones we just created.

Now we can remove every remaining "- " (hyphen plus space), then convert each aC back to - eller and each aB back to - og, and finally turn every aA back into the original a.
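To see the three phases separately, the single command can be split up as in the sketch below. The shortened input line and the variable names step1, step2 and step3 are made up for illustration; the sed expressions are the same ones as above.

```shell
input='Kultur- og skapar- eller nyska- pande'

# Phase 1: escape every "a" as "aA", then encode the protected strings.
step1=$(printf '%s\n' "$input" | sed 's/a/aA/g; s/- og/aB/g; s/- eller/aC/g')

# Phase 2: delete every remaining hyphen-plus-space.
step2=$(printf '%s\n' "$step1" | sed 's/- //g')

# Phase 3: decode in reverse order, restoring the original "a" last.
step3=$(printf '%s\n' "$step2" | sed 's/aC/- eller/g; s/aB/- og/g; s/aA/a/g')

# Print the line after each phase.
printf '%s\n' "$step1" "$step2" "$step3"
```

The decoding order matters: aA has to be restored last, otherwise the a in aB and aC would be rewritten before those markers are decoded.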

Upvotes: 1

Jedi

Reputation: 3358

You can also use a sed chain: first replace - og with a nonsensical placeholder that cannot occur in the text (like booogabooga), then delete the remaining "- " sequences, then turn the placeholder back into - og.

sed -e 's/- og/booogabooga/g; s/- //g; s/booogabooga/- og/g'

Some versions of sed may need:

sed -e 's/- og/booogabooga/g' -e 's/- //g' -e 's/booogabooga/- og/g'

This can be slower and more painful, especially if you have multiple replacements as @Kusalananda suggests, but it is easier to understand.
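If you are worried that the placeholder might occur in real text, one variation is to use a newline as the placeholder, since sed reads input line by line and the pattern space normally never contains one. This is a sketch that assumes GNU sed, where \n is accepted in both the pattern and the replacement; it protects only "- og".

```shell
input='Kultur- og idrettsavdelinga skapar nyska- pande kunst og utvik- lar samfunnet'

# 1. Hide "- og" behind a newline, which cannot already be in the line.
# 2. Delete the remaining hyphen-plus-space sequences.
# 3. Turn each newline back into "- og".
result=$(printf '%s\n' "$input" | sed 's/- og/\n/g; s/- //g; s/\n/- og/g')

printf '%s\n' "$result"
```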

Upvotes: 1

Casimir et Hippolyte

Reputation: 89557

sed doesn't support lookahead, but you can spell out every continuation that is not "og" instead:

sed -e 's/\(- \(- \)*\)\([^o]\|$\|o\([^g]\|$\)\)/\3/g'

You can test it with: - - - - og - - oa - o => - og oa o
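The test can be run as below, together with the sample line from the question. This is a sketch assuming GNU sed, because \| is a GNU extension to basic regular expressions; the variable names are only for illustration.

```shell
# Same expression as above, kept in a variable so it can be reused.
expr='s/\(- \(- \)*\)\([^o]\|$\|o\([^g]\|$\)\)/\3/g'

# The stress test from this answer.
test_out=$(printf '%s\n' '- - - - og - - oa - o' | sed -e "$expr")

# The sample line from the question.
sample_out=$(printf '%s\n' 'Kultur- og idrettsavdelinga skapar nyska- pande kunst og utvik- lar samfunnet' | sed -e "$expr")

printf '%s\n' "$test_out" "$sample_out"
```

Note that the expression only looks at the two characters after the space, so a word that merely starts with "og" is protected as well.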

Upvotes: 1
