Rodriguez J Mathew

Reputation: 107

How to remove consecutive repeating characters from every line?

I have the following lines in a file:

Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;;;;
Acanthocephala;;;;;;;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Polymorphus;;

I want to collapse each run of repeated semicolons into a single semicolon, so that the lines look like this (note that some of the lines above have repeated semicolons in the middle, not only at the end):

Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;
Acanthocephala;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Polymorphus;

I would appreciate it if someone could share a bash one-liner to accomplish this.

Upvotes: 0

Views: 84

Answers (5)

wjandrea

Reputation: 32997

Here's a sed version of alaniwi's answer:

sed 's/;\+/;/g' myfile  # Write output to stdout

or

sed -i 's/;\+/;/g' myfile  # Edit the file in-place
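
Note that \+ is a GNU extension to basic regular expressions and may not work in other sed implementations. As a minimal sketch of a portable alternative (not part of the original answer), ;;* matches one or more semicolons in any POSIX sed. The sample string below is made up for illustration:

```shell
# Portable form: ';;*' means one ';' followed by zero or more ';'.
input='a;;b;;;c;'   # hypothetical sample line
result=$(printf '%s\n' "$input" | sed 's/;;*/;/g')
printf '%s\n' "$result"   # prints a;b;c;
```

The in-place flag also differs between implementations: GNU sed accepts -i on its own, while BSD sed expects a backup suffix argument after it (which may be an empty string).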

Upvotes: 0

Benjamin W.

Reputation: 52162

You can use tr with "squeeze":

tr -s ';' < infile
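
A quick check against two of the sample lines from the question, as a minimal sketch (the variable names are placeholders chosen for this example):

```shell
# Feed two of the question's sample lines through tr's squeeze option.
input='Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Profilicollis;Profilicollis_altmani;
Acanthocephala;;;;;;;'
result=$(printf '%s\n' "$input" | tr -s ';')
printf '%s\n' "$result"
# prints:
# Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Profilicollis;Profilicollis_altmani;
# Acanthocephala;
```

tr reads only from standard input, which is why the file is supplied with a redirect rather than as an argument.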

Upvotes: 2

Kent

Reputation: 195089

This can be solved easily with a substitution. Here is an awk solution that uses the FS/OFS variables instead:

awk -F';+' -v OFS=';' '$1=$1' file

or, so that lines whose first field is empty or 0 are still printed:

awk -F';+' -v OFS=';' '($1=$1)||1' file
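
To see the difference between the two forms, here is a small sketch using a made-up line that starts with semicolons, so its first field is empty. In the short form the assignment $1=$1 doubles as the pattern, and an empty string counts as false, so the line is not printed at all:

```shell
line=';;a;;b'   # hypothetical input whose first field is empty

# Short form: the pattern evaluates to "" (false), so nothing is printed.
short=$(printf '%s\n' "$line" | awk -F';+' -v OFS=';' '$1=$1')

# Safe form: '||1' forces the pattern to be true after the record is rebuilt.
safe=$(printf '%s\n' "$line" | awk -F';+' -v OFS=';' '($1=$1)||1')

printf 'short form: [%s]\n' "$short"   # prints short form: []
printf 'safe form:  [%s]\n' "$safe"    # prints safe form:  [;a;b]
```

For the sample data in the question every line begins with a non-empty name, so both forms give the same output there.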

Upvotes: 0

alani

Reputation: 13079

perl -p -e 's/;+/;/g' myfile   # writes output to stdout

or

perl -p -i -e 's/;+/;/g' myfile   # does an in-place edit

Upvotes: 2

Shawn

Reputation: 52439

If you want to edit the file itself:

printf "%s\n" 'g/;;/s/;\{2,\}/;/g' w | ed -s foo.txt

If you want to pipe a modified copy of the file to something else and leave the original unchanged:

sed 's/;\{2,\}/;/g' foo.txt | whatever

These replace runs of 2 or more semicolons with single ones.

Upvotes: 0
