Vijayraj S

Reputation: 81

Using SED/AWK to replace letters after a certain position

I have a file with words (1 word per line). I need to censor all letters in the word, except the first five, with a *.

Example:

Authority -> Autho****

I'm not very sure how to do this.

Upvotes: 2

Views: 3757

Answers (7)

ghoti

Reputation: 46876

Here's a portable solution for sed:

$ echo abcdefghi | sed -e 's/\(.\{5\}\)./\1*/;:x' -e 's/\*[a-z]/**/;t x'
abcde****

Here's how it works:

  • 's/\(.\{5\}\)./\1*/' - preserve the first five characters, replacing the 6th with an asterisk.
  • ':x' - set a "label", which we can branch back to later.
  • 's/\*[a-z]/**/' - substitute the letter following an asterisk with an asterisk.
  • 't x' - if the last substitution succeeded, jump back to the label "x".

This works equally well in GNU and BSD sed.

Of course, adjust the regexes to suit.
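
As one example of such an adjustment: with [a-z], the loop stops at the first character after position five that is not a lowercase letter. A minimal sketch, assuming every non-asterisk character should be censored, swaps in [^*] instead. The function name censor_tail is made up for illustration.

```shell
# censor_tail is a hypothetical helper name, not part of the answer above.
# Keeps the first five characters and turns every later character into *.
censor_tail() {
  sed -e 's/\(.\{5\}\)./\1*/;:x' -e 's/\*[^*]/**/;t x' "$@"
}

printf '%s\n' Authority ABC-12345 cat | censor_tail
# Autho****
# ABC-1****
# cat
```

A plain . would not work in the second expression: \*. also matches two adjacent asterisks, so the t branch would never stop. Note too that input which already has a * among its first five characters would have the following character censored as well.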

Upvotes: 3

Beta

Reputation: 99144

Here is a pretty straightforward sed solution (that does not require GNU sed):

sed -e :a -e 's/^\(.....\**\)[^*]/\1*/;ta' filename

Upvotes: 0

tripleee

Reputation: 189689

If you are lucky, all you need is

sed 's/./*/6g' file

When I originally posted this, I believed this to be reasonably portable; but as per @ghoti's comment, it is not.
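
A hedged sketch of how a script might cope with that: probe whether the local sed accepts the 6g flag combination, and fall back to a loop-based command otherwise. The function name censor and the probe string are made up for illustration; the fallback is the loop-based command from Beta's answer.

```shell
# censor is a hypothetical wrapper name, used here only for illustration.
censor() {
  # Probe: a sed that supports a number combined with g turns abcdefg into abcde**.
  if [ "$(printf 'abcdefg\n' | sed 's/./*/6g' 2>/dev/null)" = 'abcde**' ]; then
    sed 's/./*/6g' "$@"
  else
    # Fallback: loop that replaces one more character per pass.
    sed -e :a -e 's/^\(.....\**\)[^*]/\1*/;ta' "$@"
  fi
}

printf '%s\n' Authority cat | censor
# Autho****
# cat
```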

Upvotes: 4

Ed Morton

Reputation: 204164

Personally I'd just use sed for this (see @tripleee's answer) but if you want to do it in awk it'd be:

$ awk '{t=substr($0,1,5); gsub(/./,"*"); print t substr($0,6)}' file
Autho****

or with GNU awk for gensub():

$ awk '{print substr($0,1,5) gensub(/./,"*","g",substr($0,6))}' file
Autho****
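
Reading the first command: it saves the first five characters in t, turns every character of the line into * with gsub, and prints t followed by the starred tail. Here is a small sketch of the same idea with the prefix length passed in as a variable. The function name keep_prefix and the variable names are assumptions for illustration.

```shell
# keep_prefix is a hypothetical helper: keep_prefix N [file...]
keep_prefix() {
  prefix_len=$1   # number of leading characters to keep
  shift           # remaining arguments, if any, are input files
  awk -v n="$prefix_len" '{ t = substr($0, 1, n); gsub(/./, "*"); print t substr($0, n + 1) }' "$@"
}

printf '%s\n' Authority cat | keep_prefix 5
# Autho****
# cat
printf '%s\n' Authority | keep_prefix 3
# Aut******
```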

Upvotes: 1

Allan

Reputation: 12448

It is also possible and quite straightforward with sed:

sed 's/./\*/6;:loop;s/\*[^\*]/\**/;/\*[^\*]/b loop' file_to_censor.txt 

explanation:

s/./\*/6           # replace the 6th character of the line with *
:loop              # define a label to branch back to
s/\*[^\*]/\**/     # replace a * followed by a non-* character with **
/\*[^\*]/b loop    # loop while a * followed by a non-* character remains

Upvotes: 0

RavinderSingh13

Reputation: 133650

The following awk solutions may help.

Solution 1: awk with substr and gensub (gensub requires GNU awk).

awk '{print substr($0,1,5) gensub(/./,"*","g",substr($0,6))}'  Input_file

Solution 2: a plain awk loop. Lines of five characters or fewer are printed unchanged.

awk 'NF{len=length($0);if(len>5){i=6;while(i<=len){val=val?val "*":"*";i++};print substr($0,1,5) val}else{print};val=i=""}'  Input_file
Autho****

EDIT: Adding a multi-line form of the solution, with explanation.

awk '
NF{                           ## Process only non-empty lines.
  len=length($0)              ## Store the length of the current line in len.
  if(len>5){                  ## Lines longer than 5 characters need censoring.
    i=6                       ## Start at the 6th character.
    while(i<=len){            ## Loop from the 6th character to the end of the line.
      val=val?val "*":"*"     ## Append one * to val on each iteration.
      i++                     ## Move on to the next character.
    }
    print substr($0,1,5) val  ## Print the first 5 characters followed by the asterisks.
  } else {                    ## Lines of 5 characters or fewer:
    print                     ## print them unchanged.
  }
  val=i=""                    ## Reset val and i before the next line.
}
' Input_file                  ## Input_file is the file to process.

Upvotes: 2

choroba

Reputation: 241998

Perl to the rescue:

perl -pe 'substr($_, 5) =~ s/./*/g' -- file
  • -p reads the input line by line and prints each line after processing
  • substr returns a substring of the given string starting at the given position. Used on the left of =~, it lets the substitution modify that part of the original line in place.
  • s/./*/g replaces any character with an asterisk. The g means the substitution will happen as many times as possible, not just once, so all the characters will be replaced.

In some versions of sed, you can specify which substitution should happen by appending a number to the operation:

sed -e 's/./*/g6'

This will replace all (again, because of g) characters, starting from the 6th position.

Upvotes: 3
