Reputation: 81
I have a file with words (1 word per line). I need to censor all letters in the word, except the first five, with a *.
Example:
Authority -> Autho****
I'm not very sure how to do this.
Upvotes: 2
Views: 3757
Reputation: 46876
Here's a portable solution for sed:
$ echo abcdefghi | sed -e 's/\(.\{5\}\)./\1*/;:x' -e 's/\*[a-z]/**/;t x'
abcde****
Here's how it works:
's/\(.\{5\}\)./\1*/' - preserve the first five characters, replacing the 6th with an asterisk.
':x' - set a "label", which we can branch back to later.
's/\*[a-z]/**/' - substitute the letter following an asterisk with an asterisk.
't x' - if the last substitution succeeded, jump back to the label "x".
This works equally well in GNU and BSD sed.
Of course, adjust the regexes to suit.
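As one way to adjust the regexes, the sketch below swaps [a-z] for [^*] so that uppercase letters, digits and punctuation after the fifth character are masked as well. The function name censor is hypothetical and the sample words are made up for illustration.

```shell
# censor: hypothetical helper that reads one word per line on stdin and
# masks everything after the first five characters. Same structure as the
# command above, with [^*] in place of [a-z] (an assumed adjustment).
censor() {
  sed -e 's/\(.\{5\}\)./\1*/;:x' -e 's/\*[^*]/**/;t x'
}

printf '%s\n' Authority Cat Stack-Overflow | censor
# prints:
# Autho****
# Cat
# Stack*********
```

Note that a literal * within the first five characters of a word would be picked up by the loop and start the masking early, so this sketch assumes the input words contain no asterisks.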
Upvotes: 3
Reputation: 99144
Here is a pretty straightforward sed solution (that does not require GNU sed):
sed -e :a -e 's/^\(.....\**\)[^*]/\1*/;ta' filename
Upvotes: 0
Reputation: 189689
If you are lucky, all you need is
sed 's/./*/6g' file
When I originally posted this, I believed this to be reasonably portable; but as per @ghoti's comment, it is not.
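Since the 6g form is not portable, one option is to probe the local sed first and fall back to a loop-based command when the probe fails. This is only a sketch: the function name censor_portable is hypothetical, and the fallback is the loop from another answer in this thread.

```shell
# censor_portable: hypothetical wrapper. Uses 's/./*/6g' when the local sed
# handles it as expected, otherwise falls back to a label-and-branch loop.
censor_portable() {
  if [ "$(printf 'abcdefgh\n' | sed 's/./*/6g' 2>/dev/null)" = 'abcde***' ]; then
    sed 's/./*/6g'
  else
    sed -e :a -e 's/^\(.....\**\)[^*]/\1*/;ta'
  fi
}

printf 'Authority\n' | censor_portable
# prints: Autho****
```

Either branch should give the same result for plain words; the probe just avoids relying on behaviour that some sed implementations lack.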
Upvotes: 4
Reputation: 204164
Personally I'd just use sed for this (see @triplee's answer) but if you want to do it in awk it'd be:
$ awk '{t=substr($0,1,5); gsub(/./,"*"); print t substr($0,6)}' file
Autho****
or with GNU awk for gensub():
$ awk '{print substr($0,1,5) gensub(/./,"*","g",substr($0,6))}' file
Autho****
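If the number of visible characters needs to vary, the first awk command can take it as a variable. This is a sketch only; the function name censor_awk and the variable name keep are hypothetical.

```shell
# censor_awk: hypothetical helper. The optional first argument is the number
# of leading characters to keep (defaults to 5).
censor_awk() {
  awk -v keep="${1:-5}" '{
    head = substr($0, 1, keep)        # the part that stays readable
    gsub(/./, "*")                    # turn the whole line into asterisks
    print head substr($0, keep + 1)   # readable head plus masked tail
  }'
}

printf 'Authority\n' | censor_awk     # prints: Autho****
printf 'Authority\n' | censor_awk 3   # prints: Aut******
```

Lines shorter than the kept length are printed unchanged, because the masked tail is then empty.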
Upvotes: 1
Reputation: 12448
It is also possible and quite straightforward with sed:
sed 's/./\*/6;:loop;s/\*[^\*]/\**/;/\*[^\*]/b loop' file_to_censor.txt
Explanation:
s/./\*/6 #replace the 6th character of the line with *
:loop #define a label for the branch
s/\*[^\*]/\**/ #replace a * followed by a non-* character with **
/\*[^\*]/b loop #loop until no * followed by a non-* character remains
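To watch the loop at work, the sketch below prints the line at each pass. It is an illustration rather than the command above: it uses separate -e expressions and the t command instead of the address-plus-b test, and the function name trace_censor is hypothetical.

```shell
# trace_censor: hypothetical helper that prints every intermediate state.
trace_censor() {
  sed -n \
    -e 's/./*/6' \
    -e 't loop' \
    -e ':loop' \
    -e 'p' \
    -e 's/\*[^*]/**/' \
    -e 't loop'
}
# The first "t loop" only clears the substitution flag set by s/./*/6,
# so that a six-character line is not printed twice.

printf 'Authority\n' | trace_censor
# prints:
# Autho*ity
# Autho**ty
# Autho***y
# Autho****
```

Each pass pushes the run of asterisks one character further to the right, which is why the loop ends once no asterisk is followed by a non-asterisk character.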
Upvotes: 0
Reputation: 133650
The following awk solutions may help.
Solution 1: awk with substr and gensub (gensub needs GNU awk).
awk '{print substr($0,1,5) gensub(/./,"*","g",substr($0,6))}' Input_file
Solution 2:
awk 'NF{len=length($0);if(len>5){i=6;while(i<=len){val=val?val "*":"*";i++}};print substr($0,1,5) val;val=i=""}' Input_file
Autho****
EDIT: Adding a multi-line form of the solution, with an explanation.
awk '
NF{ ##Process only non-empty lines.
len=length($0); ##Store the length of the current line in the variable len.
if(len>5){ ##Only lines longer than 5 characters need masking.
i=6; ##Start at the 6th character.
while(i<=len){ ##Loop from position 6 to the end of the current line.
val=val?val "*":"*"; ##Append one * to val on each iteration.
i++ ##Move on to the next position.
}
};
print substr($0,1,5) val; ##Print the first five characters followed by val. For lines of five characters or fewer, val is empty, so the line is printed unchanged.
val=i="" ##Reset val and i before the next line.
}
' Input_file ##Name of the input file.
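The same loop idea can be written more compactly with a for loop. The following is a minimal sketch, not the answer's own code: the function name mask_tail is hypothetical, and unlike the NF-guarded script it also passes empty lines through.

```shell
# mask_tail: hypothetical helper. Keeps the first five characters and
# appends one asterisk for every remaining character.
mask_tail() {
  awk '{
    out = substr($0, 1, 5)                   # first five characters, kept as-is
    for (i = 6; i <= length($0); i++) {      # one pass per remaining character
      out = out "*"
    }
    print out                                # short lines come out unchanged
  }'
}

printf '%s\n' Authority Cat | mask_tail
# prints:
# Autho****
# Cat
```

Because out is assigned afresh for every line, no explicit reset of the variables is needed here.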
Upvotes: 2
Reputation: 241998
Perl to the rescue:
perl -pe 'substr($_, 5) =~ s/./*/g' -- file
-p reads the input line by line and prints each line after processing.
substr($_, 5) refers to the part of the line after the first five characters; binding the substitution to it with =~ changes only that part.
s/./*/g replaces any character with an asterisk. The g means the substitution will happen as many times as possible, not just once, so all the characters will be replaced.
In some versions of sed, you can specify which substitution should happen by appending a number to the operation:
sed -e 's/./*/g6'
This will replace all (again, because of g) characters, starting from the 6th position.
Upvotes: 3