Sumit

Reputation: 2023

Sed pattern is not substituting into desired result

I am trying a HackerRank problem on sed. I wrote my own solution, but it did not work, and I cannot figure out why.

Sample problem: a file contains a credit card number such as "4321 5667 8765 1234". I have to change it to "**** **** **** 1234".

Sed pattern I have written is

sed 's/([0-9]{4}) ([0-9]{4}) ([0-9]{4}) ([0-9]{4})/**** **** **** \4/' sample_data 

It is giving output as

4321 5667 8765 1234

It seems sed is not matching the pattern, which is why it prints the string unchanged.

I know some shorter solutions, like

sed 's/[^ ]* /****/g'

This works.

I then tried

sed 's/[^ ]+ /****/g' # replaced * with +

It does not match anything.

Upvotes: 0

Views: 93

Answers (2)

David C. Rankin

Reputation: 84561

A slightly shorter option using global replacement can be written as:

sed -E 's/[0-9]{4}\s+/**** /g'

Which uses the extended regex to match:

  • [0-9]{4}\s+ digits {four of them} and at least one whitespace; and
  • replace them with "**** "

An equivalent (but longer) basic regex would be:

sed 's/[0-9][0-9][0-9][0-9]\s\s*/**** /g'

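Where each of the digits is listed expressly and \s\s* matches one or more whitespace characters, with the same replacement applied. In BRE, an unescaped {4} or + is a literal character: the interval has to be written \{4\}, and + has no POSIX BRE equivalent (GNU sed accepts \+ as an extension).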
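
As a hedged aside, basic regex can still express "four digits" compactly if the interval braces are backslash-escaped. The sketch below assumes the digit groups are separated by single spaces, and mask_bre is only an illustrative name:

```shell
# Mask every four-digit group that is followed by a space, using only
# POSIX basic regex: \{4\} is the BRE spelling of the {4} interval.
mask_bre() {
  sed 's/[0-9]\{4\} /**** /g'
}

masked=$(printf '%s\n' '4321 5667 8765 1234' | mask_bre)
printf '%s\n' "$masked"
```

The last group is left alone because no space follows it.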

Also, since HackerRank is heavy on tripping you up with corner cases, you may want to trim leading and trailing whitespace before you process the numbers, e.g.

sed -e 's/^\s*//' -e 's/\s*$//' -e 's/[0-9][0-9][0-9][0-9]\s\s*/**** /g'

That way you can also handle lines like:

"  4321 5667 8765 1234  "

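To see why the trimming matters, here is a small sketch comparing the masked output with and without the two trim expressions on the padded line above. It swaps in [[:space:]] for \s so that it does not depend on the GNU extension; the variable names are illustrative.

```shell
padded='  4321 5667 8765 1234  '

# No trimming: the last group is followed by whitespace, so it gets masked too.
untrimmed=$(printf '%s\n' "$padded" |
  sed 's/[0-9][0-9][0-9][0-9][[:space:]][[:space:]]*/**** /g')

# Trim both ends first: the last group then ends the line and is kept.
trimmed=$(printf '%s\n' "$padded" |
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
      -e 's/[0-9][0-9][0-9][0-9][[:space:]][[:space:]]*/**** /g')

# Brackets make leading and trailing spaces visible.
printf '[%s]\n' "$untrimmed" "$trimmed"
```
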
Upvotes: 1

ashish_k

Reputation: 1581

There are multiple issues with your sed command:

  1. The ( ) and { } are not escaped. In sed's default basic regular expressions they must be written \( \) and \{ \}; alternatively, use -E or -r for extended regular expressions, as given in man sed:

    -E, -r, --regexp-extended

    use extended regular expressions in the script.

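  2. The pattern space is not printed explicitly. This matters only with -n, which suppresses sed's automatic printing; the command below uses -n, so it needs the p flag: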

    p

    Print the current pattern space.
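
As a rough check of point 1, the sketch below runs the question's original pattern in two forms: unchanged but with -E, and rewritten as a basic regex with escaped groups and intervals. The variable names are illustrative only.

```shell
line='4321 5667 8765 1234'

# Extended regex: ( ) and { } are operators, so the original pattern works as written.
ere_out=$(printf '%s\n' "$line" |
  sed -E 's/([0-9]{4}) ([0-9]{4}) ([0-9]{4}) ([0-9]{4})/**** **** **** \4/')

# Basic regex: the same pattern with groups and intervals backslash-escaped.
bre_out=$(printf '%s\n' "$line" |
  sed 's/\([0-9]\{4\}\) \([0-9]\{4\}\) \([0-9]\{4\}\) \([0-9]\{4\}\)/**** **** **** \4/')

printf '%s\n' "$ere_out" "$bre_out"
```

Both forms should produce the same masked line.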

Also, there's no need to capture the first 3 number groups.

sed command:

sed -r -n 's/^[0-9]{4}\s+[0-9]{4}\s+[0-9]{4}\s+([0-9]{4})/**** **** **** \1/p' sample_data

\s :

Matches a whitespace character (such as a space or tab). It is a GNU sed extension; [[:space:]] is the POSIX equivalent.
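
To illustrate what -n and the p flag do together, here is a small sketch with a made-up two-line input. It uses [[:space:]] in place of \s, and the variable names are hypothetical.

```shell
input='4321 5667 8765 1234
not a card number'

# -n suppresses automatic printing; the p flag prints only lines where s succeeded.
only_matches=$(printf '%s\n' "$input" |
  sed -E -n 's/^[0-9]{4}[[:space:]]+[0-9]{4}[[:space:]]+[0-9]{4}[[:space:]]+([0-9]{4})/**** **** **** \1/p')

# Without -n and p, sed prints every line, changed or not.
all_lines=$(printf '%s\n' "$input" |
  sed -E 's/^[0-9]{4}[[:space:]]+[0-9]{4}[[:space:]]+[0-9]{4}[[:space:]]+([0-9]{4})/**** **** **** \1/')

printf '%s\n' "$only_matches" '---' "$all_lines"
```

Whether non-matching lines should be dropped or passed through depends on what the exercise expects.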

Upvotes: 0
