NixyCron

Reputation: 61

Remove specific number of digits using sed

John, 1234567
Bob, 2839211
Alex, 2817821
Mary, 9371281

I am currently trying to retrieve the first column with the last 4 digits of the second column using sed, so the output should look like this:

John, 4567
Bob, 9211
Alex, 7821
Mary, 1281

This is my command: 's/\(.*,\)\(.*\)//'. I think it matches the first column up to the comma and then the second column to the end of the line, but I am unsure how to continue.

Upvotes: 4

Views: 1627

Answers (5)

fpmurphy

Reputation: 2537

Similar to KamilCuk's answer, except that it uses a POSIX character class and anchors the digits to be removed to the comma and space that precede them:

sed 's/, [[:digit:]]\{3\}/, /'
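As a quick check, here is a minimal sketch that runs the command above on the sample data from the question. The `input` and `result` variable names are only for illustration, and it assumes every number has exactly 7 digits, as in the question.

```shell
# Sample data copied from the question.
input='John, 1234567
Bob, 2839211
Alex, 2817821
Mary, 9371281'

# Drop the three digits that directly follow ", " on each line.
result=$(printf '%s\n' "$input" | sed 's/, [[:digit:]]\{3\}/, /')
printf '%s\n' "$result"
```

Because the pattern starts at the comma, digits that happen to appear in the first column are left alone.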

Upvotes: 1

RavinderSingh13

Reputation: 133428

If you are OK with awk, you could try the following. Written and tested with the shown samples in GNU awk.

awk 'BEGIN{FS=OFS=", "} {$2=substr($2,length($2)-3)} 1' Input_file

Explanation: a commented version of the command above.

awk '                           ##Starting awk program from here.
BEGIN{                          ##Starting BEGIN section of this program from here.
  FS=OFS=", "                   ##Setting FS and OFS to comma space here.
}
{
  $2=substr($2,length($2)-3)    ##Getting last 4 digits now in 2nd field here.
}
1                               ##printing current edited/non-edited line.
' Input_file                    ##Mentioning Input_file name here.


2nd solution: in case your 2nd column can contain a mix of digits and non-digit characters, the following may help.

awk 'BEGIN{FS=OFS=", "} {gsub(/[^0-9]+/,"",$2);$2=substr($2,length($2)-3)} 1' Input_file

Explanation: a commented version of the command above.

awk '                          ##Starting awk program from here.
BEGIN{                         ##Starting BEGIN section of this program from here.
  FS=OFS=", "                  ##Setting FS and OFS to comma space here.
}
{
  gsub(/[^0-9]+/,"",$2)        ##Globally replacing everything apart from digits with an empty string in 2nd field.
  $2=substr($2,length($2)-3)   ##getting last 4 digits now in 2nd field here.
}
1                              ##printing current edited/non-edited line.
' Input_file                   ##Mentioning Input_file name here.
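To illustrate the 2nd solution, here is a small sketch with made-up input (not from the question) in which the second column mixes digits with dashes and letters. The `input` and `result` names are hypothetical.

```shell
# Hypothetical input: the second column is not purely numeric.
input='John, 12-345-67
Bob, id2839211
Alex, 2817821'

# Strip non-digits from field 2, then keep its last 4 characters.
result=$(printf '%s\n' "$input" |
  awk 'BEGIN{FS=OFS=", "} {gsub(/[^0-9]+/,"",$2);$2=substr($2,length($2)-3)} 1')
printf '%s\n' "$result"
```

This prints John, 4567, then Bob, 9211, then Alex, 7821, since only the digits of the second field survive before the last four are taken.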

Upvotes: 2

KamilCuk

Reputation: 140880

If the file format is just <text only alphanumeric characters>, <number exactly 7 digits>, you can simply remove the first 3 digits on each line:

sed 's/[0-9][0-9][0-9]//'
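The sketch below shows what this command does on one line from the question and on one made-up line whose first column contains three consecutive digits. The second line is a hypothetical edge case, not part of the original data.

```shell
# First line is from the question; the second is a hypothetical edge case.
input='John, 1234567
Agent007, 1234567'

# Removes the first run of three consecutive digits found on each line.
result=$(printf '%s\n' "$input" | sed 's/[0-9][0-9][0-9]//')
printf '%s\n' "$result"
```

The first line becomes John, 4567 as intended, but the second becomes Agent, 1234567, because the first three-digit run sits in the name. The command is therefore only safe when the first column never contains three digits in a row.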

Upvotes: 0

Shawn

Reputation: 52336

Just capture the last four digits of each line and delete any preceding digits:

$ sed 's/[0-9]*\([0-9]\{4\}\)$/\1/' input.txt
John, 4567
Bob, 9211
Alex, 7821
Mary, 1281

If using a version of sed that supports POSIX Extended Regular Expressions, it can be cleaned up a bit to:

sed -E 's/[0-9]*([0-9]{4})$/\1/' input.txt

Upvotes: 2

Wiktor Stribiżew

Reputation: 626689

You can use

sed 's/^\([^,]*\), *[0-9]*\([0-9]\{4\}\).*/\1, \2/' file

See the online demo.

Details

  • ^ - start of string
  • \([^,]*\) - Group 1: any zero or more chars other than a comma
  • , * - a comma and zero or more spaces
  • [0-9]* - zero or more digits
  • \([0-9]\{4\}\) - Group 2: four digits
  • .* - the rest of the line
  • \1, \2 - The replacement is: the Group 1 value, a comma, a space, and the Group 2 value.
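The breakdown above can be tried out with a short sketch. The first input line is from the question; the second is a made-up variant with no space after the comma and with trailing text, to show the effect of `, *` and `.*`.

```shell
# Second line is hypothetical: no space after the comma, plus trailing text.
input='John, 1234567
Bob,2839211 (inactive)'

# Keep Group 1 (the name) and Group 2 (the last four digits of the number).
result=$(printf '%s\n' "$input" |
  sed 's/^\([^,]*\), *[0-9]*\([0-9]\{4\}\).*/\1, \2/')
printf '%s\n' "$result"
```

Both lines come out normalized as John, 4567 and Bob, 9211: the spacing after the comma is rewritten by the replacement, and everything after the four captured digits is dropped.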

Upvotes: 2
