andrew_j

Reputation: 41

Find and replace strings in file using regex in sed. Script doesn't work

I have a file in which I need to find and change strings that match a specific pattern (phone numbers). The regex is:

^\+[0-9]{3} \([0-9]{2}\) [0-9]{7}$

When I use it with grep:

grep "^\+[0-9]{3} \([0-9]{2}\) [0-9]{7}$" -E filename

It works. But when I try to use it in sed to replace all parentheses with spaces and add spaces at positions 13 and 15, it doesn't work, and I have no idea why.

My attempts are:

sed '/^\+[0-9]{3} \([0-9]{2}\) [0-9]{7}$/s/[()]//' filename

(only for replacing the parentheses)

sed -e '/^\+[0-9]{3} \([0-9]{2}\) [0-9]{7}$/s/[()]//' -e '/^+[0-9]{2} ([0-9]{2}) [0-9]{7}/s/./& /11;s/./& /14' filename

file structure:

    +380 44 123 45 67
    +380 (44) 1234567
    +350 (56) 1454557
    +330 (76) 1255557
    +380 44 3534 45 67
    +320 (45) 1237887
    +310 (54) 1939997
    adasd
    asdddddddddddd
    sssdad

expected output:

    +380 44 123 45 67
    +380 44 123 45 67
    +350 56 145 45 57
    +330 76 125 55 57
    +380 44 3534 45 67
    +320 45 123 78 87
    +310 54 193 99 97
    adasd
    asdddddddddddd
    sssdad

Upvotes: 1

Views: 123

Answers (3)

Sundeep

Reputation: 23697

Here's one way to do it:

$ cat ip.txt 
+380 44 123 45 67
+380 (44) 1234567
+350 (56) 1454557
+330 (76) 1255557
+380 44 3534 45 67
+320 (45) 1237887
+310 (54) 1939997
adasd
asdddddddddddd
sssdad

$ sed -E 's/^(\+[0-9]{3}) \(([0-9]{2})\) ([0-9]{3})([0-9]{2})([0-9]{2})$/\1 \2 \3 \4 \5/' ip.txt 
+380 44 123 45 67
+380 44 123 45 67
+350 56 145 45 57
+330 76 125 55 57
+380 44 3534 45 67
+320 45 123 78 87
+310 54 193 99 97
adasd
asdddddddddddd
sssdad
  • () can surround part of a pattern so that the text it matches can be backreferenced in the replacement section
  • \1 corresponds to the first such capture group, \2 to the second, and so on
  • To match ( or ) themselves, escape them as \( and \) (this applies with -E, as used here)
  • So, the digits are captured in the groups the output needs, while the literal ( and ) from the input line stay outside the groups and are dropped from the output
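The escaping rule in the third bullet depends on the -E flag, which also seems to be why the same pattern behaves differently in the question's sed commands. A minimal sketch, assuming a POSIX shell and a sed that supports -E (GNU and BSD sed both do); the variable names line, ere_hits, and bre_hits are made up for illustration:

```shell
line='+380 (44) 1234567'

# With -E (extended regex): \( and \) are literal parentheses, {2} is a repeat count.
ere_hits=$(printf '%s\n' "$line" | sed -n -E '/\([0-9]{2}\)/p')

# Without -E (basic regex): \( and \) delimit a group, {2} is the literal text "{2}".
bre_hits=$(printf '%s\n' "$line" | sed -n '/\([0-9]{2}\)/p')

printf 'with -E:    [%s]\n' "$ere_hits"
printf 'without -E: [%s]\n' "$bre_hits"
```

With -E the line is printed. Without it nothing matches, because the pattern then looks for one digit followed by the literal characters {2}.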

Upvotes: 1

Jan Nielsen

Reputation: 11849

Use:

sed -e 's|[()]||g' so-tel.txt | sed -E 's|([0-9]{3})([0-9]{2})([0-9]{2})|\1 \2 \3|'

to transform so-tel.txt:

+380 44 123 45 67
+380 (44) 1234567
+350 (56) 1454557
+330 (76) 1255557
+380 44 3534 45 67
+320 (45) 1237887
+310 (54) 1939997
adasd
asdddddddddddd
sssdad

into:

+380 44 123 45 67
+380 44 123 45 67
+350 56 145 45 57
+330 76 125 55 57
+380 44 3534 45 67
+320 45 123 78 87
+310 54 193 99 97
adasd
asdddddddddddd
sssdad

Explanation:

's|[()]||g'

Replace every ( and ) with nothing, globally (the g flag), which deletes all parentheses.

's|([0-9]{3})([0-9]{2})([0-9]{2})|\1 \2 \3|'

Match seven successive digits, capturing them in groups of 3, 2, and 2, and replace them with the three captured groups separated by spaces.
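The two steps can also run in a single sed process, limited to the lines that have the parenthesised format. This is only a sketch under some assumptions: a POSIX shell, a sed with -E, and the sample data from the question held in a made-up variable named input instead of so-tel.txt. The address in front of the braces and the trailing $ anchor are additions, not part of the pipeline above.

```shell
# Sample data copied from the question.
input='+380 44 123 45 67
+380 (44) 1234567
+350 (56) 1454557
+330 (76) 1255557
+380 44 3534 45 67
+320 (45) 1237887
+310 (54) 1939997
adasd
asdddddddddddd
sssdad'

# The address restricts both substitutions to "+NNN (NN) NNNNNNN" lines,
# so other lines that contain parentheses or long digit runs stay untouched.
result=$(printf '%s\n' "$input" | sed -E '/^\+[0-9]{3} \([0-9]{2}\) [0-9]{7}$/{
s/[()]//g
s/([0-9]{3})([0-9]{2})([0-9]{2})$/\1 \2 \3/
}')

printf '%s\n' "$result"
```

The unaddressed pipeline gives the same result on this file; the address only matters if the input might contain parentheses or seven-digit runs on lines that are not phone numbers.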

Upvotes: 0

Smithson

Reputation: 141

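Your sed command fails because it lacks -E: without it, sed reads the address as a basic regular expression, where {3} is literal text and \( \) delimit a group. With -E the address matches, and the parentheses can be removed from those lines:

sed -E '/^\+[0-9]{3} \([0-9]{2}\) [0-9]{7}$/s/[()]//g' filename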
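The positional idea from the question (insert a space after the 11th and then the 14th character) should also work once -E is added and all substitutions are grouped under one address. A rough sketch, assuming a POSIX shell and a sed with -E; the variable names sample and fixed are hypothetical, and the positions 11 and 14 are taken from the question's second attempt:

```shell
# Three lines from the question's file: two to rewrite, one to leave alone.
sample='+380 (44) 1234567
+350 (56) 1454557
adasd'

# On matching lines only: drop the parentheses, then append a space to the
# 11th character and, counting again on the new text, to the 14th character.
fixed=$(printf '%s\n' "$sample" | sed -E '/^\+[0-9]{3} \([0-9]{2}\) [0-9]{7}$/{
s/[()]//g
s/./& /11
s/./& /14
}')

printf '%s\n' "$fixed"
```

The second count is 14 rather than 13 because the first insertion has already shifted the remaining digits one place to the right.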

Upvotes: 0
