Reputation: 83
#!/bin/bash
echo "the first application of sed"
sed -e 's/^\([0-9]\{3\}\)/(\1)/' s.txt
echo "the second application of sed"
sed -e 's/^\([0-9]\{3\}\)/(\1\+\1)/' s.txt
echo "see the original file"
cat s.txt
the first application of sed
(905)-123-3456
(905)-124-3456
(905)-125-3456
(905)-126-3456
(905)-127-3456
the second application of sed
(905+905)-123-3456
(905+905)-124-3456
(905+905)-125-3456
(905+905)-126-3456
(905+905)-127-3456
see the original file
905-123-3456
905-124-3456
905-125-3456
905-126-3456
905-127-3456
I'm just starting out in shell programming and I've been stuck on this code for the last 2 hours. I know the basic usage of sed, but I cannot figure out what the line
sed -e 's/^\([0-9]\{3\}\)/(\1)/' s.txt
does. I know -e is expression, s is substitute. ^ indicates beginning of line but the part after that is confusing. Any ideas?
Upvotes: 0
Views: 45
Reputation: 37069
Let's break this down:
sed -e 's/^\([0-9]\{3\}\)/(\1)/' s.txt
The general form of sed's substitute command is:
s/search/replace/options
In your case, the search part is ^\([0-9]\{3\}\). In the basic regular expressions that sed uses by default, parentheses and curly brackets only get their special meaning when they are preceded by a \. If we remove the backslashes for understanding purposes, this is how it will look:
^([0-9]{3})
It means the line should start with a digit between 0 and 9, repeated 3 times. So basically, it's three consecutive digits at the start of the line (e.g. 123, 543 etc.).
The parentheses () group those three digits, which can then be referred to as the first group.
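The replace part is (\1). Here \1 stands for the first captured group and the surrounding parentheses are literal characters, so the three digits we captured in the search are written back, wrapped in parentheses.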
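To see that the backslash-free form really is the same pattern, here is a small sketch that runs both on one line. The sample line is made up, and the second command assumes your sed accepts -E for extended regular expressions (GNU and BSD sed do; check your manual otherwise).

```shell
#!/bin/bash
# One sample line in the same format as s.txt (made up for this sketch).
line='905-123-3456'

# Basic regular expression: \( \) group, \{3\} repeats [0-9] three times.
bre=$(printf '%s\n' "$line" | sed -e 's/^\([0-9]\{3\}\)/(\1)/')

# Same pattern without the backslashes; assumes sed supports -E.
ere=$(printf '%s\n' "$line" | sed -E -e 's/^([0-9]{3})/(\1)/')

echo "$bre"   # (905)-123-3456
echo "$ere"   # (905)-123-3456
```

In both cases the parentheses in the replacement are plain characters; only \1 is special there.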
Upvotes: 2
Reputation: 753990
Ultimately, it is a manual-bashing exercise.
\( marks the start of a capture, up to the balanced \); they can be nested, though these ones aren't.
\{ marks the start of a repeat specification, up to the following \}; they cannot be nested. In this case, you have \{3\}, so this repeats the previous item, [0-9], three times.
\1 in the replacement refers to the material captured by the first \( in the search pattern.
Hence:
s/^\([0-9]\{3\}\)/(\1)/
wraps the three digits at the start of the line in parentheses — as shown in your output. Because it is anchored, it happens just once. If a line doesn't start with three digits, nothing happens to that line as a result of this command.
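A minimal sketch of what the anchor does, using a few made-up lines instead of s.txt. Only the line that starts with three digits is changed; dropping the ^ lets the first run of three digits anywhere in the line match.

```shell
#!/bin/bash
# Three made-up lines: only the first one starts with three digits.
anchored=$(printf '%s\n' '905-123-3456' 'tel 905-123-3456' '90-5123-456' |
  sed -e 's/^\([0-9]\{3\}\)/(\1)/')
echo "$anchored"
# (905)-123-3456
# tel 905-123-3456
# 90-5123-456

# Without the ^ anchor, the first three-digit run in the line is wrapped.
unanchored=$(printf '%s\n' 'tel 905-123-3456' |
  sed -e 's/\([0-9]\{3\}\)/(\1)/')
echo "$unanchored"
# tel (905)-123-3456
```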
The second example is only marginally different. It takes the sequence of three digits at the start of the line and replaces it with that sequence, a + mark, and the sequence again, all wrapped in parentheses, as shown in your output.
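A quick way to check this on a single line. Note that this sketch writes a plain + in the replacement rather than \+, since + has no special meaning there; the sample line is made up.

```shell
#!/bin/bash
# One sample line in the same format as s.txt (made up for this sketch).
line='905-123-3456'

# \1 is used twice, so the captured digits appear on both sides of the +.
doubled=$(printf '%s\n' "$line" | sed -e 's/^\([0-9]\{3\}\)/(\1+\1)/')

echo "$doubled"   # (905+905)-123-3456
```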
There are relatively few metacharacters in the replacement part of an s/// command; there are a lot of metacharacters in the search part. Further, there are different dialects in the search part: some variants of sed support 'extended regular expressions' instead of 'basic regular expressions' (which is what your example uses); others support Perl-like expressions (not quite the full PCRE, Perl Compatible Regular Expressions, as far as I know, but some notations from PCRE). For that, you need to read the manual for the sed you're using.
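As a rough illustration of why the dialect matters, the same search text can mean different things. In this sketch (made-up input, and assuming a sed that accepts -E), unescaped parentheses are literal characters in a basic regular expression but a grouping operator in an extended one.

```shell
#!/bin/bash
# Basic regular expressions: ( and ) are ordinary characters,
# so the pattern matches the literal text "(905)".
bre_literal=$(printf '%s\n' '(905)' | sed -e 's/(905)/X/')

# Extended regular expressions (-E): ( and ) form a group,
# so the pattern matches only "905" and the input's parentheses remain.
ere_group=$(printf '%s\n' '(905)' | sed -E -e 's/(905)/X/')

echo "$bre_literal"   # X
echo "$ere_group"     # (X)
```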
Upvotes: 3