Reputation: 63
So, I am trying to look for certain words in the 5th field of /etc/passwd. For example:
jonesc:x:1053:1001:Cathy Jones:/export/home/jonesc:/bin/ksh
smiths:x:1049:1000:Sue Williams:/export/home/smiths:/bin/csh
smitha:x:1050:1001:Amy Smith:/export/home/smitha:/bin/bash
Let's say I am looking for the word 'Smith'. How would I look for it ONLY in the 5th field, which contains the names, as opposed to searching the entire line?
I can easily do this with awk, but I am asked to do this with sed instead.
What I'm asked to do is to output matches from /etc/passwd that contain Smith or Jones in the 5th field to a file called smith_jones.txt.
I have no problem writing output to a file with sed; I am just stuck on how to search only the 5th field. Awk would use $5, but I cannot find anything similar in sed.
Not looking for a complete answer being handed to me, but rather a push in the right direction.
Upvotes: 1
Views: 2360
Reputation: 2868
Give this a try:
sed -n ":1
/^[^:]*:[^:]*:[^:]*:[^:]*:[^:]*Smith[^:]*:.*$/ {p
n
b1}
/^[^:]*:[^:]*:[^:]*:[^:]*:[^:]*Jones[^:]*:.*$/{p}"
-n instructs sed not to print anything by default.
:1 defines a label named 1.
/^[^:]*:[^:]*:[^:]*:[^:]*:[^:]*Smith[^:]*:.*$/ is a regex that matches any line containing Smith in the 5th field, where fields are separated by :.
p is a command that prints the current line.
n is a command that loads the next line into the buffer.
b1 jumps to label 1.
sed reads the file one line at a time. The current line is stored in the buffer. If Smith is found in the 5th field, the line is printed, the next line is loaded into the buffer, and execution jumps back to label 1. Otherwise, if Jones is found in the 5th field, the line in the buffer is printed.
The test:
$ sed -n ":1
/^[^:]*:[^:]*:[^:]*:[^:]*:[^:]*Smith[^:]*:.*$/ {p
n
b1}
/^[^:]*:[^:]*:[^:]*:[^:]*:[^:]*Jones[^:]*:.*$/{p}" /etc/passwd >> smith_jones.txt
$ cat smith_jones.txt
jonesc:x:1053:1001:Cathy Jones:/export/home/jonesc:/bin/ksh
smitha:x:1050:1001:Amy Smith:/export/home/smitha:/bin/bash
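For comparison, here is a minimal sketch of the same idea without the label, using d to end the cycle once a Smith line has been printed (this plays the role of n followed by b1 and keeps a line that contains both names from being printed twice). It assumes a throwaway copy of the three sample lines from the question rather than the real /etc/passwd; the file name passwd.sample and the temporary directory are made up for the illustration.

```shell
# Build a throwaway sample from the question's three lines (hypothetical file name).
tmp=$(mktemp -d)
cat > "$tmp/passwd.sample" <<'EOF'
jonesc:x:1053:1001:Cathy Jones:/export/home/jonesc:/bin/ksh
smiths:x:1049:1000:Sue Williams:/export/home/smiths:/bin/csh
smitha:x:1050:1001:Amy Smith:/export/home/smitha:/bin/bash
EOF

# Skip four colon-terminated fields from the start of the line, then look for
# the name anywhere inside the fifth field.
# First expression: print a Smith line, then d starts the next cycle.
# Second expression: only reached for non-Smith lines; print if Jones is found.
sed -n \
    -e '/^[^:]*:[^:]*:[^:]*:[^:]*:[^:]*Smith/{p;d;}' \
    -e '/^[^:]*:[^:]*:[^:]*:[^:]*:[^:]*Jones/p' \
    "$tmp/passwd.sample" > "$tmp/smith_jones.txt"

cat "$tmp/smith_jones.txt"
```

On the sample this should leave the jonesc and smitha lines in smith_jones.txt, in input order.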
Upvotes: 0
Reputation: 47119
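Awk would be the right tool for the job (note -F: to split on colons, since awk splits on whitespace by default):
awk -F: '$5 ~ /Smith|Jones/' /etc/passwd > output.txt
But since you are asking for a sed solution, you can use something like this:
sed -n '/^[^:]*:[^:]*:[^:]*:[^:]*:\(Smith\|Jones\)/p' /etc/passwd
Each [^:]* matches any character other than :, zero or more times. The leading ^ anchors the match to the start of the line, so the four skipped fields really are the first four.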
You can also repeat a previous pattern with an interval expression, \{x,y\} (or \{n\} for exactly n repetitions), again anchored with ^ at the start of the line:
sed -n '/^\([^:]*:\)\{4\}\(Smith\|Jones\)/p' /etc/passwd
As you can see, this simplifies the regex even more.
-n suppresses the default printing, and /pattern/p prints every line that matches pattern.
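The patterns above only match when the name is at the very start of the 5th field. You might want to add another [^:]* before the \(...\|...\) group if you want to match in the middle of the name field, e.g.:
sed -n '/^\([^:]*:\)\{4\}[^:]*\(th\|es\)/p' /etc/passwd
This will match Smith and Jones.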
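One caveat worth checking for yourself: without a leading ^, the four skipped fields can start anywhere in the line, so the search text may be found in a later field. The sketch below is only an illustration. It keeps the three sample lines from the question in a shell variable (sample is a made-up name) and searches for the string home, which occurs only in the 6th field.

```shell
sample='jonesc:x:1053:1001:Cathy Jones:/export/home/jonesc:/bin/ksh
smiths:x:1049:1000:Sue Williams:/export/home/smiths:/bin/csh
smitha:x:1050:1001:Amy Smith:/export/home/smitha:/bin/bash'

# Unanchored: the four skipped fields may begin at the 2nd field, so "home"
# in the 6th field satisfies the pattern.
unanchored=$(printf '%s\n' "$sample" \
    | sed -n '/\([^:]*:\)\{4\}[^:]*home/p' | wc -l | tr -d ' ')

# Anchored: the four skipped fields must be the first four, so only the
# 5th field is searched.
anchored=$(printf '%s\n' "$sample" \
    | sed -n '/^\([^:]*:\)\{4\}[^:]*home/p' | wc -l | tr -d ' ')

echo "unanchored matches: $unanchored"
echo "anchored matches: $anchored"
```

On this sample the unanchored pattern matches all three lines and the anchored one matches none, which is why anchoring with ^ matters when you want the 5th field only.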
As pointed out in the comments, you can also use Extended Regular Expressions to avoid all those backslashes:
sed -E -n '/^([^:]*:){4}(Smith|Jones)/p' /etc/passwd
Traditionally, GNU sed used -r to enable ERE and BSD sed uses -E. GNU sed, however, also supports the -E flag, even though older releases did not document it.
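Putting the pieces together for the original task, here is a hedged sketch that uses ERE and sed's own w command to write the matches to smith_jones.txt. It runs against a throwaway copy of the sample lines from the question (passwd.sample in a temporary directory, both made up for the illustration) and assumes a sed that accepts -E.

```shell
# Throwaway sample built from the question's three lines (hypothetical file name).
tmp=$(mktemp -d)
cat > "$tmp/passwd.sample" <<'EOF'
jonesc:x:1053:1001:Cathy Jones:/export/home/jonesc:/bin/ksh
smiths:x:1049:1000:Sue Williams:/export/home/smiths:/bin/csh
smitha:x:1050:1001:Amy Smith:/export/home/smitha:/bin/bash
EOF

# ^([^:]*:){4}  skips exactly the first four fields
# [^:]*         allows the name to sit anywhere inside the 5th field
# w FILE        writes each matching line to FILE; the file name runs to the
#               end of the script line, so it must come last
(cd "$tmp" && sed -E -n '/^([^:]*:){4}[^:]*(Smith|Jones)/w smith_jones.txt' passwd.sample)

cat "$tmp/smith_jones.txt"
```

Replacing the w command with p and redirecting with > smith_jones.txt would give the same file; w just keeps everything inside the sed script.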
Upvotes: 5
Reputation: 22438
This should work:
sed -n '/^\([^:]*:\)\{4\}[^:]*\(Jones\|Smith\)/p' /etc/passwd
^\([^:]*:\)\{4\} matches the first four fields delimited with :, and thus the fifth field is matched against the names (Jones and Smith).
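If you want to convince yourself that this pattern selects the same lines as awk's $5 test, a quick comparison along these lines may help. It is only a sketch on the three sample lines from the question (the variable names are made up), and it assumes GNU sed, since \| inside a basic regular expression is a GNU extension.

```shell
sample='jonesc:x:1053:1001:Cathy Jones:/export/home/jonesc:/bin/ksh
smiths:x:1049:1000:Sue Williams:/export/home/smiths:/bin/csh
smitha:x:1050:1001:Amy Smith:/export/home/smitha:/bin/bash'

# The sed command from this answer, reading the sample from stdin.
from_sed=$(printf '%s\n' "$sample" \
    | sed -n '/^\([^:]*:\)\{4\}[^:]*\(Jones\|Smith\)/p')

# The awk equivalent: split on ":" and test the 5th field directly.
from_awk=$(printf '%s\n' "$sample" | awk -F: '$5 ~ /Jones|Smith/')

if [ "$from_sed" = "$from_awk" ]; then
    echo "sed and awk agree"
else
    echo "sed and awk differ"
fi
printf '%s\n' "$from_sed"
```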
Upvotes: 0