Reputation: 127
I want to extract the word that ends with a given pattern from a string in Unix. How can I achieve this?
For example, say the string is "sv_z = sample.scr" and I have to search it for ".scr". If it is found, I have to extract the whole word containing it. In this example the output should be sample.scr. The delimiters that bound the word can be a blank space, a double quote, or an equals sign.
Here's a few more examples:
sv_z=sample.scr
sv_z=urhk_dbCall("sample.scr")
sv_z="sample.scr"
Here's my expected output:
sample.scr
sample.scr
sample.scr
Upvotes: 1
Views: 3814
Reputation: 54392
Here's one way using grep:
grep -o '[^ "=]*\.scr' file
Explanation:
The -o flag prints only the part of each line that matches the pattern. [ ... ] is a character class. If a caret (^) is the first character in the class, the class is negated; it effectively means "none of the following characters". The * says match the preceding item, here the character class, zero or more times.
EDIT:
Alternatively, if you require more strictness you'll need Perl-compatible regex (the -P flag) and a positive lookahead. In the example below, the lookahead ensures that the match is followed by a double quote, a space, or the end of the line. Also, you could change the star (*) into a plus sign (+), which means match one or more times. That would filter out things like a bare .scr. But it's not clear from your example input exactly what you're looking for here. Good luck.
grep -oP '[^ "=]*\.scr(?=("| |$))' file
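To make the star-versus-plus point concrete, here is a minimal sketch that feeds a few of the question's sample strings, plus a bare .scr, through the plus-sign variant. The function name extract_scr is hypothetical, and it assumes a grep that supports -o (GNU and BSD grep both do); -E is used only so that + is an ordinary quantifier.

```shell
# Hypothetical helper: print every word ending in ".scr" read from stdin.
# The delimiters (space, double quote, equals sign) come from the question.
extract_scr() {
    # [^ "=]+ requires at least one non-delimiter character before ".scr",
    # so a bare ".scr" on its own produces no output.
    grep -oE '[^ "=]+\.scr'
}

printf '%s\n' 'sv_z = sample.scr' 'sv_z=urhk_dbCall("sample.scr")' 'sv_z="sample.scr"' '.scr' | extract_scr
```

With the star instead of the plus, the fourth input line would also produce a match consisting of just .scr.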
Upvotes: 2
Reputation: 1210
Another solution:
awk -F= 'NR==1{print $2}{FS="\""}NR>1{print $2}' file
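Note that the one-liner above picks the field separator by line number, so it relies on the input lines appearing in exactly this order. As a rough sketch of a variant that does not depend on line order, one could split every line on the question's delimiters and print any field that ends in .scr. The function name extract_by_fields is hypothetical, and treating a run of delimiters as a single separator is an assumption, not something the question states.

```shell
# Hypothetical helper: split each input line on runs of space, double quote,
# or equals sign, then print every field that ends in ".scr".
extract_by_fields() {
    awk -F'[ "=]+' '{ for (i = 1; i <= NF; i++) if ($i ~ /\.scr$/) print $i }'
}

printf '%s\n' 'sv_z=sample.scr' 'sv_z=urhk_dbCall("sample.scr")' 'sv_z="sample.scr"' | extract_by_fields
```

Because the test is anchored with $, a field such as sample.scrambled is not printed.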
Upvotes: 0
Reputation: 203502
In this awk script I'm using a variable "d" to contain the list of allowed delimiters to save repeating them multiple times in the script:
$ cat file
sv_z=sample.scr
sv_z=urhk_dbCall("sample.scr")
sv_z="sample.scr"
sv_z="unscrambled"
sv_z="sample.scrambled"
$ awk -v d=' "=' 'match($0,"["d"][^"d"]+\\.scr(["d"]|$)") { $0=substr($0,RSTART,RLENGTH); gsub("["d"]",""); print NR, $0 }' file
1 sample.scr
2 sample.scr
3 sample.scr
Compare with a grep -o solution like the one posted, here with the dot before scr left unescaped so that it matches any character:
$ grep -n -o '[^ "=]*.scr' file
1:sample.scr
2:sample.scr
3:sample.scr
4:unscr
5:sample.scr
Notice those last 2 lines that you probably don't want in the grep output.
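For reuse, the same idea can be wrapped in a shell function. This is only a sketch under a few assumptions: the name extract_scr_strict is hypothetical, the regex is built once in a BEGIN block, the dot is written as \\. inside the awk string so it stays a literal dot in the dynamic regex, and (^|...) is added so that a word at the very start of a line is also accepted, which goes slightly beyond the script above.

```shell
# Hypothetical wrapper: print each delimiter-bounded word ending in ".scr".
extract_scr_strict() {
    awk -v d=' "=' '
        # Delimiter or start of line, the word, a literal ".scr",
        # then a delimiter or end of line.
        BEGIN { re = "(^|[" d "])[^" d "]+\\.scr([" d "]|$)" }
        match($0, re) {
            word = substr($0, RSTART, RLENGTH)
            gsub("[" d "]", "", word)   # strip the surrounding delimiters
            print word
        }
    '
}

printf '%s\n' 'sv_z="sample.scr"' 'sv_z="unscrambled"' 'sv_z="sample.scrambled"' | extract_scr_strict
```

Only the first of the three input lines yields output here, since the other two have no .scr followed by a delimiter or the end of the line.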
Upvotes: 0