Reputation: 523
My awk script searches the log file for two search patterns and then prints the strings where it found them into another file.
awk -F# -v pat1="$search_pattern1" -v pat2="$search_pattern2" '{ for (i = 1; i <= NF; i++) {if (match($i, "^1\\.[0-9]+\\/\\A "pat1)) {sub(/^1\./, "", $i); sub(/\/.*/, "", $i); if (first == "") first = $i; if ($i in b) {first = $i; exit} a[$i]} else if (match($i, "^1\\.[0-9]+\\/\\A "pat2)) {sub(/^1\./, "", $i); sub(/\/.*/, "", $i); if ($i in a) {first = $i; exit} b[$i] }}} END {if (first == "") print "1"; else print first}' search_file.log
I have a question about this part:
{if (match($i, "^1\\.[0-9]+\\/\\A "pat1))
Presently it finds pat1 only when it comes right after "/A", for example in a string like
06I_nsp5holoHIE_pp2.pdb #1.1/A pat1 NE2
How could I modify the regex so that it finds pat1 after either /A or /?, so that it is also identified in a string like:
06I_nsp5holoHIE_pp2.pdb #1.1/? pat1 NE2
Upvotes: 0
Views: 46
Reputation: 2805
mawk 'sub("$","\f" index($_,__)) < NF' FS='^[^#]*.1[.][0-9]+[/][?A][ ]+|[ ]+' __='pat1'
06I_nsp5holoHIE_pp2.pdb #1.1/? pat1 NE2
32
To cross-validate that 32:
gcut -c 32- <<< '06I_nsp5holoHIE_pp2.pdb #1.1/? pat1 NE2 '
pat1 NE2
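For readers who find the one-liner above dense, here is a plainer sketch of what it appears to report: the 1-based column at which pat1 starts, printed only when the line contains #1.N/A or #1.N/? followed by pat1. The variable names line, pat, and pos are my own, and index() finds the first literal occurrence of pat anywhere in the line, which is an assumption that holds for this sample.

```shell
# Sample line taken from the question.
line='06I_nsp5holoHIE_pp2.pdb #1.1/? pat1 NE2'

# Print the column of pat only if it follows "#1.<digits>/A" or "#1.<digits>/?".
pos=$(printf '%s\n' "$line" |
    awk -v pat="pat1" '$0 ~ ("#1[.][0-9]+/[A?] +" pat) { print index($0, pat) }')

echo "$pos"
```

For the sample line this prints 32, which agrees with the cut check above.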
Upvotes: 0
Reputation: 36370
How could I modify the regex to be able to find pat 1 either near /A or near /?
I would use [ and ] with the acceptable characters enumerated inside. Consider a simplified example: let the content of file.txt be
1.1/A pat1 NE2
1.1/? pat1 NE2
1.1/Z pat1 NE2
then
awk 'match($0,"1\\.[0-9]+\\/[A?]"){print NR, RSTART, RLENGTH}' file.txt
gives output
1 1 5
2 1 5
Explanation: I print the row number, the start position of the match, and the length of the match whenever a match is found. Observe that ? means a literal ? inside [ and ]. You might elect to use | (alternation) rather than [ and ], but in that case you must escape the ?.
(tested in gawk 4.2.1)
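Applied to the condition from the question, the same idea might look like the sketch below. It assumes the -F# field splitting from the original script and uses the literal text pat1 as the pattern value; the input and matched variables and the /Z line are my own additions for illustration. I have not run this under the gawk version mentioned above.

```shell
# Sample lines from the question plus one chain (/Z) that should not match.
input='06I_nsp5holoHIE_pp2.pdb #1.1/A pat1 NE2
06I_nsp5holoHIE_pp2.pdb #1.1/? pat1 NE2
06I_nsp5holoHIE_pp2.pdb #1.1/Z pat1 NE2'

matched=$(printf '%s\n' "$input" | awk -F'#' -v pat1="pat1" '
{
    for (i = 1; i <= NF; i++)
        # [A?] accepts either a literal A or a literal ? after the slash
        if (match($i, "^1\\.[0-9]+/[A?] " pat1))
            print NR ": " $i
}')

echo "$matched"
```

Only the /A and /? lines are printed. The slash needs no backslash here because the regex is given as a string rather than between slashes. Note that the value of pat1 is still interpreted as a regular expression, so any metacharacters in it would need escaping.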
Upvotes: 2