Andrés Parada

Reputation: 319

How to search for a pattern that includes whitespace in awk?

I want to search for a pattern that is fairly complex. I have already learned that I have to write \| instead of | in the script, but how do I include the whitespace so that it matches this exact pattern?

    TR40663|c0_g1_i2|m.33339 TR40663|c0_g1_i2|g.33339 ORF TR40663|c0_g1_i2|g.33339 TR40663|c0_g1_i2|m.33339 type:5prime_partial len:1730 (+) TR40663\|c0_g1_i2:3-5192(+) [specie]

I have to use this code, which retrieves the sequence related to the pattern:

    awk 'BEGIN{RS=">";FS="\n"}NR>1{if ($1~/pattern) print ">"$0}' file

I don't know if the ~/ is also messing with the code. Later on I will pass a list of elements inside multiple files but for now I want to check this pattern/search first.

Thanks for the help

Upvotes: 0

Views: 2083

Answers (2)

glenn jackman

Reputation: 246764

If pattern is an awk variable that holds a string representing the regex, then you have to write

    if ($1 ~ pattern)

with no slashes.

If pattern is just a placeholder in your question for the actual regex, then you're missing the ending slash:

    if ($1 ~ /pattern\|goes\|here/)

Notes:

  • ~/ is not an awk operator
  • the regex matching operator is ~
  • literal regex patterns are enclosed with slashes: /foo.*bar/
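
To make the notes above concrete, here is a minimal sketch applied to the question's one-liner. The two-record sample and its shortened headers are made up for illustration; they are not the asker's real data. Inside /.../ a space matches itself, so only the pipe and the dot need a backslash:

```shell
# Hypothetical FASTA-like sample with shortened headers (illustration only).
fasta='>TR1|c0_g1_i2|m.1 TR1|c0_g1_i2|g.1 ORF
ATGC
>TR2|c0_g1_i1|m.2 TR2|c0_g1_i1|g.2 ORF
GGCC'

# Records are split on ">", and $1 is the header line of each record.
# Spaces in the regexp literal are matched as-is; | and . are escaped.
result=$(printf '%s\n' "$fasta" |
  awk 'BEGIN{RS=">"; FS="\n"} NR>1 { if ($1 ~ /TR1\|c0_g1_i2\|m\.1 TR1\|c0_g1_i2\|g\.1 ORF/) print ">" $0 }')

printf '%s\n' "$result"
```

Only the first record should be printed, header and sequence together, because the second header does not contain the regexp.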

Upvotes: 0

Ed Morton

Reputation: 203229

~/ is not an operator. ~ is the regexp comparison operator and /.../ are the static regexp delimiters. Get rid of the / from ~/, as I'm sure the syntax error is already telling you to do.

The syntax for using dynamic regexps is:

    awk -v re='foo \\| bar' '$0 ~ re' file

or:

    awk -v re='foo [|] bar' '$0 ~ re' file
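
As a quick check of the two forms above, here is a small sketch run against a made-up three-line sample (the sample text and the variable names are assumptions for the demo). The backslash is doubled in the first form because awk processes escape sequences in -v assignments, so '\\|' reaches the regexp engine as '\|':

```shell
# Hypothetical input: only the first line has exactly "foo | bar".
sample='foo | bar
foo bar
foo  |  bar'

# Form 1: escaped pipe; the doubled backslash survives -v escape processing.
with_escape=$(printf '%s\n' "$sample" | awk -v re='foo \\| bar' '$0 ~ re')

# Form 2: bracket expression; no backslashes to worry about.
with_bracket=$(printf '%s\n' "$sample" | awk -v re='foo [|] bar' '$0 ~ re')

printf '%s\n' "$with_escape" "$with_bracket"
```

Both commands should select the same single line. The bracket form is often easier to read because it does not depend on how many layers of escaping the string passes through.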

Never use the word "pattern", by the way, as it's ambiguous and misleading. In awk you should always use the words "regexp" or "string", while the shell uses globbing patterns, which are similar to regexps in functionality and syntax but very different in semantics.
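
To illustrate the regexp vs. string distinction: if what you really want is a header that equals a given piece of text, a string comparison avoids escaping altogether. This is only a sketch under assumptions: the sample records and the variable name hdr are invented for the demo, and the header is handed to awk through the environment (ENVIRON) so that no escape processing touches it.

```shell
# Hypothetical sample; the headers are full of regexp metacharacters: | . ( + ) [ ]
fasta='>TR1|c0_g1_i2|m.1 len:10 (+) TR1|c0_g1_i2:3-32(+) [specie]
ATGC
>TR1|c0_g1_i2|m.10 len:20 (-) TR1|c0_g1_i2:5-64(-) [specie]
GGCC'

# The exact header text to look for, copied verbatim with no escaping.
hdr='TR1|c0_g1_i2|m.1 len:10 (+) TR1|c0_g1_i2:3-32(+) [specie]'

# String equality (==) instead of a regexp match (~).
hit=$(printf '%s\n' "$fasta" |
  hdr="$hdr" awk 'BEGIN{RS=">"; FS="\n"} NR>1 && $1 == ENVIRON["hdr"] { print ">" $0 }')

printf '%s\n' "$hit"
```

If a substring test is wanted rather than full equality, index($1, ENVIRON["hdr"]) can replace the == comparison in the same spot.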

Upvotes: 1
