aarelovich

Reputation: 5576

How to match this string in bash?

I'm reading a file in bash, line by line. I need to print lines that have the following format:

don't care <<< at least one character >>> don't care.

These are all the ways I have tried, and none of them work:

   if [[ $line =~ .*<<<.+>>>.* ]]; then
      echo "$line"
   fi

This has incorrect syntax.

These two have correct syntax but don't work:

   if [[ $line =~ '.*<<<.+>>>.*' ]]; then
      echo "$line"
   fi

And this:

   if [[ $line == '*<<<*>>>*' ]]; then
      echo "$line"
   fi

So how do I tell bash to only print lines with that format? PS: I have tested, and printing all lines works just fine.

Upvotes: 2

Views: 62

Answers (4)

Dominique

Reputation: 17565

I don't even understand why you are reading the file line by line. I have just run the following command at the bash prompt and it works fine:

grep -E "<<<<.+>>>>" test.txt

where test.txt contains the following data:

<<<<>>>>
<<<<a>>>>
<<<<aa>>>>

The result of the command was:

<<<<a>>>>
<<<<aa>>>>

Upvotes: 0

glenn jackman

Reputation: 247162

You don't need a regular expression. Filename (glob) patterns will work just fine:

if [[ $line == *"<<<"?*">>>"* ]]; then ...
  • * - match zero or more characters
  • ? - match exactly one character
  • "<<<" and ">>>" - literal strings: The angle brackets need to be quoted so bash does not interpret them as a here-string redirection.
$ line=foobar
$ [[ $line == *"<<<"?*">>>"* ]] && echo y || echo n
n
$ line='foo<<<>>>bar'
$ [[ $line == *"<<<"?*">>>"* ]] && echo y || echo n
n
$ line='foo<<<x>>>bar'
$ [[ $line == *"<<<"?*">>>"* ]] && echo y || echo n
y
$ line='foo<<<xyz>>>bar'
$ [[ $line == *"<<<"?*">>>"* ]] && echo y || echo n
y
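
Putting the pattern into the line-by-line loop from the question could look like the sketch below. The function name print_marked_lines and the temporary sample file are made up for illustration, and the read loop is one common idiom rather than the only way to do it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is an assumption): print only the lines of a
# file that contain "<<<", at least one character, then ">>>".
print_marked_lines() {
    local file=$1 line
    # IFS= and -r keep leading whitespace and backslashes intact; the
    # extra test also handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == *"<<<"?*">>>"* ]]; then
            printf '%s\n' "$line"
        fi
    done < "$file"
}

# Usage with a throwaway sample file (same strings as the session above)
tmp=$(mktemp)
printf '%s\n' 'foobar' 'foo<<<>>>bar' 'foo<<<x>>>bar' 'foo<<<xyz>>>bar' > "$tmp"
result=$(print_marked_lines "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

Only the last two sample lines should be printed, matching the y/n results shown above.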

Upvotes: 2

choroba

Reputation: 242208

<, <<, <<<, >, and >> are special in the shell and need quoting:

[[ $line =~ '<<<'.+'>>>' ]]

. and + shouldn't be quoted, though, to keep their special meaning.
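
A quick way to see the difference, assuming bash 3.2 or later with default settings (where quoted parts of a =~ pattern are matched literally); the variable names are only for illustration:

```shell
#!/usr/bin/env bash
line='foo<<<x>>>bar'

# Whole pattern quoted: every character is literal, so this looks for the
# text ".*<<<.+>>>.*" inside $line and does not find it.
if [[ $line =~ '.*<<<.+>>>.*' ]]; then quoted=match; else quoted=no-match; fi

# Only the angle brackets quoted: . and + keep their regex meaning.
if [[ $line =~ '<<<'.+'>>>' ]]; then partial=match; else partial=no-match; fi

echo "fully quoted:    $quoted"
echo "brackets quoted: $partial"
```

This is why the second attempt in the question is syntactically valid but never matches.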

You don't need the leading and trailing .* in =~ matching, but you need them (or their equivalents) in patterns:

[[ $line == *'<<<'?*'>>>'* ]]

It's faster to use grep to extract lines:

grep -E '<<<.+>>>' input-file
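
To try the grep variant without creating a file, the sample lines can be piped in. This is only a sketch with made-up test strings; the variable name matches is an assumption:

```shell
#!/usr/bin/env bash
# Feed a few sample lines to grep; -E enables extended regular expressions,
# so + means "one or more" without a backslash.
matches=$(printf '%s\n' 'foobar' 'foo<<<>>>bar' 'foo<<<x>>>bar' 'foo<<<xyz>>>bar' |
    grep -E '<<<.+>>>')
printf '%s\n' "$matches"
```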

Upvotes: 1

Tom Fenech

Reputation: 74695

For maximum compatibility, it's always a good idea to define your regex pattern as a separate variable in single quotes, then use it unquoted. This works for me:

re='<<<.+>>>'
if [[ $line =~ $re ]]; then
    echo "$line"
fi

I got rid of the redundant leading/trailing .*, by the way.

Of course, I'm assuming that you have a valid reason to process the file in native bash (if not, just use grep -E '<<<.+>>>' file).
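
One possible reason to stay in bash would be needing the text between the markers, not just the matching line. That goes a little beyond what the question asks, so treat this as a sketch: the capture group in the regex and the helper name extract_marked are my additions, and BASH_REMATCH is the array bash fills after a successful =~ match.

```shell
#!/usr/bin/env bash
# Regex kept in a variable as above; the parentheses are an added capture group.
re='<<<(.+)>>>'

# Hypothetical helper (name is an assumption): for each line on stdin that
# matches, print what was captured between <<< and >>>.
extract_marked() {
    local line
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"
        fi
    done
}

captured=$(printf '%s\n' 'foobar' 'foo<<<>>>bar' 'foo<<<xyz>>>bar' | extract_marked)
printf '%s\n' "$captured"
```

Note that .+ is greedy, so on a line with several marker pairs the capture would run from the first <<< to the last >>>.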

Upvotes: 1
