user2189312

Reputation: 145

Multi-line pattern matching

I have some files with content like this:

file1:

AAA
BBB
CCC
123

file2:

AAA
BBB
123

I want to echo the filename only if the first 3 lines consist entirely of letters, which would be "file1" in the samples above. I'm merging the 3 lines into one and comparing the result to my regex [A-Z], but for some reason I can't get it to match.

My script:

file=file1    
if [[ $(head -3 $file|tr -d '\n'|sed 's/\r//g') == [A-Z] ]]; then
    echo "$file"
fi

I ran it with bash -x; this is the output:

+ file=file1
++ head -3 file1
++ tr -d '\n'
++ sed 's/\r//g'
+ [[ ASMUTCEDD == [A-Z] ]]
+exit

Upvotes: 1

Views: 62

Answers (3)

janos

Reputation: 124646

What you missed:

  • You can use grep to check that the input matches only [A-Z] characters (or indeed Bash's built-in regex matching, as @Barmar pointed out)
  • You can use the pipeline directly in the if statement, without [[ ... ]]

Like this:

file=file1    
if head -n 3 "$file" | tr -d '\n\r' | grep -qE '^[A-Z]+$'; then
    echo "$file"
fi
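
Since the question mentions several files, the same pipeline could be wrapped in a function and run in a loop. This is only a sketch: the function name first3_all_upper is made up for illustration, and the files are assumed to be passed as script arguments. LC_ALL=C is added on the assumption that [A-Z] should mean the plain ASCII range, since in some locales a range like this can also match other characters.

```shell
#!/bin/bash

# Hypothetical helper: succeeds if the first three lines of "$1",
# with newlines and carriage returns removed, consist only of A-Z.
first3_all_upper() {
    head -n 3 "$1" | tr -d '\n\r' | LC_ALL=C grep -qE '^[A-Z]+$'
}

# Print the name of every file given as an argument that passes the check.
for file in "$@"; do
    if first3_all_upper "$file"; then
        echo "$file"
    fi
done
```

Saved as, say, check.sh, it could be run as `bash check.sh file1 file2`, and with the sample files from the question it should print only file1.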

Upvotes: 1

Barmar

Reputation: 780984

To do regular expression matching you have to use =~, not ==. And the regular expression should be ^[A-Z]*$. Your regular expression matches if there's a letter anywhere in the string, not just if the string is entirely letters.

if [[ $(head -n 3 "$file" | tr -d '\n\r') =~ ^[A-Z]*$ ]]; then
    echo "$file"
fi
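
To see the difference, here is a small sketch using the string from the trace in the question. With ==, the right-hand side is a glob pattern, and [A-Z] as a glob matches exactly one character, so a nine-character string cannot match it. The variable names are just for illustration.

```shell
#!/bin/bash

s=ASMUTCEDD

# == does glob (pattern) matching: [A-Z] matches exactly one character
if [[ $s == [A-Z] ]]; then r_glob="match"; else r_glob="no match"; fi

# =~ does regex matching; unanchored, a single letter anywhere is enough
if [[ "A1" =~ [A-Z] ]]; then r_unanchored="match"; else r_unanchored="no match"; fi

# Anchored with ^ and $, the whole string has to be letters
if [[ "A1" =~ ^[A-Z]*$ ]]; then r_anchored="match"; else r_anchored="no match"; fi

echo "glob: $r_glob, unanchored: $r_unanchored, anchored: $r_anchored"
# prints: glob: no match, unanchored: match, anchored: no match
```

Note that * also accepts an empty string, so an empty file would pass; ^[A-Z]+$ would require at least one letter.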

Upvotes: 1

Yoda

Reputation: 445

You can use built-ins and character classes for this problem:

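#!/bin/bash

file="file1"
C=0
flag=0

# IFS= and -r keep each line exactly as it appears in the file
while IFS= read -r line
do
        (( ++C ))

        # Stop once the first three lines have been checked
        [ "$C" -eq 4 ] && break

        # Drop a trailing carriage return, in case the file has CRLF line endings
        line=${line%$'\r'}

        # The regex must be unquoted; a quoted pattern is matched as a literal string
        [[ "$line" =~ [^[:alpha:]] ]] && flag=1

done < "$file"

[ "$flag" -eq 0 ] && echo "$file"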
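
One thing to keep in mind with =~: in Bash 3.2 and later, quoting the right-hand side makes it match as a literal string rather than as a regex. A short sketch of the difference, with variable names chosen only for illustration:

```shell
#!/bin/bash

line="abc1"

# Unquoted: treated as a regex, so the non-letter "1" is found
if [[ $line =~ [^[:alpha:]] ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted: treated as the literal text [^[:alpha:]], which does not occur in the line
if [[ $line =~ '[^[:alpha:]]' ]]; then quoted="match"; else quoted="no match"; fi

echo "unquoted: $unquoted, quoted: $quoted"
```

A common way to avoid quoting problems is to store the pattern in a variable and use it unquoted, as in `re='[^[:alpha:]]'; [[ $line =~ $re ]]`.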

Upvotes: 0
