Reputation: 13

Regular expression not showing multiple line content

I have a file with the following format.

<hello>
<random1>
<random2>
....
....
....
<random100>
<bye>

I want to check whether both hello and bye are present, and whether bye appears below hello. I tried this regular expression:

grep "hello.*bye" filename

but it fails to match what I expected.

Upvotes: 0

Answers (5)

Ed Morton

Reputation: 204558

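With GNU awk, which supports a multi-character RS (RS='^$' never matches in a non-empty file, so the whole file is read as a single record):

awk -v RS='^$' '{print (/hello.*bye/ ? "y" : "n")}' filename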

Upvotes: 1

glenn jackman

Reputation: 247210

Perl:

perl -0777 -lne 'print (/hello.*bye/s ? "y" : "n")'

perl -0777 -ne 'exit(! /hello.*bye/s)'

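The -0777 option slurps the whole file as a single string. The "s" flag tells perl to allow "." to match a newline. The first command prints y or n; the second prints nothing and reports the result through its exit status (0 if the pattern matched).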

Upvotes: 1

Adrian Frühwirth

Reputation: 45686

$ cat file1.txt
<hello>
<bye>

$ awk '/<hello>/ {hello=1} /<bye>/&&hello {bye=1; exit} END {exit !(hello && bye)}' \
    file1.txt \
    && echo found || echo not found
found

$ cat file2.txt
<bye>
<hello>

$ awk '/<hello>/ {hello=1} /<bye>/&&hello {bye=1; exit} END {exit !(hello && bye)}' \
    file2.txt \
    && echo found || echo not found
not found

Upvotes: 1

devnull

Reputation: 123648

You could use pcregrep:

pcregrep -M 'hello(\n|.)*bye' filename

The -M option makes it possible to search for patterns that span line boundaries.

For your input, it'd produce:

<hello>
<random1>
<random2>
....
....
....
<random100>
<bye>

Upvotes: 2

mklement0

Reputation: 440337

If the input file is small enough, you can try:

grep "hello.*bye"  <(tr $'\n' ' ' < filename)

This replaces all newlines with spaces and thus turns the file contents into a single line that grep searches at once.

If you'd rather simply remove newlines, use:

grep "hello.*bye"  <(tr -d $'\n' < filename)
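
If only a yes/no answer is needed, the newline-to-space idea can also be written as a plain pipeline with grep -q, which does not rely on process substitution. This is a minimal sketch, not a tested replacement for the commands above; the function name hello_before_bye and the sample file are hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) if "hello" is followed by "bye"
# anywhere later in the file given as $1.
hello_before_bye() {
    # Join all lines into one, then let grep test that single line quietly.
    tr '\n' ' ' < "$1" | grep -q 'hello.*bye'
}

# Sample input in the same shape as the question's file.
sample=$(mktemp)
printf '<hello>\n<random1>\n<random2>\n<bye>\n' > "$sample"

if hello_before_bye "$sample"; then
    echo "found"
else
    echo "not found"
fi

rm -f "$sample"
```

Because grep -q prints nothing and only sets the exit status, the result can be used directly in an if statement or with && and ||.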

Upvotes: 1
