Reputation: 15970
I needed to find all the files that contained a specific string pattern. The first solution that comes to mind is using find piped with xargs grep:
find . -iname '*.py' | xargs grep -e 'YOUR_PATTERN'
But if I need to find patterns that span more than one line, I'm stuck, because vanilla grep can't match multiline patterns.
Upvotes: 175
Views: 171460
Reputation: 10698
As in Amit's earlier answer, you can use awk to search across multiple lines. If you need to print the line numbers, use the following:
awk '/Start pattern/,/End pattern/ {print NR ":" $0}' filename
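When several files are passed at once, NR keeps counting across files, so FILENAME and FNR (both standard awk variables) are probably what you want instead. A minimal sketch, assuming two throwaway sample files whose names and contents are made up for illustration:

```shell
# Create two illustrative sample files in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' 'x = 1' 'Start pattern' 'End pattern' > "$demo_dir/a.txt"
printf '%s\n' 'Start pattern' 'middle' 'End pattern' 'tail' > "$demo_dir/b.txt"

# Print each line of every Start..End range as file:line:text.
# FNR is the line number within the current file (NR would keep counting).
ranges=$(cd "$demo_dir" && awk '/Start pattern/,/End pattern/ {print FILENAME ":" FNR ":" $0}' a.txt b.txt)
printf '%s\n' "$ranges"
rm -rf "$demo_dir"
```

Note that a range left open at the end of one file carries over into the next file, so this works best when every start has a matching end.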
Upvotes: 0
Reputation: 166889
Using the ex/vi editor and the globstar option (syntax similar to awk and sed):
ex +"/aaa/,/bbb/p" -R -scq! file.txt
where aaa is your starting point, and bbb is your ending text.
To search recursively, try:
ex +"/aaa/,/bbb/p" -scq! **/*.py
Note: To enable the ** syntax in Bash 4 or later, run shopt -s globstar (zsh supports ** by default).
Upvotes: 2
Reputation: 171
You can use the grep alternative sift here (disclaimer: I am the author).
It supports multiline matching and limiting the search to specific file types out of the box:
sift -m --files '*.py' 'YOUR_PATTERN'
(search all *.py files for the specified multiline regex pattern)
It is available for all major operating systems. Take a look at the samples page to see how it can be used to extract multiline values from an XML file.
Upvotes: 4
Reputation: 51
@Marcin: awk example non-greedy:
awk '{if ($0 ~ /Start pattern/) {triggered=1;}if (triggered) {print; if ($0 ~ /End pattern/) { exit;}}}' filename
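To see what the exit buys you, here is a small sketch run against a made-up sample file that contains two Start/End blocks. Only the first block is printed, because awk stops reading as soon as the first End pattern is seen:

```shell
# Illustrative sample file with two separate Start..End blocks.
sample=$(mktemp)
printf '%s\n' 'Start pattern' 'one' 'End pattern' 'gap' \
              'Start pattern' 'two' 'End pattern' > "$sample"

# Same logic as the one-liner above: start printing at the first Start
# pattern and exit right after the first End pattern.
first_block=$(awk '{if ($0 ~ /Start pattern/) {triggered=1}
                    if (triggered) {print; if ($0 ~ /End pattern/) {exit}}}' "$sample")
printf '%s\n' "$first_block"
rm -f "$sample"
```

If you want every block rather than just the first, the plain range form /Start pattern/,/End pattern/ is the simpler choice.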
Upvotes: 5
Reputation: 2509
With silver searcher:
ag 'abc.*(\n|.)*efg'
Speed optimizations of silver searcher could possibly shine here.
Upvotes: 11
Reputation: 15970
Here is a more useful example:
pcregrep -Mi "<title>(.*\n){0,5}</title>" afile.html
It searches for the title tag in an HTML file, even if it spans up to 5 lines.
Here is an example of unlimited lines:
pcregrep -Mi "(?s)<title>.*</title>" example.html
Upvotes: 22
Reputation: 38532
grep -P also uses libpcre, but is much more widely installed. To find a complete title section of an HTML document, even if it spans multiple lines, you can use this:
grep -Pzo '(?s)<title>.*</title>' example.html
Since the PCRE project implements the Perl regex syntax, use the Perl documentation for reference.
Upvotes: 26
Reputation: 15970
So I discovered pcregrep, which stands for Perl Compatible Regular Expressions GREP.
The -M option makes it possible to search for patterns that span line boundaries.
For example, you need to find files where the '_name' variable is followed on the next line by the '_description' variable:
find . -iname '*.py' | xargs pcregrep -M '_name.*\n.*_description'
Tip: you need to include the line break character in your pattern. Depending on your platform, it could be '\n', '\r', '\r\n', ...
Upvotes: 109
Reputation: 10510
Here is an example using GNU grep:
grep -Pzo '_name.*\n.*_description'
-z/--null-data: Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline.
This has the effect of treating the whole file as one large line.
See the -z description in grep's manual and also common question no. 14 on the grep manual's usage page.
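To tie this back to the original find | xargs pipeline, here is a rough sketch that lists only the names of matching files. It assumes GNU grep built with PCRE support; the sample files are invented for the demo:

```shell
# Two illustrative files: only match.py has _description on the line
# directly after _name.
demo_dir=$(mktemp -d)
printf '%s\n' '_name = "x"' '_description = "y"' > "$demo_dir/match.py"
printf '%s\n' '_name = "x"' 'other = 1' '_description = "y"' > "$demo_dir/nomatch.py"

# -print0 / -0 keep file names containing spaces intact.
# -l prints just the names of files that contain a match.
matches=$(cd "$demo_dir" && find . -iname '*.py' -print0 |
          xargs -0 grep -Pzl '_name.*\n.*_description')
printf '%s\n' "$matches"
rm -rf "$demo_dir"
```

Since . does not match a newline by default in PCRE, the pattern only matches when the two variables sit on adjacent lines.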
Upvotes: 129
Reputation: 3447
Why don't you go for awk:
awk '/Start pattern/,/End pattern/' filename
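The range form prints the matching lines. If, as in the question, you only want the names of the files that contain a complete start-to-end block, something along these lines might do. This is only a sketch: list_range_files is a hypothetical helper name, the sample files are invented, and it requires the end pattern to appear on a later line than the start pattern:

```shell
# Hypothetical helper: print each file in which a line matching the start
# regex is followed, on a later line, by a line matching the end regex.
list_range_files() (
  start_re=$1
  end_re=$2
  shift 2
  awk -v s="$start_re" -v e="$end_re" '
    FNR == 1 { in_range = 0; found = 0 }          # new file: reset state
    found { next }                                # file already reported
    !in_range && $0 ~ s { in_range = 1; next }    # range opens here
    in_range && $0 ~ e { print FILENAME; found = 1 }
  ' "$@"
)

demo_dir=$(mktemp -d)
printf '%s\n' 'Start pattern' 'body' 'End pattern' > "$demo_dir/a.txt"
printf '%s\n' 'Start pattern' 'never closed'       > "$demo_dir/b.txt"
printf '%s\n' 'End pattern' 'Start pattern'        > "$demo_dir/c.txt"

found_files=$(cd "$demo_dir" && list_range_files 'Start pattern' 'End pattern' a.txt b.txt c.txt)
printf '%s\n' "$found_files"
rm -rf "$demo_dir"
```

Only a.txt is reported here: b.txt never closes the range, and in c.txt the end pattern comes before the start pattern.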
Upvotes: 123
Reputation: 227
I believe the following should work. It has the advantage of using only extended regular expressions, without the need to install an extra tool like pcregrep if you don't have it yet or don't have the -P option to grep available (e.g. macOS):
egrep -irzo ".*aaa(.*\s.*){1,}.*bbb.*" path_to_filenames
Caveat emptor: this does have some slight disadvantages:
It matches greedily, from the first aaa to the last bbb in each file, unless...
... aaa [stuff] bbb pattern in each file.
Upvotes: 0
Reputation: 12970
This answer might be useful:
Regex (grep) for multi-line search needed
To search recursively, you can use the flags -R (recursive) and --include (glob pattern). See:
Use grep --exclude/--include syntax to not grep through certain files
Upvotes: 5