hearsaxas

Reputation: 3021

Match two strings in one line with grep

I am trying to use grep to match lines that contain two different strings. I have tried the following, but it matches lines that contain either string1 or string2, which is not what I want.

grep 'string1\|string2' filename

So how do I match with grep only the lines that contain both strings?

Upvotes: 301

Views: 499643

Answers (24)

dheerosaur

Reputation: 15172

You can use

grep 'string1' filename | grep 'string2'

This searches for string1 followed by string2 on the same line, or string2 followed by string1 on the same line; it does not answer the question:

grep 'string1.*string2\|string2.*string1' filename
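To see how the two commands behave, here is a minimal sketch on a made-up sample file. The file contents and the names tmpdir, both_pipe and both_alt are assumptions for illustration only, not part of the question.

```shell
#!/bin/sh
# Hypothetical sample data: only the last two lines contain both strings.
tmpdir=$(mktemp -d)
printf '%s\n' \
    'string1 only' \
    'string2 only' \
    'string1 then string2' \
    'string2 then string1' > "$tmpdir/sample.txt"

# AND via a pipe: the second grep filters the output of the first.
both_pipe=$(grep 'string1' "$tmpdir/sample.txt" | grep 'string2' || true)

# AND via alternation of both orderings (extended regex syntax).
both_alt=$(grep -E 'string1.*string2|string2.*string1' "$tmpdir/sample.txt" || true)

printf 'pipe:\n%s\nalternation:\n%s\n' "$both_pipe" "$both_alt"
rm -rf "$tmpdir"
```

On this input both forms select the same two lines. They can differ when the two strings overlap inside a line, because the alternation requires one string to end before the other begins.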

Upvotes: 267

user45949

Reputation: 2580

This searches for string1 OR string2 in filename:

grep -E "string1|string2" filename

This searches for lines where string1 is followed by string2 on the same line, or string2 is followed by string1 on the same line, in filename:

grep 'string1.*string2\|string2.*string1' filename

Note that neither of these answers the question.

Upvotes: 225

Gayan Weerakutti

Reputation: 13831

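A simpler command that greps for each string in turn (it prints the lines matching phrase_1 and then, if there were any, the lines matching phrase_2, rather than only the lines containing both):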

(cat file | grep 'phrase_1') && (cat file | grep 'phrase_2')

Upvotes: 0

nextloop

Reputation: 342

Search for lines containing both strings and highlight only string1 and string2:

grep -E 'string1.*string2|string2.*string1' filename | grep -E 'string1|string2'

or

grep 'string1.*string2\|string2.*string1' filename | grep 'string1\|string2'

Upvotes: 0

Md Raihan Ahmed

Reputation: 81

If the files are tracked in a Git repository, it is better to use git grep because it is very fast and searches the whole directory tree.

git grep 'string1.*string2.*string3'

Upvotes: 0

eQ19

Reputation: 10711

When both strings appear in a known order, put a pattern between them in the grep command:

$ grep -E "string1.*string2" file

For example, if a file named Dockerfile contains the following lines:

FROM python:3.8 as build-python
FROM python:3.8-slim

To get the line that contains both FROM python and as build-python, use:

$ grep -E "FROM python:.* as build-python" Dockerfile

The output will then show only the line that contains both strings:

FROM python:3.8 as build-python

Upvotes: 0

user6536435

Reputation:

grep 'string1\|string2' FILENAME

GNU grep version 3.1

Upvotes: 3

kenorb

Reputation: 166843

ripgrep

Here is an example using rg:

rg -N '(?P<p1>.*string1.*)(?P<p2>.*string2.*)' file.txt

It's one of the quickest grepping tools, since it's built on top of Rust's regex engine which uses finite automata, SIMD and aggressive literal optimizations to make searching very fast.

Use it, especially when you're working with large amounts of data.

See also related feature request at GH-875.

Upvotes: -1

kenorb

Reputation: 166843

git grep

Here is the syntax using git grep with multiple patterns:

git grep --all-match --no-index -l -e string1 -e string2 -e string3 file

You may also combine patterns with Boolean expressions such as --and, --or and --not.

Check man git-grep for help.


--all-match When giving multiple pattern expressions, this flag is specified to limit the match to files that have lines to match all of them.

--no-index Search files in the current directory that is not managed by Git.

-l/--files-with-matches/--name-only Show only the names of files.

-e The next parameter is the pattern. Default is to use basic regexp.

Other params to consider:

--threads Number of grep worker threads to use.

-q/--quiet/--silent Do not output matched lines; exit with status 0 when there is a match.

To change the pattern type, you may also use -G/--basic-regexp (default), -F/--fixed-strings, -E/--extended-regexp, -P/--perl-regexp, -f file, and others.
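For matching both strings on the same line, which is what the question asks for, the --and operator mentioned above can be sketched as follows. This is a minimal example, assuming git is installed; the scratch directory, the sample file and the variable line_hits are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample file in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
    'string1 only' \
    'string1 and string2' \
    'string2 only' > "$tmpdir/sample.txt"

# --and requires both patterns to match on the same line.
# -h suppresses the file name prefix; --no-index allows searching
# files that are not managed by Git.
line_hits=$(cd "$tmpdir" && git grep --no-index -h -e string1 --and -e string2 -- sample.txt || true)

printf '%s\n' "$line_hits"
rm -rf "$tmpdir"
```

Unlike --all-match, which works at file level, --and combines the patterns per line.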

Upvotes: 3

Ed Morton

Reputation: 204488

Don't try to use grep for this, use awk instead. To match 2 regexps R1 and R2 in grep you'd think it would be:

grep -E 'R1.*R2|R2.*R1'

while in awk it'd be:

awk '/R1/ && /R2/'

But what if R2 overlaps with or is a subset of R1? That grep command simply would not work, while the awk command would. Let's say you want to find lines that contain "the" and "heat":

$ echo 'theatre' | grep -E 'the.*heat|heat.*the'
$ echo 'theatre' | awk '/the/ && /heat/'
theatre

You'd have to use 2 greps and a pipe for that:

$ echo 'theatre' | grep 'the' | grep 'heat'
theatre

And of course, if you actually required them to be separate, you can always write the same regexp in awk as you used in grep, and there are alternative awk solutions that don't involve repeating the regexps in every possible sequence.

Putting that aside, what if you wanted to extend your solution to match 3 regexps R1, R2, and R3. In grep that'd be one of these poor choices:

grep -E 'R1.*R2.*R3|R1.*R3.*R2|R2.*R1.*R3|R2.*R3.*R1|R3.*R1.*R2|R3.*R2.*R1' file
grep R1 file | grep R2 | grep R3

while in awk it'd be the concise, obvious, simple, efficient:

awk '/R1/ && /R2/ && /R3/'

Now, what if you actually wanted to match literal strings S1 and S2 instead of regexps R1 and R2? You simply can't do that in one call to grep, you have to either write code to escape all RE metachars before calling grep:

S1=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< 'R1')
S2=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< 'R2')
grep -E "$S1.*$S2|$S2.*$S1" file

or again use 2 greps and a pipe:

grep -F 'S1' file | grep -F 'S2'

which again are poor choices whereas with awk you simply use a string operator instead of regexp operator:

awk 'index($0,S1) && index($0,S2)'
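In the index() one-liner, S1 and S2 stand for the literal strings. One way to supply them is with -v, as in this minimal sketch; the sample lines and the names s1, s2 and literal_hits are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical input in which "." and "[" must be treated literally.
tmpdir=$(mktemp -d)
printf '%s\n' \
    'price is 1.50 [sale]' \
    'price is 1x50 [sale]' \
    'price is 1.50' > "$tmpdir/prices.txt"

# index() does plain substring search, so no regex escaping is needed.
# Caveat: awk still interprets backslash escapes in -v assignments, so
# strings containing backslashes need a different mechanism.
literal_hits=$(awk -v s1='1.50' -v s2='[sale]' \
    'index($0, s1) && index($0, s2)' "$tmpdir/prices.txt")

printf '%s\n' "$literal_hits"
rm -rf "$tmpdir"
```

Only the first line is printed: the second has "1x50", which a regex "1.50" would have matched, and the third lacks "[sale]".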

Now, what if you wanted to match 2 regexps in a paragraph rather than a line? Can't be done in grep, trivial in awk:

awk -v RS='' '/R1/ && /R2/'
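As a small illustration of paragraph mode, here is a sketch with made-up input in which blank lines separate paragraphs. The words alpha, beta and gamma and the variable paragraph_hits are placeholders, not from the answer.

```shell
#!/bin/sh
# Two paragraphs separated by a blank line; only the second one
# contains both "alpha" and "gamma", and they sit on different lines.
paragraph_hits=$(printf 'alpha\nbeta\n\nalpha\ngamma\n' |
    awk -v RS='' '/alpha/ && /gamma/')

printf '%s\n' "$paragraph_hits"
```

With RS set to the empty string, each blank-line-separated paragraph is one record, so the two patterns may match on different lines of the same paragraph.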

How about across a whole file? Again can't be done in grep and trivial in awk (this time I'm using GNU awk for multi-char RS for conciseness but it's not much more code in any awk or you can pick a control-char you know won't be in the input for the RS to do the same):

awk -v RS='^$' '/R1/ && /R2/'

So - if you want to find multiple regexps or strings in a line or paragraph or file then don't use grep, use awk.

Upvotes: 9

Saurabh

Reputation: 7833

grep -i -w 'string1\|string2' filename

This matches whole words only (-w) and ignores case (-i).

Upvotes: 1

tink

Reputation: 15238

And since people have suggested perl, python, and convoluted shell scripts, here is a simple awk approach:

awk '/string1/ && /string2/' filename

Having looked at the comments to the accepted answer: no, this doesn't do multi-line; but then that's also not what the author of the question asked for.

Upvotes: 7

Amit Singh

Reputation: 41

Let's say we need to count the lines in a file testfile that contain any of several words. There are two ways to go about it:

1) Use grep command with regex matching pattern

grep -c '\<\(DOG\|CAT\)\>' testfile

2) Use egrep command

egrep -c 'DOG|CAT' testfile 

With egrep you don't need to escape anything in the expression; just separate the words with a pipe character.

Upvotes: 2

ruanhao

Reputation: 4922

I often run into the same problem as yours, and I just wrote a piece of script:

function m() { # m means 'multi pattern grep'

    function _usage() {
        echo "usage: COMMAND [-inH] -p<pattern1> -p<pattern2> <filename>"
        echo "-i : ignore case"
        echo "-n : show line number"
        echo "-H : show filename"
        echo "-h : show header"
        echo "-p : specify pattern"
    }

    declare -a patterns
    # it is important to declare OPTIND as local
    local ignorecase_flag filename linum header_flag colon result OPTIND

    while getopts "iHhnp:" opt; do
        case $opt in
            i)
                ignorecase_flag=true ;;
            H)
                filename="FILENAME," ;;
            n)
                linum="NR," ;;
            p)
                patterns+=( "$OPTARG" ) ;;
            h)
                header_flag=true ;;
            \?)
                _usage
                return ;;
        esac
    done

    # print a ":" separator only when a filename or line number precedes the line
    if [[ -n $filename || -n $linum ]]; then
        colon="\":\","
    fi

    shift $(( $OPTIND - 1 ))

    # build an awk condition that ANDs all patterns together
    if [[ $ignorecase_flag == true ]]; then
        for s in "${patterns[@]}"; do
            result+=" && s~/${s,,}/"
        done
        result=${result# && }
        result="{s=tolower(\$0)} $result"
    else
        for s in "${patterns[@]}"; do
            result="$result && /$s/"
        done
        result=${result# && }
    fi

    result+=" { print "$filename$linum$colon"\$0 }"

    if [[ ! -t 0 ]]; then       # pipe case
        cat - | awk "${result}"
    else
        for f in "$@"; do
            [[ $header_flag == true ]] && echo "########## $f ##########"
            awk "${result}" "$f"
        done
    fi
}

Usage:

echo "a b c" | m -p A 
echo "a b c" | m -i -p A # a b c

You can put it in .bashrc if you like.

Upvotes: 0

James

Reputation: 11

grep -E '(string1.*string2|string2.*string1)' filename

This will get lines with string1 and string2 in any order.

Upvotes: 1

Cristian

Reputation: 578

Find lines that start with 6 spaces and end with one of the following:

# first pattern: .c or .cpp or .h or .log or .out
# second pattern: a number of 5 to 9 digits
cat my_file.txt | grep -E \
 -e '^      .*(\.c$|\.cpp$|\.h$|\.log$|\.out$)' \
 -e '^      .*[0-9]{5,9}$' \
 > nolog.txt

Upvotes: 2

Tim Seed

Reputation: 5289

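Place the patterns you want to grep for into a file, one per line (quote any pattern that contains shell metacharacters such as braces):

echo who    > find.txt
echo Roger >> find.txt
echo '[44][0-9]{9,}' >> find.txt

Then search using -f, with -E so that {9,} is treated as a repetition count. Note that this matches lines containing any of the patterns:

grep -E -f find.txt BIG_FILE_TO_SEARCH.txt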
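Because -f selects a line as soon as any one of the listed patterns matches, getting "all patterns" semantics from a pattern file takes a loop. The following is a minimal sketch under that assumption; and_grep is a hypothetical helper and the file names and contents are made up.

```shell
#!/bin/sh
# and_grep PATTERN_FILE FILE (hypothetical helper): print the lines of
# FILE that match every extended regex listed in PATTERN_FILE.
and_grep() {
    pattern_file=$1
    input_file=$2
    remaining=$(cat "$input_file")
    while IFS= read -r pattern; do
        # Each pass keeps only the lines that also match this pattern.
        remaining=$(printf '%s\n' "$remaining" | grep -E -e "$pattern" || true)
    done < "$pattern_file"
    if [ -n "$remaining" ]; then
        printf '%s\n' "$remaining"
    fi
}

tmpdir=$(mktemp -d)
printf '%s\n' 'who' 'Roger' > "$tmpdir/find.txt"
printf '%s\n' 'who is Roger' 'who is Alice' 'Roger that' > "$tmpdir/big.txt"

# -f alone counts every line matching either pattern.
any_count=$(grep -c -f "$tmpdir/find.txt" "$tmpdir/big.txt")
# The helper keeps only the lines matching both.
all_hits=$(and_grep "$tmpdir/find.txt" "$tmpdir/big.txt")

printf 'any: %s lines\nall: %s\n' "$any_count" "$all_hits"
rm -rf "$tmpdir"
```

The helper reads the whole input into a shell variable, so it is only suitable for modestly sized files; for large inputs a chain of greps or a single awk program would be a better fit.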

Upvotes: 1

Raghuram

Reputation: 3967

You should use grep like this:

$ grep 'string1' file | grep 'string2'

Upvotes: 0

Leo

Reputation: 2326

Your method was almost good, only missing the -w

grep -w 'string1\|string2' filename

Upvotes: 14

Kinjal Dixit

Reputation: 7945

To search for files containing all the words in any order anywhere:

grep -ril \'action\' | xargs grep -il \'model\' | xargs grep -il \'view_type\'

The first grep kicks off a recursive search (r), ignoring case (i) and listing (printing out) the name of the files that are matching (l) for one term ('action' with the single quotes) occurring anywhere in the file.

The subsequent greps search for the other terms, retaining case insensitivity and listing out the matching files.

The final list of files that you get will be the ones that contain all of these terms, in any order, anywhere in the file.
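The chain can be tried on a scratch directory. This is a minimal sketch, assuming a grep that supports -r and file names without spaces or newlines (xargs splits on whitespace); the directory, the file names and the variable files_with_all are made up, and the terms are searched without the literal quotes.

```shell
#!/bin/sh
# Hypothetical scratch directory: only all.txt contains all three terms.
tmpdir=$(mktemp -d)
printf 'Action\nmodel\nview_type\n' > "$tmpdir/all.txt"
printf 'action\nmodel\n' > "$tmpdir/some.txt"

# Each stage narrows the list of file names produced by the previous one.
files_with_all=$(grep -ril 'action' "$tmpdir" |
    xargs grep -il 'model' |
    xargs grep -il 'view_type')

printf '%s\n' "$files_with_all"
rm -rf "$tmpdir"
```

Note that this works per file, not per line: the three terms may appear on different lines of all.txt.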

Upvotes: 32

Aquarius Power

Reputation: 3985

For a multiline match:

echo -e "test1\ntest2\ntest3" |tr -d '\n' |grep "test1.*test3"

or

echo -e "test1\ntest5\ntest3" >tst.txt
cat tst.txt |tr -d '\n' |grep "test1.*test3\|test3.*test1"

We just need to remove the newline characters and it works!

Upvotes: 0

tchrist

Reputation: 80443

If you have a grep with a -P option for a limited perl regex, you can use

grep -P '(?=.*string1)(?=.*string2)'

which has the advantage of working with overlapping strings. It's somewhat more straightforward using perl as grep, because you can specify the AND logic more directly:

perl -ne 'print if /string1/ && /string2/'
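To illustrate what "overlapping" means here, consider this minimal sketch. The word "together" and the variable names are made up for illustration, and the last command assumes a grep built with -P support (its result is left empty otherwise).

```shell
#!/bin/sh
# "together" contains both "get" and "the", but they share the middle "t".
word='together'

# Alternation needs one string to end before the other starts, so it finds nothing.
overlap_alt=$(printf '%s\n' "$word" | grep -E 'get.*the|the.*get' || true)

# Two independent tests on the same line, so the overlap does not matter.
overlap_perl=$(printf '%s\n' "$word" | perl -ne 'print if /get/ && /the/' || true)

# Lookaheads are also tested independently of each other.
overlap_pcre=$(printf '%s\n' "$word" | grep -P '(?=.*get)(?=.*the)' 2>/dev/null || true)

printf 'alternation: [%s]\nperl: [%s]\ngrep -P: [%s]\n' \
    "$overlap_alt" "$overlap_perl" "$overlap_pcre"
```

The alternation result is empty, while the perl form and, where available, the lookahead form both report the line.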

Upvotes: 26

Dorn

Reputation: 185

You could try something like this:

(pattern1.*pattern2|pattern2.*pattern1)

Upvotes: 7

martineno

Reputation: 2635

The | operator in a regular expression means or. That is to say either string1 or string2 will match. You could do:

grep 'string1' filename | grep 'string2'

which will pipe the results from the first command into the second grep. That should give you only lines that match both.

Upvotes: 6
