gdeniz

Reputation: 177

How to make "grep" output the complete word that includes the match?

I would like grep to print out all complete words that include the match.

Google did not help me. Here is what I tried:

cat file.txt
21676   Mm.24685    NM_009346   ENSMUSG00000055320
20349   Mm.134093   NM_011348   ENSMUSG00000063531
12456   Mm.134000   NM_011228   GM415666

grep -o "ENSMUS" file.txt
ENSMUS
ENSMUS

Desired output:

ENSMUSG00000055320
ENSMUSG00000063531

Thanks for your help!

Upvotes: 2

Views: 79

Answers (2)

Timur Shtatland

Reputation: 12347

To extract ENSEMBL mouse accession numbers without the version number:

grep -Po 'ENSMUS\w+' in_file

With the version number:

grep -Po 'ENSMUS\S+' in_file

Here,
\w+ : 1 or more word characters ([A-Za-z0-9_]).
\S+ : 1 or more non-whitespace characters (you can also be more restrictive and use [\w.]+, which is 1 or more word characters or literal dots).
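
To make the difference between the three patterns concrete, here is a small sketch. It assumes GNU grep built with -P support, and the ".5" version suffix on the accession is made up for illustration (the file in the question has no versioned IDs).

```shell
# Hypothetical input line: the ".5" version suffix is invented for this example.
line='21676   Mm.24685    NM_009346   ENSMUSG00000055320.5'

# \w+ stops at the dot, so the version suffix is dropped.
without_version=$(printf '%s\n' "$line" | grep -Po 'ENSMUS\w+')

# \S+ runs to the next whitespace, so the suffix is kept.
with_version=$(printf '%s\n' "$line" | grep -Po 'ENSMUS\S+')

# [\w.]+ also keeps the suffix, but allows only word characters and dots.
restricted=$(printf '%s\n' "$line" | grep -Po 'ENSMUS[\w.]+')

printf '%s\n' "$without_version" "$with_version" "$restricted"
```

This should print the bare accession first, followed by the versioned accession twice.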

Here, GNU grep uses the following options:
-P : Use Perl-compatible regexes (PCRE).
-o : Print only the matched parts, each match on its own output line, not the entire lines.
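
As a quick check of how -o behaves when one input line holds several matches, here is a minimal sketch. The input line is hypothetical (it puts both accessions from the question on one line), and it again assumes GNU grep with -P support.

```shell
# Hypothetical line containing two accessions side by side.
two_ids='ENSMUSG00000055320 ENSMUSG00000063531'

# With -o, each match is printed on its own output line.
matches=$(printf '%s\n' "$two_ids" | grep -Po 'ENSMUS\w+')

printf '%s\n' "$matches"
```

A single input line therefore yields two output lines here.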

SEE ALSO:
grep manual
perlre - Perl regular expressions

Upvotes: 1

anubhava

Reputation: 785316

You may use:

grep -wo "ENSMUS[^[:blank:]]*" file.txt
ENSMUSG00000055320
ENSMUSG00000063531

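Here [^[:blank:]]* will match 0 or more characters that are not a space or a tab. -w ensures that only whole-word matches are returned, i.e. the match must not be directly preceded or followed by a word character.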
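
The effect of -w is easiest to see on a token that merely contains the pattern. The sketch below uses a hypothetical input in which the second token has a made-up "pre" prefix; it does not need -P, only the -o and -w options.

```shell
# Hypothetical input: the second token embeds the pattern after a "pre" prefix.
sample='ENSMUSG00000055320 preENSMUSG00000063531'

# Without -w, the pattern also matches in the middle of a token.
no_w=$(printf '%s\n' "$sample" | grep -o 'ENSMUS[^[:blank:]]*')

# With -w, a match preceded by a word character is rejected.
with_w=$(printf '%s\n' "$sample" | grep -wo 'ENSMUS[^[:blank:]]*')

printf 'without -w:\n%s\n\nwith -w:\n%s\n' "$no_w" "$with_w"
```

Without -w both accessions are printed; with -w only the first one is, because the second is preceded by the letter "e".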

Upvotes: 1
