Vno

Reputation: 71

Using grep to find words that have a certain prefix and suffix

I'm trying to find the words in a file that start with the letters air and end with the letters ne, and to print the matching words into a new file called "excluded". I'm very new to the command line, so I'm a bit lost. I've read the manual and cannot find a solution.

I was thinking something along the lines of

grep "air" | "ne" textfile.txt

but obviously it's not working out.

Edit: I think I can use the ^ and $ operators to find letters at the beginning and end of a word, but I'm unsure how to combine them into one command so that I can simply send the output to a new file.

Upvotes: 6

Views: 20518

Answers (2)

yarl

Reputation: 161

grep -o '\bair[^[:space:]]*ne\b' textfile | sort | uniq > excluded

From the man page, the -o flag means: "Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line."

The pattern is composed as follows: match a word edge (\b), then the string 'air', then any number of non-space characters, then the string 'ne', then the other word edge.

Then we sort the output so that uniq can drop the duplicates, since uniq only removes adjacent repeated lines (sort -u would do both in one step).

The idea is that a word is a word edge, followed by multiple non-space characters, followed by another word edge.

This is not perfect, because it also matches strings containing characters that are usually not part of words, such as "airfoo_ne" or "air.barne", but you can refine it once you get the idea.
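One possible refinement, as a minimal sketch: replace [^[:space:]] with [[:alpha:]] so that only letters are allowed between 'air' and 'ne'. This assumes GNU grep (\b is an extension, not POSIX), and the file name sample.txt and its contents are made up for illustration.

```shell
# Hypothetical sample input; sample.txt and its contents are assumptions.
printf '%s\n' \
  'the airplane and the airline left' \
  'air.barne airfoo_ne airborne' > sample.txt

# [[:alpha:]]* permits only letters between "air" and "ne",
# so "air.barne" and "airfoo_ne" are no longer matched.
# sort -u sorts and removes duplicates in one step.
grep -oE '\bair[[:alpha:]]*ne\b' sample.txt | sort -u > excluded

cat excluded
```

With this input, only airborne, airline and airplane should end up in excluded.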

Upvotes: 2

Anthony C. Nguyen

Reputation: 76

In order to print the words into a new file, you'll want to use the ">" operator to send the output of grep into a file, so the command would be:

grep '^air.*ne$' textfile.txt > excluded.txt

or, if you prefer to use pipes, something along the lines of:

cat textfile.txt | grep '^air.*ne$' > excluded.txt

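would also work. Of course, this assumes that you're in the folder containing textfile.txt, and that each word is on its own line, since ^ and $ anchor to the start and end of a line rather than of a word.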

For test data

airkinglyne\nairlamne\nhelloworld\nairfatne

the output is:

airkinglyne\nairlamne\nairfatne
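To try this yourself, here is a small sketch that builds the test file with printf (which turns each \n into a real newline) and then runs the command above. It writes textfile.txt and excluded.txt in the current directory, so it assumes you are fine with overwriting any files with those names.

```shell
# Create the test data, one word per line.
printf 'airkinglyne\nairlamne\nhelloworld\nairfatne\n' > textfile.txt

# Keep only lines that start with "air" and end with "ne".
grep '^air.*ne$' textfile.txt > excluded.txt

# Show the result: the three air...ne words, without helloworld.
cat excluded.txt
```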

Upvotes: 5
