Reputation: 71
I'm trying to figure out how to find words in a file that start with the letters air and end with the letters ne. I'd like to print the words that match into a new file called "excluded". I'm very new to the command line, so I'm a bit lost. I've read the manual and cannot find a solution.
I was thinking something along the lines of
grep "air" | "ne" textfile.txt
but obviously it's not working out.
Edit: I think I can use the ^ and $ operators to find letters at the beginning and end of a word; however, I'm unsure how to make it one command so I can simply put the output into a new file.
Upvotes: 6
Views: 20518
Reputation: 161
grep -o '\bair[^[:space:]]*ne\b' textfile | sort | uniq > excluded
From the man page, the -o flag will "Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line."
The pattern is composed as follows: match a word edge (\b), then the string 'air', then any number of non-space characters, then the string 'ne', then the other word edge.
Then we sort so that uniq can drop the duplicates, since uniq only removes adjacent repeated lines (sort -u would do both in one step).
The idea is that a word is a word edge followed by multiple non-space characters followed by another word edge.
This is not perfect because it matches characters that are usually not parts of words like "airfoo_ne", "air.barne", etc, but you can improve it once you get the idea.
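One possible improvement, as a minimal sketch: restrict the middle of the word to letters with [[:alpha:]] so that strings such as "air.barne" no longer match. This assumes GNU grep (both -o and \b are extensions, not POSIX), and the sample file and its contents are made up purely for illustration.

```shell
# Hypothetical sample input, created only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'the airplane and the airline' 'air.barne airfoo_ne airplane' > "$tmpdir/textfile"

# [[:alpha:]]* allows only letters between "air" and "ne";
# sort -u sorts and removes duplicates in one step.
grep -o '\bair[[:alpha:]]*ne\b' "$tmpdir/textfile" | sort -u > "$tmpdir/excluded"

cat "$tmpdir/excluded"
```

With this input the file ends up holding airline and airplane, one per line; "air.barne" and "airfoo_ne" are skipped, whereas the original [^[:space:]]* pattern would have kept both.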
Upvotes: 2
Reputation: 76
In order to print the words into a new file, you'll want to use the ">" operator to send the output of grep into a file, so the command would be:
grep '^air.*ne$' textfile.txt > excluded.txt
or, if you prefer to use pipes, something along the lines of:
cat textfile.txt | grep '^air.*ne$' > excluded.txt
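would also work. Of course, this assumes that you're in the folder containing textfile.txt, and that each word sits on its own line, because ^ and $ anchor to the start and end of a line rather than of a word.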
For test data
airkinglyne\nairlamne\nhelloworld\nairfatne
the output is:
airkinglyne\nairlamne\nairfatne
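To try this end to end, here is a small sketch that writes the test data above to a file with printf (which turns each \n into a real newline) and then runs the same grep. The scratch directory is an assumption made for the example; the file names are the ones used in this answer.

```shell
# Scratch directory so the example does not touch existing files.
workdir=$(mktemp -d)

# One word per line: printf expands each \n into a newline.
printf 'airkinglyne\nairlamne\nhelloworld\nairfatne\n' > "$workdir/textfile.txt"

# ^ and $ anchor the match to the start and end of each line.
grep '^air.*ne$' "$workdir/textfile.txt" > "$workdir/excluded.txt"

cat "$workdir/excluded.txt"
```

Only helloworld is dropped. If the input had several words per line, this pattern would match whole lines instead of individual words.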
Upvotes: 5