Reputation: 91
I have seen commands such as using sed to remove lines based on number of characters but not words.
E.g., I have a text file such as:
word1
word1 word2
word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5
How would I use sed or awk to remove the lines with fewer than 3 words, so that the output looks like:
word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5
Upvotes: 2
Views: 3778
Reputation: 5092
You can try this sed command:
sed -n 's/\([^ ]\+ \)\{2,\}/&/p' file_name
[^ ]\+ - matches one or more non-space characters, i.e. a word
\{2,\} - matches the preceding group two or more times
\([^ ]\+ \) - matches a word followed by a space
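As a rough sketch of the same idea, the command can also be written as a plain address match, with \{1,\} in place of \+ (which is a GNU extension to basic regular expressions). The variable names and sample input here are made up for illustration:

```shell
# Hypothetical sample input (not from a real file).
input='word1
word1 word2
word1 word2 word3
word1 word2 word3 word4'

# Print only lines containing at least two "word followed by a space" groups.
# \{1,\} is the POSIX spelling of "one or more".
result=$(printf '%s\n' "$input" | sed -n '/\([^ ]\{1,\} \)\{2,\}/p')
printf '%s\n' "$result"
```

Note that this pattern counts words by the space that follows them, so a two-word line with a trailing space would also be printed.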
Upvotes: 1
Reputation: 10039
sed -n '/[^ ]\([^ ]* \{1,\}[^ ]\)\{2\}/p' YourFile
# or
sed -n '/[^ ] \{1,\}[^ ][^ ]* \{1,\}[^ ]/p' YourFile
The regex reads: at least 1 non-space, then at least 1 space, then at least 1 non-space, then at least 1 space, then at least 1 non-space.
This ensures that a line such as " word1 word2 " is not counted as three words: leading or trailing spaces are not treated as word separators when there is no word on the other side of them.
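A small check of that edge case, assuming "at least 1 space" is spelled as the interval " \{1,\}". The sample lines are invented for illustration:

```shell
# Line 1: two words padded with spaces (should be dropped).
# Line 2: three words separated by runs of spaces (should be kept).
# Line 3: a single word of three characters (should be dropped).
result=$(printf '%s\n' ' word1 word2 ' 'word1   word2   word3' 'abc' |
  sed -n '/[^ ]\([^ ]* \{1,\}[^ ]\)\{2\}/p')
printf '%s\n' "$result"
```

Only the second line should survive, because it is the only one where a non-space character appears on both sides of two separate runs of spaces.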
Upvotes: 1
Reputation: 174786
You could do this simply in awk:
$ awk 'NF>=3' file
word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5
It prints the lines which have three or more fields.
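With awk's default field splitting, NF counts fields separated by any run of blanks or tabs and ignores leading and trailing whitespace, so messy spacing needs no special handling. A quick sketch with invented sample lines:

```shell
# Line 1: two words with surrounding spaces (NF is 2, dropped).
# Line 2: three tab-separated words (NF is 3, kept).
# Line 3: four words separated by runs of spaces (NF is 4, kept).
result=$(printf '  word1 word2  \nword1\tword2\tword3\nword1   word2   word3   word4\n' |
  awk 'NF >= 3')
printf '%s\n' "$result"
```

The kept lines are printed unchanged, with their original tabs and spaces intact.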
Upvotes: 6
Reputation: 41460
Here is how to do it with awk. If a line has more than 2 fields, print it:
awk 'NF>2' file
word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5
Upvotes: 4