Reputation: 11
I have a text file with the following contents:
good,bad,ugly
good,good,ugly
good,good,good,bad,ugly
good,bad,bad
bad,bad,bad,bad,good
bad,ugly,good
bad,good,bad
good,good,good,good,bad
ugly,bad,good
bad,bad,bad,good,ugly
I only want to list lines that have exactly one occurrence each of ugly and bad. Any line with multiple bads needs to be excluded. The desired output is:
good,bad,ugly
good,good,good,bad,ugly
bad,ugly,good
ugly,bad,good
I've tried to use the following, but it still lists lines with multiple bads.
grep -E "bad|ugly" file.txt | grep -v "\('bad'\).*\1"
Upvotes: -1
Views: 85
Reputation: 204416
grep isn't the best choice for data that contains fields, or whenever your requirements have multiple conditions or arithmetic to test. Using any awk:
$ awk -F, '
{ delete cnt; for (i=1; i<=NF; i++) cnt[$i]++ }
(cnt["ugly"] == 1) && (cnt["bad"] == 1)
' file
good,bad,ugly
good,good,good,bad,ugly
bad,ugly,good
ugly,bad,good
Unlike the grep solutions posted so far, the above would do the [presumably] right thing if your input contained other similar strings like badlands or your target strings contained regexp metachars like b.*.
Also imagine how trivial it'd be to update that vs updating a grep command to work with counts of any additional strings and/or different counts of bad and ugly.
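To make that last point concrete, here is a minimal sketch of a parameterized variant. The function name filter_counts, the word=count spec format, and the scratch file sample.txt are all assumptions made up for illustration; they are not part of the original answer.

```shell
# Sample input from the question, written to a scratch file (name is arbitrary).
cat > sample.txt <<'EOF'
good,bad,ugly
good,good,ugly
good,good,good,bad,ugly
good,bad,bad
bad,bad,bad,bad,good
bad,ugly,good
bad,good,bad
good,good,good,good,bad
ugly,bad,good
bad,bad,bad,good,ugly
EOF

# Hypothetical helper: $1 is a space-separated list of word=count pairs,
# $2 is the input file. Prints lines whose fields match every count exactly.
filter_counts() {
    awk -F, -v want="$1" '
    BEGIN {
        n = split(want, pairs, " ")
        for (i = 1; i <= n; i++) {
            split(pairs[i], kv, "=")
            need[kv[1]] = kv[2]          # required count per word
        }
    }
    {
        delete cnt                        # reset per-line counts
        for (i = 1; i <= NF; i++) cnt[$i]++
        for (w in need)
            if ((cnt[w] + 0) != (need[w] + 0)) next   # any mismatch rejects the line
        print
    }
    ' "$2"
}

filter_counts 'ugly=1 bad=1' sample.txt
```

Changing the requirement is then a matter of editing the spec, for example filter_counts 'ugly=1 bad=2' sample.txt, rather than rewriting a regexp.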
Upvotes: 2
Reputation: 26727
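With GNU grep you can use -P (for Perl-compatible regular expressions) for the back-reference, and drop the single quotes inside the group, which would otherwise be matched literally: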
grep -E "bad|ugly" file.txt | grep -Pv "(bad).*\1"
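For comparison, back-references are also part of POSIX basic regular expressions, so the same exclusion can be sketched without -P once the literal quotes are removed from the group. The function name no_repeat_bad and the scratch file sample.txt are assumptions for illustration. Note that either form still keeps lines containing only bad or only ugly, because the first grep is an OR.

```shell
# Sample input from the question, written to a scratch file (name is arbitrary).
cat > sample.txt <<'EOF'
good,bad,ugly
good,good,ugly
good,good,good,bad,ugly
good,bad,bad
bad,bad,bad,bad,good
bad,ugly,good
bad,good,bad
good,good,good,good,bad
ugly,bad,good
bad,bad,bad,good,ugly
EOF

# Hypothetical helper: keep lines with bad OR ugly, then drop lines where
# "bad" appears twice, using a basic-regex back-reference (no -P needed).
no_repeat_bad() {
    grep -E 'bad|ugly' "$1" | grep -v '\(bad\).*\1'
}

no_repeat_bad sample.txt
```

On the sample data this still prints good,good,ugly and good,good,good,good,bad, so it does not yet satisfy the requirement that both words be present.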
Upvotes: 0
Reputation: 1246
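Your current approach using grep -E "bad|ugly" matches any line with either "bad" OR "ugly", and the back-reference never matches because the single quotes in \('bad'\) are taken literally. Instead, require both words, then exclude lines with a repeated bad: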
grep -E 'bad.*ugly|ugly.*bad' file.txt | grep -v 'bad.*bad'
This will give you:
good,bad,ugly
good,good,good,bad,ugly
bad,ugly,good
ugly,bad,good
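The pipeline above only rejects a repeated bad, while the question asks for a single ugly as well. A minimal sketch of one way to extend it follows; the function name single_bad_and_ugly, the file name sample_ugly.txt, and the third input line are made up for illustration. Like the original, it matches substrings, so a field such as badlands would also count as bad.

```shell
# Two lines from the question plus one hypothetical line with a repeated "ugly".
printf '%s\n' 'good,bad,ugly' 'bad,bad,bad,good,ugly' 'ugly,bad,ugly' > sample_ugly.txt

# Hypothetical helper: require both words, then reject repeats of either one.
single_bad_and_ugly() {
    grep -E 'bad.*ugly|ugly.*bad' "$1" |
        grep -v 'bad.*bad' |
        grep -v 'ugly.*ugly'
}

single_bad_and_ugly sample_ugly.txt
```

Only good,bad,ugly survives here: the second line has a repeated bad and the third a repeated ugly.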
Upvotes: 3