Reputation: 24688
I'm trying to automagically remove all lines from a text file that contain a letter "T" that is not immediately followed by an "H". I've been using grep and sending the output to another file, but I can't come up with the magic regex that will help me do this.
I don't mind using awk, sed, or some other Linux tool if grep isn't the right tool to be using.
Upvotes: 52
Views: 99382
Reputation: 133458
Adding 2 awk solutions to the mix here.
1st solution (simpler): plain awk, works with any version of awk.
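awk '!/T([^H]|$)/' Input_file
This checks a single condition: the line must not contain a T that is followed either by a character other than H or by the end of the line.
If that condition is TRUE, the line is printed.
A looser test such as !/T/ || /TH/ is not enough, because it also prints lines like THAT, which contain TH but also a T that is not followed by H.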
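To see how a single negated pattern treats lines that mix TH with a stray T, here is a minimal sketch. The sample lines and the helper name sample_lines are made up for illustration and are not from the original post.

```shell
# Hypothetical sample lines, generated inline so no input file is needed.
sample_lines() { printf '%s\n' 'THAT' 'THIS' 'plain' 'BOTH THEM' 'CUT'; }

# Keep only lines with no T that is followed by a non-H character
# or by the end of the line.
sample_lines | awk '!/T([^H]|$)/'
# prints:
# THIS
# plain
# BOTH THEM
```

THAT and CUT are dropped because each ends in a T with nothing after it.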
2nd solution (GNU awk specific): using the match function with the regex (T)(.|$) and the match function's array creation capability.
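awk '
{
  s = $0
  # Walk through every T in the line, not just the first one.
  while (match(s, /(T)(.|$)/, arr)) {
    if (arr[2] != "H") next      # T followed by non-H or end of line: skip the line
    s = substr(s, RSTART + 1)    # keep searching after this T
  }
  print
}
' Input_file
Explanation: the loop uses the match function of GNU awk to find each T followed by any character OR the end of the line. The two capturing groups are stored in the array arr, so arr[2] holds whatever follows the T. If that is anything other than H, the line is skipped with next; otherwise the search continues after that T. A line is printed only when every T in it is followed by H, which includes lines that have no T at all. Checking only the first T would wrongly print lines such as THAT.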
Upvotes: 0
Reputation: 831
Read lines from a file, excluding empty lines and lines starting with #:
grep -v '^$\|^#' folderlist.txt
Contents of folderlist.txt:
# This is list of folders
folder1/test
folder2
# This is comment
folder3
folder4/backup
folder5/backup
Results will be:
folder1/test
folder2
folder3
folder4/backup
folder5/backup
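The \| alternation inside a basic regular expression is a GNU extension, so here is a sketch of the same filter using two -e patterns, which POSIX grep supports. The helper name folderlist and the inline sample (including the empty line) are assumptions for illustration.

```shell
# Hypothetical stand-in for folderlist.txt, including one empty line.
folderlist() {
  printf '%s\n' '# This is list of folders' 'folder1/test' '' 'folder2' '# This is comment' 'folder3'
}

# Drop empty lines and lines starting with #; each -e adds one pattern.
folderlist | grep -v -e '^$' -e '^#'
# prints:
# folder1/test
# folder2
# folder3
```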
Upvotes: 1
Reputation: 2117
That should do it:
grep -v 'T[^H]'
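-v: print lines not matching
[^H]: matches any character but H
Note that [^H] has to match an actual character, so a T at the very end of a line is not caught; grep -Ev 'T([^H]|$)' covers that case as well.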
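A small sketch of how this pattern behaves on sample input, including a line that ends in T. The sample lines and the helper name sample_input are hypothetical; the second command is an extended-regex variant, not part of the original answer.

```shell
# Hypothetical sample lines, generated inline so no input file is needed.
sample_input() { printf '%s\n' 'THE CAT' 'THIS' 'no capitals here' 'TOP' 'BOTH'; }

# T followed by any non-H character: TOP is removed, but THE CAT stays
# because its final T has no character after it.
sample_input | grep -v 'T[^H]'
# prints:
# THE CAT
# THIS
# no capitals here
# BOTH

# Variant that also treats a T at end of line as "not followed by H".
sample_input | grep -Ev 'T([^H]|$)'
# prints:
# THIS
# no capitals here
# BOTH
```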
Upvotes: 100
Reputation: 454960
You can do:
grep -v 'T[^H]' input
-v is the inverse-match option of grep: it does not list the lines that match the pattern.
The regex used is T[^H], which matches any line that has a T followed by any character other than an H.
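Since the question also mentions sed, the same pattern can be paired with sed's d (delete) command. This is a sketch on made-up sample lines; the helper name sed_sample is hypothetical, and the -E form assumes a sed that supports extended regular expressions (GNU and BSD sed do).

```shell
# Hypothetical sample lines, generated inline so no input file is needed.
sed_sample() { printf '%s\n' 'TOP' 'THIS' 'plain' 'CAT'; }

# Delete lines where a T is followed by a non-H character; print the rest.
sed_sample | sed '/T[^H]/d'
# prints:
# THIS
# plain
# CAT

# Extended-regex variant that also deletes lines ending in T.
sed_sample | sed -E '/T([^H]|$)/d'
# prints:
# THIS
# plain
```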
Upvotes: 18