Matt Parkins

Reputation: 24688

Grep Regex: List all lines except

I'm trying to automagically remove every line from a text file that contains a letter "T" not immediately followed by an "H". I've been using grep and sending the output to another file, but I can't come up with the magic regex that does this.

I don't mind using awk, sed, or some other Linux tool if grep isn't the right tool for this.

Upvotes: 52

Views: 99382

Answers (4)

RavinderSingh13

Reputation: 133458

Adding 2 awk solutions to the mix here.



1st solution (simpler): works with any version of awk.

awk '!/T/ || /TH/' Input_file

This checks 2 conditions:

  • the line doesn't contain T, OR

  • the line contains TH.

    If either condition is TRUE, the line is printed.
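To see what this one-liner keeps, here is a small sketch. The sample lines and the `sample` helper are made up for illustration; they are not from the question.

```shell
# Hypothetical sample input, one case per line.
sample() { printf '%s\n' 'no capital t here' 'THE END' 'TOP' 'TOP OF THE HILL'; }

# Keep lines that have no T at all, or that contain TH somewhere.
result=$(sample | awk '!/T/ || /TH/')
printf '%s\n' "$result"
```

This prints the first, second, and fourth lines. Note that 'TOP OF THE HILL' is kept because it contains TH, even though its first T is followed by O. If every T on a line must be followed by H, each occurrence would have to be checked.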



2nd solution (GNU awk specific): uses the match function with the regex (T)(.|$), relying on GNU awk's ability to store the capture groups in an array.

awk '
!/T/{
  print
  next
}
match($0,/(T)(.|$)/,arr) && arr[1]=="T" && arr[2]=="H"
' Input_file

Explanation: first, if a line doesn't contain T, print it and move on to the next line. Otherwise, use awk's match function to match a T followed by any character OR by the end of the line. The two parts are stored in 2 capture groups, so the line is printed only if arr[1] is T and arr[2] is H. Because match returns the leftmost match, only the first T on the line is checked.
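If GNU awk isn't available, roughly the same check can be sketched with the POSIX match function and RSTART instead of the array argument. This is a portable variant of the idea, not the code from the answer above, and the sample lines are made up.

```shell
# Hypothetical sample input, one case per line.
sample() { printf '%s\n' 'no capital t here' 'THE END' 'TOP' 'CAT'; }

result=$(sample | awk '
!/T/ { print; next }   # no T at all: keep the line
# RSTART is the position of the first T; compare the character right after it.
match($0, /T/) && substr($0, RSTART + 1, 1) == "H"
')
printf '%s\n' "$result"
```

Only 'no capital t here' and 'THE END' are printed. 'CAT' is dropped because nothing follows its T, so substr returns an empty string.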

Upvotes: 0

ADV-IT

Reputation: 831

To read lines from a file while excluding empty lines and lines starting with #:

grep -v '^$\|^#' folderlist.txt

Contents of folderlist.txt:

# This is list of folders
folder1/test
folder2
# This is comment
folder3

folder4/backup
folder5/backup

Results will be:

folder1/test
folder2
folder3
folder4/backup
folder5/backup
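The `\|` alternation inside a basic regex is a GNU extension. A sketch of the same filter in extended syntax (-E), where `|` is standard, could look like this. The `folderlist` helper is a hypothetical stand-in that feeds the file contents through a pipe.

```shell
# Hypothetical stand-in for folderlist.txt.
folderlist() {
  printf '%s\n' '# This is list of folders' 'folder1/test' 'folder2' \
    '# This is comment' 'folder3' '' 'folder4/backup' 'folder5/backup'
}

# Drop lines that start with # or are empty.
kept=$(folderlist | grep -Ev '^(#|$)')
printf '%s\n' "$kept"
```

This should print the same five folder lines as above.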

Upvotes: 1

byrondrossos

Reputation: 2117

That should do it:

grep -v 'T[^H]'

-v: print only the lines that do not match

[^H]: matches any single character except H
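A small sketch of the two pieces side by side, on made-up sample lines (the `sample` helper is hypothetical): the plain pattern selects the offending lines, and -v flips the selection.

```shell
# Hypothetical sample input.
sample() { printf '%s\n' 'THE END' 'TOP' 'no capitals'; }

# Without -v: lines where some T is followed by a character other than H.
matching=$(sample | grep 'T[^H]')
# With -v: every other line.
kept=$(sample | grep -v 'T[^H]')

printf 'matching:\n%s\n' "$matching"
printf 'kept:\n%s\n' "$kept"
```

Here only 'TOP' matches, so 'THE END' and 'no capitals' are kept.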

Upvotes: 100

codaddict

Reputation: 454960

You can do:

grep -v 'T[^H]' input

-v is grep's inverse-match option: it lists only the lines that do not match the pattern.

The regex used is T[^H], which matches any line that has a T followed by any character other than an H.
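One edge case worth knowing: [^H] needs a character to match, so a T at the very end of a line is not caught. If such lines should be removed too, one possible variant (an assumption about what is wanted, not part of the answer above) also treats end of line as a match. The sample lines are made up.

```shell
# Hypothetical sample input; 'CAT' ends in T.
sample() { printf '%s\n' 'THE END' 'TOP' 'CAT'; }

# 'CAT' survives: nothing follows its T, so T[^H] never matches.
basic=$(sample | grep -v 'T[^H]')
# Extended regex: T followed by a non-H character OR by end of line.
strict=$(sample | grep -Ev 'T([^H]|$)')

printf 'basic:\n%s\n' "$basic"
printf 'strict:\n%s\n' "$strict"
```

With this input, the basic pattern keeps 'THE END' and 'CAT', while the stricter one keeps only 'THE END'.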

Upvotes: 18
