rotem

Reputation: 113

notepad regex - lines without character occurring n times

I'm looking for the correct regex to find lines with fewer than n occurrences of the TAB (\t) character.

I tried this one but it finds nothing:

^.*(?:\t.*){0,20}\r\n

Upvotes: 2

Views: 350

Answers (1)

Wiktor Stribiżew

Reputation: 627507

Your pattern contains a .* at the start (after ^, the start of string/line anchor). It matches any zero or more chars other than line break chars, as many as possible, so it can match any number of tabs. Then, (?:\t.*){0,20} matches zero to twenty occurrences of a tab followed by, again, any zero or more chars other than line break chars, as many as possible.

In the end, the regex does not restrict the number of tabs on a line at all.
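
As a rough illustration, here is a small Python sketch of that behavior. It uses Python's re module rather than the Notepad++ engine, which is an assumption on my side, but the dot and the greedy quantifiers behave the same way for this purpose. The helper name original_matches_whole_line is made up for the example, and the trailing \r\n is dropped so that a single line can be tested.

```python
import re

# The question's pattern without the trailing \r\n, so one line can be tested at a time.
ORIGINAL = re.compile(r"^.*(?:\t.*){0,20}")

def original_matches_whole_line(line: str) -> bool:
    """Return True if the original pattern consumes the entire line."""
    return ORIGINAL.fullmatch(line) is not None

line_with_30_tabs = "x" + "\tx" * 30
print(line_with_30_tabs.count("\t"))                   # 30
print(original_matches_whole_line(line_with_30_tabs))  # True
```

The leading .* already swallows every tab, so the {0,20} group is free to match zero times.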

To match lines having no more than N tabs, you need

^(?!(?:[^\t\r\n]*\t){N+1}).*

where N is your occurrence count. So, if you want to match (and later remove, since you have \r\n at the end of the regex) lines having no more than 20 tabs, you can use

^(?!(?:[^\t\r\n]*\t){21}).*\R?

See the regex demo.

Details:

  • ^ - start of string/line
  • (?!(?:[^\t\r\n]*\t){21}) - a negative lookahead that fails the match if, immediately to the right of the current location, there are twenty-one occurrences of zero or more chars other than CR, LF and TAB followed by a TAB char
  • .* - the rest of the line
  • \R? - an optional line break sequence (CRLF, LF or CR).
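
If you want to try the same idea outside the editor, here is a minimal Python sketch. Several things in it are assumptions rather than part of the pattern above: the function name at_most_n_tabs is made up, Python's re module does not support \R, so (?:\r\n|\n|\r)? is used in its place, and the text is assumed to use LF or CRLF line endings.

```python
import re

def at_most_n_tabs(n: int) -> re.Pattern:
    """Compile a pattern matching whole lines that contain no more than n tabs."""
    # The lookahead rejects a line as soon as it can count n + 1 tabs on it.
    # (?:\r\n|\n|\r)? stands in for \R?, which Python's re does not support.
    return re.compile(
        r"^(?!(?:[^\t\r\n]*\t){%d}).*(?:\r\n|\n|\r)?" % (n + 1),
        re.MULTILINE,
    )

text = "a\tb\nc\td\te\tf\ng\n"
# Remove every line with no more than 2 tabs; only the 3-tab line survives.
print(repr(at_most_n_tabs(2).sub("", text)))  # 'c\td\te\tf\n'
```

With n = 20 this builds the {21} version of the pattern shown above.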

Upvotes: 1
