How to exclude lines with ... in regular expression

Question

I have the following table of contents and sections in my file:

1.2 Purpose .................... 8  
1.3 System Overview ............ 8  
1.4 Document Overview .......... 8  
1.5 Definitions and Acronyms ......... 9  
2.1.3.3.8   FOO 
2.1.3.3.9  BAR 
2.1.4 TEST

I'd like to extract the section names and ignore the lines that are part of the table of contents.

I've been trying this regular expression:

^((?:\d{1,2}\.)+(?:\d{1,2})+)\s.+(?!\.\.\.).*$

However, I keep capturing the table of contents lines.

How can I exclude the lines with the .... strings?

Thanks!

Charles Duffy · Accepted Answer

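The problem is that your negative lookahead only checks for ... at the exact position where it sits; it never looks beyond that point. Because the greedy .+ in front of it can consume the dots first, the lookahead ends up being tested at the end of the line, where it trivially succeeds. Consider instead: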

^(\d{1,2}(?:\.\d{1,2})*)\s*[^.]*(?!.*\.{3}).*$
#                                  ^^

...the characters with the carets below them are critical: they make the negative lookahead apply not only at that specific point, but anywhere after it as well.
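
As a quick check, here is a minimal Python sketch that runs both patterns over the sample lines from the question. The names SECTION_RE, ORIGINAL_RE, SAMPLE and section_numbers are made up for this illustration, and matching one line at a time with re.match is an assumption about how the pattern is applied:

```python
import re

# Pattern from the answer: the ".*" inside the lookahead lets it scan the
# rest of the line for a run of three dots.
SECTION_RE = re.compile(r'^(\d{1,2}(?:\.\d{1,2})*)\s*[^.]*(?!.*\.{3}).*$')

# The pattern from the question, kept only for comparison.
ORIGINAL_RE = re.compile(r'^((?:\d{1,2}\.)+(?:\d{1,2})+)\s.+(?!\.\.\.).*$')

# Sample lines copied from the question.
SAMPLE = """\
1.2 Purpose .................... 8
1.3 System Overview ............ 8
1.4 Document Overview .......... 8
1.5 Definitions and Acronyms ......... 9
2.1.3.3.8   FOO
2.1.3.3.9  BAR
2.1.4 TEST"""


def section_numbers(text, pattern=SECTION_RE):
    """Return capture group 1 (the section number) for every matching line."""
    numbers = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            numbers.append(match.group(1))
    return numbers


if __name__ == "__main__":
    # Only the real section headings survive with the answer's pattern.
    print(section_numbers(SAMPLE))  # ['2.1.3.3.8', '2.1.3.3.9', '2.1.4']
    # The question's pattern still matches all seven lines.
    print(len(section_numbers(SAMPLE, ORIGINAL_RE)))  # 7
```

The [^.]* in front of the lookahead matters too: it cannot step past a dot, so the lookahead is always evaluated before any dotted leader on the line.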
