Reputation: 235
I'm trying to run through some code files and find lines that don't end in a semicolon.
I currently have this, from a bunch of Googling:
^(?:(?!;).)*$
It works just fine, but now I want to expand on it so that it ignores leading whitespace and skips lines that start with specific keywords such as package, or that are just an opening or closing brace.
The end goal is to take something like this:
package example
{
public class Example
{
var i = 0
var j = 1;
// other functions and stuff
}
}
And for the pattern to show me that var i = 0 is missing a semicolon. That's just an example; the missing semicolon could be anywhere in the class.
Any ideas? I've been fiddling for over an hour but no luck.
Thanks.
Upvotes: 7
Views: 10121
Reputation: 154063
This is the regular expression I use, with Vim's regular expression engine, to highlight lines of Java code that don't end in a semicolon and aren't among the lines that legitimately have none:
\(.\+[^; ]$\)\(^.*public.*\|.*//.*\|.*interface.*\|.*for.*\|.*class.*\|.*try.*\|^\s*if\s\+.*\|.*private.*\|.*new.*\|.*else.*\|.*while.*\|.*protected.*$\)\@<!
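The pattern has three parts, from left to right:
1. The first group, \(.\+[^; ]$\), matches at least one character followed by a final character that is neither a semicolon nor a space, i.e. a line with a missing semicolon.
2. The second group, the alternation running from public to protected, lists the keywords that rule such a match out.
3. The trailing \@<! is the negative lookbehind that rejects a match from part 1 whenever part 2 also matches.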
Mnemonics for deciphering glyphs:
^ beginning of line
.* Any amount of any char
\+ at least one of the preceding item
[^ ... ] everything but
$ end of line
\( ... \) group
\| alternation (or)
\@<! negative lookbehind
Which roughly translates to:
Find all lines that don't end in a semicolon and don't contain any of the keywords or expressions above. It's not perfect and probably doesn't hold up against obfuscated Java, but for simple Java programs it highlights the lines that should end in a semicolon but don't.
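For readers outside Vim, the same idea can be sketched in Python as two separate checks instead of one lookbehind. This is only an illustration: the function name flag_lines and the sample text are made up here, and the keyword list is simply copied from the Vim pattern.

```python
import re

# Keywords copied from the Vim pattern; a line containing any of them is skipped.
SKIP = re.compile(
    r"public|//|interface|for|class|try|^\s*if\s+|private|new|else|while|protected"
)
# At least one character, then a final character that is neither ';' nor a space.
NO_SEMICOLON = re.compile(r".+[^; ]$")


def flag_lines(source):
    """Return (line_number, text) pairs for lines that look unterminated."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        if NO_SEMICOLON.search(line) and not SKIP.search(line):
            flagged.append((number, line))
    return flagged


sample = "public class Example\nvar i = 0\nvar j = 1;\n// comment\nfor (;;)\n"
print(flag_lines(sample))  # [(2, 'var i = 0')]
```

Like the Vim version, this matches keywords as plain substrings, so a line containing a word such as "information" is skipped because of the "for" inside it.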
A link that helped me with the concepts I needed:
https://jbodah.github.io/blog/2016/11/01/positivenegative-lookaheadlookbehind-vim/
Upvotes: 1
Reputation: 347
The key to capturing this complicated concept in a regex is to first understand how your regular expression engine handles lookahead and lookbehind assertions.
Then you can begin to capture what you want, but only in cases where what's ahead and what's behind is exactly as you specify.
str.scan(/^\s*(?=\S)(?!package.+\n|public.+\n|\/\/|\{|\})(.+)(?<!;)\s*$/)
Upvotes: 1
Reputation: 8775
Try this:
^\s*(?!package|public|class|//|[{}]).*(?<!;\s*)$
When tested in PowerShell:
PS> (gc file.txt) -match '^\s*(?!package|public|class|//|[{}]).*(?<!;\s*)$'
var i = 0
PS>
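The variable-width lookbehind (?<!;\s*) works in .NET, but Python's built-in re module rejects it because it only accepts fixed-width lookbehind. A minimal sketch of one workaround, assuming it is acceptable to strip each line before testing it (the function name missing_semicolon and the sample text are illustrative only):

```python
import re

# Same alternatives as the pattern above, but each line is stripped first,
# so the leading \s* is dropped and the lookbehind can be the fixed-width (?<!;).
PATTERN = re.compile(r"^(?!package|public|class|//|[{}]).*(?<!;)$")


def missing_semicolon(lines):
    """Yield stripped lines that the adapted pattern treats as unterminated."""
    for line in lines:
        stripped = line.strip()
        # An empty string would match the pattern, so blank lines are skipped here.
        if stripped and PATTERN.match(stripped):
            yield stripped


sample = """package example
{
    public class Example
    {
        var i = 0
        var j = 1;
        // other functions and stuff
    }
}
"""
print(list(missing_semicolon(sample.splitlines())))  # ['var i = 0']
```

Stripping first also means indentation cannot interfere with the keyword check.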
Upvotes: 1
Reputation: 728
You are trying to match lines that possibly begin with whitespace ^\s*, then don't have a particular set of words, for example (?!package|class), then have anything .* but then don't end in a semicolon (or a semicolon with whitespace after it) [^;]\s*.
^\s*(?!package|class).*?[^;]\s*$
Note that I added parentheses around a section of the regex.
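Two edge cases are worth checking before relying on this pattern. In a backtracking engine, \s* can give back a space so that the lookahead is tested one character early, and [^;] can match a trailing space. The sketch below uses Python's re module to compare the pattern with one possible variant; the variant is a suggestion made here, not part of the answer above.

```python
import re

ORIGINAL = re.compile(r"^\s*(?!package|class).*?[^;]\s*$")
# Suggested variant: the lookahead consumes the indentation itself,
# and the last non-blank character must not be ';'.
VARIANT = re.compile(r"^(?!\s*(?:package|class)).*[^;\s]\s*$")


def original_flags(line):
    """True if the answer's pattern reports the line as unterminated."""
    return bool(ORIGINAL.match(line))


def variant_flags(line):
    """True if the suggested variant reports the line as unterminated."""
    return bool(VARIANT.match(line))


for line in ["var i = 0", "var j = 1;", "  package example", "var j = 1; "]:
    print(repr(line), original_flags(line), variant_flags(line))
```

Both patterns agree on the first two lines. The original also flags the indented package line and the line with a space after its semicolon, while the variant does not.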
Upvotes: 0
Reputation: 728
If you want a line that doesn't end in a semicolon you can ask for any amount of anything .* followed by one character that isn't a semicolon [^;] followed possibly by some whitespace \s* and then the end of the line $. So you have:
.*[^;]\s*$
Now if you don't want whitespace at the beginning you need to ask for the beginning of the line ^ followed by any character that isn't whitespace [^\s] followed by the regex from earlier:
^[^\s].*[^;]\s*$
If you don't want it to start with a keyword like package or, say, class, or with whitespace, you can assert that the line does not begin with any of those three things. The regex that matches any of them is (?:\s|package|class) and the negative lookahead that rejects all of them is (?!\s|package|class). Note the !. So you now have:
^(?!\s|package|class).*[^;]\s*$
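To see what each step adds, the three patterns can be run side by side. This is a small Python sketch; the STAGES table, the stage_results helper and the sample lines are made up for the illustration.

```python
import re

# The three patterns built up in this answer, in order.
STAGES = [
    re.compile(r".*[^;]\s*$"),
    re.compile(r"^[^\s].*[^;]\s*$"),
    re.compile(r"^(?!\s|package|class).*[^;]\s*$"),
]


def stage_results(line):
    """Return one boolean per stage: does that stage's pattern match the line?"""
    return [bool(pattern.match(line)) for pattern in STAGES]


for line in ["var i = 0", "var j = 1;", "  var k = 2", "package example", "class Foo"]:
    print(f"{line!r:20} {stage_results(line)}")
```

Only var i = 0 passes all three stages. Note that the indented line is dropped from the second stage onward, so with these patterns indented code is never reported, even when its semicolon is missing.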
Upvotes: 3
Reputation: 93060
For just lines that don't end in a semicolon, this is simpler:
.*[^;]$
If you want lines that neither start with a space nor end with a semicolon:
^[^ ].*[^;]$
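When a pattern like this is run over a whole file at once rather than line by line, one detail matters: in many engines a negated class such as [^;] also matches a newline. A sketch in Python, assuming multi-line mode and adding \n to the class (find_unterminated and the sample text are illustrative names only):

```python
import re

# Multi-line mode makes ^ and $ match at every line boundary.
# \n is excluded from the class so the final character cannot be a line break.
UNTERMINATED = re.compile(r"^.*[^;\n]$", re.MULTILINE)
# The pattern as written above, for comparison.
NAIVE = re.compile(r"^.*[^;]$", re.MULTILINE)


def find_unterminated(text):
    """Return every line of text whose last character is not ';'."""
    return UNTERMINATED.findall(text)


text = "var i = 0\nvar j = 1;\n\nvar k = 2\n"
print(find_unterminated(text))  # ['var i = 0', 'var k = 2']
print(NAIVE.findall(text))      # also reports 'var j = 1;\n'
```

In the naive run, [^;] matches the line break after var j = 1; because a blank line follows it, so a correctly terminated line is reported.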
Upvotes: 0