Reputation: 48357
I am trying to identify errors in a log file. The application uses five uppercase letters followed by three digits followed by 'E' as an error code. The error code is followed by a non-word character. I was identifying cases with:
$errors=Select-string -Path "logfile.txt" -Pattern "[A-Z]{5}[0-9]{3}E\W"
However, the rest of the log content now includes values such as
ab1bea8a-a00e-4211-b1db-2facecfd725e.
which are matched by the regex and flagged as errors. I changed the regex to
\p{Lu}{5}[0-9]{3}E\W
(which I expected to match five uppercase characters). Why does it still match the lowercase, non-error pattern?
Upvotes: 2
Views: 659
Reputation: 200293
PowerShell regular expression matching is case-insensitive by default. There are several ways to make matches case-sensitive, though.
Add the -CaseSensitive switch when using the Select-String cmdlet:
-CaseSensitive
Makes matches case-sensitive. By default, matches are not case-sensitive.
C:\> 'abc' | Select-String -Pattern 'A'
abc
C:\> 'ABC' | Select-String -Pattern 'A'
ABC
C:\> 'abc' | Select-String -Pattern 'A' -CaseSensitive     # ← no match here
C:\> 'ABC' | Select-String -Pattern 'A' -CaseSensitive
ABC
Use the case-sensitive version of the regular expression matching operators:
By default, all comparison operators are case-insensitive. To make a comparison operator case-sensitive, precede the operator name with a c. For example, the case-sensitive version of -eq is -ceq. To make the case-insensitivity explicit, precede the operator with an i. For example, the explicitly case-insensitive version of -eq is -ieq.
C:\> 'abc' -match 'A'
True
C:\> 'ABC' -match 'A'
True
C:\> 'abc' -cmatch 'A'     # ← no match here
False
C:\> 'ABC' -cmatch 'A'
True
Force a case-sensitive match by adding a miscellaneous construct ((?...), not to be confused with non-capturing groups (?:...)) with the inverted "case-insensitive" regex option to the regular expression (this works with both the Select-String cmdlet and the -match operator):
C:\> 'abc' | Select-String -Pattern '(?-i)A'     # ← no match here
C:\> 'ABC' | Select-String -Pattern '(?-i)A'
ABC
C:\> 'abc' -match '(?-i)A'     # ← no match here
False
C:\> 'ABC' -match '(?-i)A'
True
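Applied to the pattern from the question, the options above might look like the following minimal sketch. The two sample lines are made up for illustration (ABCDE123E is a hypothetical error code, and the text around the GUID is invented); a real run would use -Path "logfile.txt" instead of piping strings.

```powershell
# Hypothetical sample input: one line with an uppercase error code,
# one line containing the GUID from the question.
$lines = @(
    'ABCDE123E: something failed',
    'request ab1bea8a-a00e-4211-b1db-2facecfd725e. completed'
)

$pattern = '[A-Z]{5}[0-9]{3}E\W'

# Default behaviour (case-insensitive): the GUID line is matched as well.
$loose = @($lines | Select-String -Pattern $pattern)

# With -CaseSensitive, only the uppercase error code should match.
$strict = @($lines | Select-String -Pattern $pattern -CaseSensitive)

# Operator form: with an array on the left, -cmatch returns the matching elements.
$strictOp = @($lines -cmatch $pattern)

# Inline option form: (?-i) switches case-insensitivity off inside the pattern.
$strictInline = @($lines | Select-String -Pattern "(?-i)$pattern")
```

All three case-sensitive variants should leave only the line with the uppercase code; which one to use is mostly a matter of whether the cmdlet or the operator is already in place.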
Upvotes: 2
Reputation: 338228
The "case-insensitive" regex flag is set by Select-String, which makes \p{Lu} case-insensitive, just as it does with [A-Z].
Try adding the -CaseSensitive parameter to the command.
You can confirm this by running some .NET code, for example in LINQPad:
(new Regex(@"\p{Lu}", RegexOptions.IgnoreCase)).IsMatch("a")
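If LINQPad is not at hand, the same check can probably be run from PowerShell itself by calling the .NET Regex class directly. This is a sketch of that idea; the variable names are only illustrative.

```powershell
# [regex] is PowerShell's type accelerator for System.Text.RegularExpressions.Regex.
$ignoreCase = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase

# Same test as the C# line above: does \p{Lu} match a lowercase 'a' under IgnoreCase?
$withIgnoreCase = [regex]::IsMatch('a', '\p{Lu}', $ignoreCase)

# For comparison: the same pattern without any options.
$withoutOptions = [regex]::IsMatch('a', '\p{Lu}')

"IgnoreCase: $withIgnoreCase; no options: $withoutOptions"
```

Calling the static IsMatch overloads directly bypasses the case-insensitive default that Select-String and -match apply, so the two results show the effect of the flag in isolation.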
Upvotes: 4