Reputation: 129
For my database, I have a list of company numbers where some of them start with two letters. I have created a regex which should eliminate these from a query and according to my tests, it should. But when executed, the result still contains the numbers with letters.
Here is my regex, which I've tested on https://www.regexpal.com
([^A-Z+|a-z+].*)
I've tested it against numerous variations such as SC08093, ZC000191 and NI232312 which shouldn't match and don't in the tests, which is fine.
My sql query looks like;
SELECT companyNumber FROM company_data
WHERE companyNumber ~ '([^A-Z+|a-z+].*)' order by companyNumber desc
To summerise, strings like SC08093 should not match as they start with letters.
I've read through the documentation for postgres but I couldn't seem to find anything regarding this. I'm not sure what I'm missing here. Thanks.
Upvotes: 1
Views: 392
Reputation: 23766
Not start with a letter could be done with
WHERE company ~ '^[^A-Za-z].*'
The first ^
marks the beginning. The [^A-Za-z]
says "no letter" (including small and capital letters).
Edit: Changed [A-z]
into the more precise [A-Za-z]
(Why is this regex allowing a caret?)
Upvotes: 2
Reputation:
Your pattern: [^A-Z+|a-z+].*
means "a string where at least some characters are not A-Z" - to extend that to the whole string you would need to use an anchored regex as shown by S-Man (the group defined with (..)
isn't really necessary btw)
I would probably use a regex that specifies want the valid pattern is and then use !~
instead.
where company !~ '^[0-9].*$'
^[0-9].*$
means "only consists of numbers" and the !~
means "does not match"
or
where not (company ~ '^[0-9].*$')
Upvotes: 2
Reputation: 627537
The ~ '([^A-Z+|a-z+].*)'
does not work because this is a [^A-Z+|a-z+].*
regex matching operation that returns true even upon a partial match (regex matching operation does not require full string match, and thus the pattern can match anywhere in the string). [^A-Z+|a-z+].*
matches a letter from A
to Z
, +,
|or a letter from
ato
z`, and then any amount of any zero or more chars, anywhere inside a string.
You may use
WHERE companyNumber NOT SIMILAR TO '[A-Za-z]{2}%'
See the online demo
Here, NOT SIMILAR TO
returns the inverse result of the SIMILAR TO
operation. This SIMILAR TO
operator accepts patterns that are almost regex patterns, but are also like regular wildcard patterns. NOT SIMILAR TO '[A-Za-z]{2}%'
means all records that start with two ASCII letters ([A-Za-z]{2}
) and having anything after (%
) are NOT returned and all others will be returned. Note that SIMILAR TO
requires a full string match, same as LIKE
.
Upvotes: 4