Reputation: 443
In the example below, would the correct result be to return only the headings between id 9 and 15? the closest I got was:
((^| )(FED[A-Z]*|(ESC[A-Z]*)|(AGRO[A-Z]*)|(CENT[A-Z]*)|(CTR[A-Z]*)|(INST[A-Z]*)|(EDUC[A-Z]*)|(SCI[A-Z]*)|(TEC[A-Z]*))( |$))
Note: the FED [A-Z] * argument must always be present with the combination of at least one of the terms listed. Titles containing the term UNIV [A-Z] * must return false.
ID | TEXT | CURRENT RETURN | EXPECTED RETURN |
---|---|---|---|
1 | INST NACL MATEMAT PURA | TRUE | FALSE |
2 | PARA FED UNIV, TOXICOL LAB | TRUE | FALSE |
3 | CTR, BR12020330 TAUBATE | TRUE | FALSE |
4 | VICENTE LINHARES ST | TRUE | FALSE |
5 | ALBERT EINSTEIN AVE | TRUE | FALSE |
6 | REG TECN CRIMINALIST | TRUE | FALSE |
7 | NASCIMENTO CASTRO AVE | TRUE | FALSE |
8 | SAO PAULO FED UNIV, COLL AGR SCI, DEPT CROP SCI, BOTUCATU, SP, BRAZIL | TRUE | FALSE |
9 | INST FED CIENCIA TECNOL SUL MINAS, CAMPUS POCOS DE CALDAS | TRUE | TRUE |
10 | INST FED EDUC PERNAMBUCO IFPE, RECIFE | TRUE | TRUE |
11 | INST FED MINAS GERAIS, CAMPUS CONGONHAS | TRUE | TRUE |
12 | INST FED ECN TECNOL NORTE MINAS | TRUE | TRUE |
13 | CTR FED EDUC TECNOL OURO PRETO | TRUE | TRUE |
14 | FED INST AMPUS GARANHUNS | TRUE | TRUE |
15 | INST FED PERNAMBUCO BRASIL | FALSE | TRUE |
More exemples: https://regex101.com/r/reVgya/1
Upvotes: 1
Views: 90
Reputation: 627609
You can use
WHERE col RLIKE '^(?!.*\\bUNIV\\b)(?=.*\\bFED\\b).*\\b(FED|ESC|AGRO|CIE|CENT|CTR|CRT|INST|EDUC|SCI|TEC)\\b'
Here is the regex demo.
Details:
^
- start of string(?!.*\bUNIV\b)
- a negative lookahead check: there must be NO UNIV
whole word in the string(?=.*\bFED\b)
- a positive lookahead check: there must be a FED
whole word in the string.*\b(FED|ESC|AGRO|CIE|CENT|CTR|CRT|INST|EDUC|SCI|TEC)\b
- after any zero or more chars other than line break chars, there must be a whole word from the list of alternations.Upvotes: 1