Reputation: 1477
I have a column with following values in a postgres table.
col1
uniprotkb:P62158(protein(MI:0326), 9606 - Homo sapiens)
uniprotkb:O00602-PRO_0000009136(protein(MI:0326), 9606 - Homo sapiens)
I would like to extract a value from above column values.
col2
P62158
O00602
I am using following regexp match on my column
select
uniprotkb:(.*)\-|\([a-zA-Z].* as col2
from table;
But the above regexp capture the text before the last '-'. I want to capture the text between uniprotkb: and before the first occurence of either '(' or '-'. Any suggestion here would be helpful.
Upvotes: 1
Views: 33
Reputation: 626845
You may use
uniprotkb:(.*?)[-(][a-zA-Z]
^^^ ^^^^
See the regex demo.
Details
uniprotkb:
- a literal string(.*?)
- Group 1: any 0+ chars as few as possible[-(]
- a -
or (
[a-zA-Z]
- a letter.PostgresSQL test:
SELECT (REGEXP_MATCHES (
'uniprotkb:P62158(protein(MI:0326), 9606 - Homo sapiens)',
'uniprotkb:(.*?)[-(][a-zA-Z]'
))[1]
Outputs:
Upvotes: 1