rshar
rshar

Reputation: 1477

Fetching strings before special characters from a postgres column

I have a column with following values in a postgres table.

col1
uniprotkb:P62158(protein(MI:0326), 9606 - Homo sapiens)
uniprotkb:O00602-PRO_0000009136(protein(MI:0326), 9606 - Homo sapiens)

I would like to extract a value from above column values.

col2
P62158
O00602

I am using following regexp match on my column

select 

        uniprotkb:(.*)\-|\([a-zA-Z].* as col2

from table;

But the above regexp capture the text before the last '-'. I want to capture the text between uniprotkb: and before the first occurence of either '(' or '-'. Any suggestion here would be helpful.

Upvotes: 1

Views: 33

Answers (1)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626845

You may use

uniprotkb:(.*?)[-(][a-zA-Z]
           ^^^ ^^^^

See the regex demo.

Details

  • uniprotkb: - a literal string
  • (.*?) - Group 1: any 0+ chars as few as possible
  • [-(] - a - or (
  • [a-zA-Z] - a letter.

PostgresSQL test:

SELECT (REGEXP_MATCHES (
      'uniprotkb:P62158(protein(MI:0326), 9606 - Homo sapiens)',
      'uniprotkb:(.*?)[-(][a-zA-Z]'
   ))[1]

Outputs:

enter image description here

Upvotes: 1

Related Questions