Teradata regular expressions, 0 or 1 spaces

Question

In Teradata, I'm looking for one regular expression pattern that would allow me to find a pattern of some numbers, then a space or maybe no space, and then 'SF'. It should return 7 in both cases below:

SELECT
REGEXP_INSTR('12345 1000SF', pattern),
REGEXP_INSTR('12345 1000 SF', pattern)

Or, my actual goal is to extract the 1000 in both cases if there's an easier way, probably using REGEXP_SUBSTR. More details are below if you need them.

I have a column that contains free text and I would like to extract the square footage. But, in some cases, there is a space between the number and 'SF' and in some cases there is not:

'other stuff 1000 SF'
'other stuff 1000SF'

I am trying to use the REGEXP_INSTR function to find the starting position. Through Google, I found that the pattern for the first case is

'([0-9])+ SF'

For the second case, I tried

'([0-9])+SF'

and I get the error

SELECT Failed.  [2662] SUBSTR: string subscript out of bounds

I've also found answers to similar questions, but they don't work for Teradata. For example, I don't think you can use ? in Teradata.

dnoeth · Accepted Answer

The error message indicates you're using SUBSTR, not REGEXP_SUBSTR.

Try this:

RegExp_Substr(col, '[0-9]*(?= {0,1}SF)')

This finds a run of digits followed by a single optional blank and then SF, and extracts only those digits. The lookahead (?= ...) checks for the blank and SF without including them in the result.
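
To check the idea against the two sample strings from the question, a query along these lines could be run. It is a sketch, not a tested result: the column aliases are made up for illustration, and the patterns use + instead of * on the assumption that at least one digit must be present (with *, a string that has SF but no digits in front of it could produce an empty match).

```sql
-- Sketch only: literals from the question stand in for the free-text column.
-- The aliases (pos_*, digits_*) are hypothetical names chosen for readability.
SELECT
    -- Start position of the digit run; no lookahead is needed for a position.
    -- The question's target value is 7 for both strings.
    REGEXP_INSTR('12345 1000SF',  '[0-9]+ {0,1}SF')      AS pos_no_space,
    REGEXP_INSTR('12345 1000 SF', '[0-9]+ {0,1}SF')      AS pos_with_space,
    -- The digits alone; the lookahead keeps the blank and SF out of the match.
    -- The question's target value is 1000 for both strings.
    REGEXP_SUBSTR('12345 1000SF',  '[0-9]+(?= {0,1}SF)') AS digits_no_space,
    REGEXP_SUBSTR('12345 1000 SF', '[0-9]+(?= {0,1}SF)') AS digits_with_space;
```

In both strings the leading 12345 is skipped because it is followed by a blank and then another digit rather than by SF, so the first match starts at the 1 of 1000.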
