user3871
user3871

Reputation: 12664

extract all numbers in a string

How can I extract all numbers in a string?

Sample inputs:

Expected Outputs:

The following is tested with the first one, 7nr-6p:

select regexp_split_to_array('7nr-6p', '[^0-9]') AS new_volume from mytable;

Gives: {7,"","",6,""} // Why is a numeric-only match returning spaces?

select regexp_matches('7nr-6p', '[0-9]*'::text) from mytable;

Gives: {7} // Why isn't this continuing?

select regexp_matches('7nr-6p', '\d'::text) from mytable;

Gives: {7}

select NULLIF(regexp_replace('7nr-6p', '\D',',','g'), '')::text from mytable;

Gives: 7,,,6,

Upvotes: 0

Views: 787

Answers (4)

Babakness
Babakness

Reputation: 3134

Here is a more robust solution

CREATE OR REPLACE FUNCTION get_ints_from_text(TEXT) RETURNS int[] AS $$ 
  select array_remove(regexp_split_to_array($1,'[^0-9]+','i'),'')::int[]; 
$$ LANGUAGE SQL IMMUTABLE;

Example

select get_ints_from_text('7nr-6p'); -- 7,6
-- also resilient in situations like
select get_ints_from_text('-7nr--6p'); -- 7,6

Here is a link to try

http://sqlfiddle.com/#!17/c6ac7/2

I feel that wrapping this functionality into an immutable function is prudent. This is a pure function, one that will not mutate data and one that returns the same result given the same input. Immutable functions marked as "immutable" have performance benefits.

By using a function we also benefit from abstraction. There is one source to update should this functionality need to improve in the future.

For more information about immutable functions see

https://www.postgresql.org/docs/10/static/sql-createfunction.html

Upvotes: 1

Erwin Brandstetter
Erwin Brandstetter

Reputation: 659327

SELECT id, (regexp_matches(string, '\d+', 'g'))[1]::int AS nr
FROM  (
   VALUES
   (1, '7nr-6p')
 , (2, '12c-18L')
 , (3, '12nr-24L')
 , (4, '11nr-12p')
   ) tbl(id, string);

Result:

 id | nr
----+----
  1 |  7
  1 |  6
  2 | 12
  2 | 18
  3 | 12
  3 | 24
  4 | 11
  4 | 12

I wanted them in a single cell so I could extract them as needed

SELECT id, trim(regexp_replace(string, '\D+', ',', 'g'), ',') AS nrs
FROM  (
   VALUES
   (1, '7nr-6p')
 , (2, '12c-18L')
 , (3, '12nr-24L')
 , (4, '11nr-12p')
   ) tbl(id, string);

Result:

 id |  nrs
----+-------
  1 | 7,6
  2 | 12,18
  3 | 12,24
  4 | 11,12

dbfiddle here

Upvotes: 1

acdcjunior
acdcjunior

Reputation: 135872

The following query:

select regexp_split_to_array(regexp_replace('7nr-6p', '^[^0-9]*|[^0-9]*$', 'g'), '[^0-9]+')
                                                               AS new_volume from mytable;

"Trims" the prefix and suffix non-numbers and splits by the remaining non-numbers.


select regexp_matches('7nr-6p', '[0-9]*'::text) from mytable;

Gives: {7} // Why isn't this continuing?

Because without the 'g' flag, the regex stops at the first match. Add the 'g' flag:

select regexp_matches('7nr-6p', '[0-9]*'::text, 'g') from mytable;

Upvotes: 2

Dan
Dan

Reputation: 1891

You can replace all text and then split:

SELECT regexp_split_to_array(
    regexp_replace('7nr-6p', '[a-zA-Z]', '','g'),
    '[^0-9]'
)

This returns {7,6}

Upvotes: 1

Related Questions