Andrew
Andrew

Reputation: 7768

How to get the partial String from the string

how can I extract the numerical values between / and .html?

http://www.site.com/prod/bunch-of-seo-text-of-different-length/12345687.html
http://www.site.com/prod/bunch-of-seo-text-of-different-length/12345688.html
http://www.site.com/prod/bunch-of-seo-text-of-different-length/12345688.html

I tried this, but I don't know about the syntax

SUBSTRING(link FROM '%/%' FOR '%.html%') 

Upvotes: 1

Views: 690

Answers (4)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656251

The question is:

how can I extract the numerical values between / and .html?

And a proper answer is:

SELECT substring(link, '/(\d+)\.html')

\d .. the class shorthand for [[:digit:]] which is equivalent to [0-9]
+ .. one or more of the preceding atom
() .. parenthesis capture the match
\. .. dot has to be escaped to loose its special meaning

Test:

WITH x(link) AS (
    VALUES
     ('http://www.site.com/prod/bunch-of-text-of-different-length/12345687.html')
    ,('http://www.site.com/prod/bunch-of-text/12345688.html')
    ,('http://www.site2.com/prod/123/text-of-di456fferent-89-len/12345688.html')
    )
SELECT substring(link, '/(\d+)\.html') FROM x;

regexp_matches() is meant for capturing multiple matches and is not a good choice for this.

Upvotes: 2

user330315
user330315

Reputation:

Another option would be to use the substring function:

select substring(link from '[0-9]+')
from your_table;

This assumes that there is exactly one "number" in the URL.

Upvotes: 0

zamnuts
zamnuts

Reputation: 9582

I am not a postgresql guy, but it looks like it has some built-in regex functions. See PostgreSQL Matching Functions and find the section where it describes the regexp_matches function. regexp_matches returns an array ({}).

SELECT regexp_matches(link, '([0-9]+)\.html$');

Combine this with a unnest see PostgreSQL Arrays:

SELECT unnest(SELECT regexp_matches(link, '([0-9]+)\.html$')) as matches;

Upvotes: 1

BumbleB2na
BumbleB2na

Reputation: 10743

This snippet will give you the filename of any url:

SELECT REVERSE(SUBSTRING(REVERSE(REPLACE('http://www.somewhere.com/folder/page.html', '\', '/')), 1, CHARINDEX('/', REVERSE(REPLACE('http://www.somewhere.com/folder/page.html', '\', '/'))) - 1)) as PageName

Output:

page.html

Or, if you are passing in the url as an nvarchar variable:

SELECT REVERSE(SUBSTRING(REVERSE(REPLACE(@Url, '\', '/')), 1, CHARINDEX('/', REVERSE(REPLACE(@Url, '\', '/'))) - 1)) as PageName

Upvotes: 0

Related Questions