Reputation: 99
I need to extract two domain levels (plus the TLD) from a URL. For example, if the website name is domain.domain2.com -> domain.domain2.com; but m.domain.domain2.com -> domain.domain2.com; and domain.domain2.com/london -> domain.domain2.com.
I have written an expression ^.+[\\./]([^\\.]+\\.[^\\.]+)$
but it always returns domain2.com.
What should I add to get the desired result?
Upvotes: 1
Views: 1360
Reputation: 172974
I want to extract the string that contains '.com' and the 2 (only!) domain names next to '.com', separated by dots
Below is an example for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'domain.domain2.com' url UNION ALL
SELECT 'm.domain.domain2.com' UNION ALL
SELECT 'domain.domain2.com/london'
)
SELECT *,
REGEXP_EXTRACT(url, r'([^.]+\.[^.]+\.com)(?:/|$)') domain_and_subdomain
FROM `project.dataset.table`
with output
Row  url                        domain_and_subdomain
1    domain.domain2.com         domain.domain2.com
2    m.domain.domain2.com       domain.domain2.com
3    domain.domain2.com/london  domain.domain2.com
Upvotes: 2
Reputation: 207838
Please use the NET functions
select NET.HOST('http://m.domain.domain2.com/london')
returns:
m.domain.domain2.com
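NET.HOST on its own keeps the m. prefix, so a second step is probably needed to get the result asked for. Here is a minimal sketch, assuming the goal is always the last three dot-separated labels of the host; the sample URLs and the alias last_three_labels are made up for illustration:

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'http://domain.domain2.com' url UNION ALL
  SELECT 'http://m.domain.domain2.com/london'
)
SELECT url,
  -- NET.HOST drops the scheme and path; the regex is anchored at the end
  -- of the host and keeps at most its last three dot-separated labels.
  REGEXP_EXTRACT(NET.HOST(url), r'((?:[^.]+\.){0,2}[^.]+)$') last_three_labels
FROM `project.dataset.table`
```

Both sample rows should come back as domain.domain2.com. Note that this counts labels only, so it does not know about multi-part suffixes such as co.uk; NET.REG_DOMAIN and NET.PUBLIC_SUFFIX may be a better starting point if those matter.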
Upvotes: 1