AmirCS

Reputation: 331

Overlap Condition - Inside a Field

I have a table with columns ID, IntervalStartPoints, and IntervalEndPoints, where StartPoints contains the start of each interval and EndPoints contains the corresponding end.

For instance:

ID: 1000
StartPoints: 94994731,94997876,94998645,95001520,95005812,95007092,
EndPoints: 94996152,94998036,94998824,95001720,95005924,95007413,

Here we have 6 intervals <94994731,94996152>, <94997876,94998036>, ...

Can we write a query to check whether, for example, ID: 1000, Start: 95005812, End: 95005815 overlaps with any of these intervals?

Thanks!

Upvotes: 1

Views: 67

Answers (1)

Mikhail Berlyant

Reputation: 173046

Below is for BigQuery Standard SQL

#standardSQL
SELECT t.id, StartPoint, EndPoint, interval_start, interval_end
FROM (
  SELECT id, CAST(StartPoint AS INT64) StartPoint, CAST(EndPoint AS INT64) EndPoint
  FROM `project.dataset.intervals` t,
  UNNEST(SPLIT(StartPoints)) StartPoint WITH OFFSET pos1
  JOIN UNNEST(SPLIT(EndPoints)) EndPoint WITH OFFSET pos2
  ON pos1 = pos2
) t
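JOIN `project.dataset.checks` c ON c.id = t.id
-- two closed intervals overlap when each one starts no later than the other ends;
-- this also catches a checked interval that fully contains a stored one
AND interval_start <= EndPoint
AND interval_end >= StartPoint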

You can test / play with it using the dummy data from your question, as below

#standardSQL
WITH `project.dataset.intervals` AS (
  SELECT 1000 id, 
    '94994731,94997876,94998645,95001520,95005812,95007092' StartPoints,
    '94996152,94998036,94998824,95001720,95005924,95007413' EndPoints
  UNION ALL
  SELECT 2000 id, 
    '74994731' StartPoints,
    '74996152' EndPoints
), `project.dataset.checks` AS (
  SELECT 1000 id, 95005812 interval_start, 95005815 interval_end
)
SELECT t.id, StartPoint, EndPoint, interval_start, interval_end
FROM (
  SELECT id, CAST(StartPoint AS INT64) StartPoint, CAST(EndPoint AS INT64) EndPoint
  FROM `project.dataset.intervals` t,
  UNNEST(SPLIT(StartPoints)) StartPoint WITH OFFSET pos1
  JOIN UNNEST(SPLIT(EndPoints)) EndPoint WITH OFFSET pos2
  ON pos1 = pos2
) t
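JOIN `project.dataset.checks` c ON c.id = t.id
-- two closed intervals overlap when each one starts no later than the other ends;
-- this also catches a checked interval that fully contains a stored one
AND interval_start <= EndPoint
AND interval_end >= StartPoint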

with the result:

Row id      StartPoint  EndPoint    interval_start  interval_end     
1   1000    95005812    95005924    95005812        95005815     
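If only a yes/no answer per checked interval is needed, the same unnesting can feed an aggregate. The following is a minimal sketch under a few assumptions: the `pairs` CTE and the `has_overlap` column are names introduced here for illustration, the dummy data keeps the trailing commas from the question, and `SAFE_CAST` is used so that the empty element produced by a trailing comma becomes NULL instead of raising a cast error.

```sql
#standardSQL
WITH `project.dataset.intervals` AS (
  SELECT 1000 id,
    '94994731,94997876,94998645,95001520,95005812,95007092,' StartPoints,
    '94996152,94998036,94998824,95001720,95005924,95007413,' EndPoints
), `project.dataset.checks` AS (
  SELECT 1000 id, 95005812 interval_start, 95005815 interval_end
), pairs AS (
  -- one row per (start, end) pair; positions are matched by offset
  SELECT id,
    SAFE_CAST(StartPoint AS INT64) StartPoint,  -- '' from a trailing comma becomes NULL
    SAFE_CAST(EndPoint AS INT64) EndPoint
  FROM `project.dataset.intervals` t,
  UNNEST(SPLIT(StartPoints)) StartPoint WITH OFFSET pos1
  JOIN UNNEST(SPLIT(EndPoints)) EndPoint WITH OFFSET pos2
  ON pos1 = pos2
)
SELECT c.id, c.interval_start, c.interval_end,
  -- COUNTIF counts only TRUE, so NULL pairs and unmatched ids yield false
  COUNTIF(c.interval_start <= p.EndPoint AND c.interval_end >= p.StartPoint) > 0 AS has_overlap
FROM `project.dataset.checks` c
LEFT JOIN pairs p ON p.id = c.id
GROUP BY c.id, c.interval_start, c.interval_end
```

The `LEFT JOIN` keeps every checked interval in the output, including ones whose id has no stored intervals. For the sample check above, `has_overlap` should come out true, because 95005812..95005815 falls inside 95005812..95005924.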

Upvotes: 2
