user8675309
user8675309

Reputation: 181

Finding the closest geographic points between two tables BigQuery

I have two tables with latitude and longitude points. I would like to create a new table which has information from both tables based on finding the closest points between tables. This is similar to a question previously asked; however one of the tables has arrays. The solution from the previously asked question did not seem to work with arrays.

Table A

|--------|-------------|-------------|-------------|
|   id   |  latitude   |  longitude  |   address   |
|--------|-------------|-------------|-------------|
|   1    |     39.79   |     86.03   | 123 Vine St |
|--------|-------------|-------------|-------------|
|   2    |     39.89   |     84.01   | 123 Oak St  |
|--------|-------------|-------------|-------------|

Table B

|-------------|-------------|-------------|--------------|
|  latitude   |  longitude  |  parameter1 |  parameter2  |
|-------------|-------------|-------------|--------------|
|     39.74   |     86.33   |  [1, 2, 3]  | [.1, .2, .3] |
|-------------|-------------|-------------|--------------|
|     39.81   |     83.90   |  [4, 5, 6]  | [.4, .5, .6] |
|-------------|-------------|-------------|--------------|

I would like to create a new table, Table C, which has all the rows from TABLE A and adds the information from Table B. The information from Table B is added based on the closest point in Table B to the particular row in Table A.

Table C

|------|-------------|-------------|--------------|
| id_A |   address   |  parameter1 |  parameter2  |
|------|-------------|-------------|--------------|
|  1   | 123 Vine St |  [1, 2, 3]  | [.1, .2, .3] |
|------|-------------|-------------|--------------|
|  2   |  123 Oak St |  [4, 5, 6]  | [.4, .5, .6] |
|------|-------------|-------------|--------------|

Thank you in advance!

Upvotes: 0

Views: 1171

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

Below is for BigQuery Standard SQL

#standardSQL
SELECT AS VALUE 
  ARRAY_AGG(STRUCT(id, address, parameter1, parameter2) ORDER BY ST_DISTANCE(a.point, b.point) LIMIT 1)[OFFSET(0)]
FROM (SELECT *, ST_GEOGPOINT(longitude, latitude) point FROM `project.dataset.tableA`) a,
(SELECT *, ST_GEOGPOINT(longitude, latitude) point FROM `project.dataset.tableB`) b
GROUP BY id

If to apply to sample data from your question

WITH `project.dataset.tableA` AS (
  SELECT 1 id, 39.79 latitude, 86.03 longitude, '123 Vine St' address UNION ALL
  SELECT 2, 39.89, 84.01, '123 Oak St' 
), `project.dataset.tableB` AS (
  SELECT 39.74 latitude, 86.33 longitude, [1, 2, 3] parameter1, [.1, .2, .3] parameter2 UNION ALL
  SELECT 39.81, 83.90, [4, 5, 6], [.4, .5, .6] 
)

output is

enter image description here

Upvotes: 2

Related Questions