DarkLeafyGreen
DarkLeafyGreen

Reputation: 70406

Join two tables with same schema in bigquery

I have two tables with the same schema as input:

+---------+--------+----------------------+
|  value  |  city  |   timestamp          |
+---------+--------+----------------------+
| 50      |  LA    |  2019-02-6 03:05pm   |
| 163     |  NYC   |  2019-02-5 03:06pm   |
| 681     |  SF    |  2019-02-4 06:41pm   |
| 35      |  LA    |  2019-02-3 05:12pm   |
+---------+--------+----------------------+

The first table holds regular and the second fee payments. I want to join the tables and group them as follows:

+------------+--------+----------+--------------+
|  regular   |  fees  |   city   |  timestamp   |
+------------+--------+----------+--------------+
| 50         | 20     | LA       |  2019-02-6   |
| 163        | NULL   | NYC      |  2019-02-5   |
| 681        | ..     | SF       |  2019-02-4   |
| 35         | ..     | LA       |  2019-02-3   |
+------------+--------+----------+--------------+

There might be days with no fees collected. What I've tried:

SELECT t1.city, regular, fees, t1.day
FROM
(
  SELECT city, SUM(value) AS regular, FORMAT_TIMESTAMP("%Y-%m-%d", TIMESTAMP(timestamp)) as day
  FROM `payments`
  GROUP BY day, city
) t1
FULL JOIN (
  SELECT city, SUM(value) AS fees, FORMAT_TIMESTAMP("%Y-%m-%d", TIMESTAMP(timestamp)) as day
  FROM `fees`
  GROUP BY day, city
) t2
ON t1.day = t2.day
ORDER BY t1.day DESC

This produces the correct output schema, however it does not sum the fees correctly:

+------------+--------+----------+--------------+
|  regular   |  fees  |   city   |  timestamp   |
+------------+--------+----------+--------------+
| 26500      | 6300   | LA       |  2019-02-6   |
| 26500      | 8500   | LA       |  2019-02-6   |
| 26500      | 1000   | LA       |  2019-02-6   |
+------------+--------+----------+--------------+

As you can see I am getting for the same day and city different fee values. Any ideas what I am doing wrong here?

Upvotes: 0

Views: 268

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172954

The problem is simply in your ON clause - you JOIN on days only, but should join on days and city as in below snippet

ON t1.day = t2.day
AND t1.city = t2.city 

Upvotes: 2

Related Questions