Enesxg

Reputation: 127

Querying distinct timestamps from a column in Google BigQuery

I am trying to select data from a price database. The rows I want are the ones that fall on a whole minute, with only one row per minute. So, if a minute has two prices, I want the first price.

Here's what the data looks like with this query:

SELECT price, timestamp FROM [database] WHERE stock="appl" AND second(timestamp) = 0 ORDER BY timestamp

Result:

Row price timestamp
1 0.097947 2018-02-14 03:42:00.000 UTC
2 0.09796 2018-02-14 03:43:00.000 UTC
3 0.097959 2018-02-14 03:45:00.000 UTC
4 0.097969 2018-02-14 03:46:00.000 UTC
5 0.097984 2018-02-14 03:47:00.000 UTC
6 0.097986 2018-02-14 03:47:00.000 UTC (Duplicate time ^)
7 0.097899 2018-02-14 03:48:00.000 UTC
8 0.097927 2018-02-14 03:49:00.000 UTC
9 0.097984 2018-02-14 03:50:00.000 UTC
10 0.097995 2018-02-14 03:51:00.000 UTC
11 0.097972 2018-02-14 03:52:00.000 UTC
12 0.097924 2018-02-14 03:53:00.000 UTC
13 0.097935 2018-02-14 03:54:00.000 UTC

When I add "GROUP BY price, timestamp", the result does not change.

I want distinct timestamps. So, for this case the result should be:

Row price timestamp
1 0.097947 2018-02-14 03:42:00.000 UTC
2 0.09796 2018-02-14 03:43:00.000 UTC
3 0.097959 2018-02-14 03:45:00.000 UTC
4 0.097969 2018-02-14 03:46:00.000 UTC
5 0.097984 2018-02-14 03:47:00.000 UTC
6 0.097899 2018-02-14 03:48:00.000 UTC
7 0.097927 2018-02-14 03:49:00.000 UTC
8 0.097984 2018-02-14 03:50:00.000 UTC
9 0.097995 2018-02-14 03:51:00.000 UTC
10 0.097972 2018-02-14 03:52:00.000 UTC
11 0.097924 2018-02-14 03:53:00.000 UTC
12 0.097935 2018-02-14 03:54:00.000 UTC

Upvotes: 0

Views: 483

Answers (3)

Gordon Linoff

Reputation: 1269763

There is no such thing as a "first" price, unless another column specifies that value. You can get one price per timestamp with something like this:

SELECT MIN(price), timestamp
FROM [database]
WHERE stock = 'appl' AND second(timestamp) = 0
GROUP BY timestamp;

If you do have another column that defines the ordering, then you can use ARRAY_AGG ordered by that column and choose the first value.
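To make the difference between "minimum" and "first" concrete, here is a small Python sketch that mimics both approaches on two rows sharing one timestamp. It is not BigQuery code. The `seq` column is hypothetical (nothing in the question says the table has an ordering column), and the order assigned to the two prices is assumed purely for illustration.

```python
# Rows sharing one timestamp: (seq, price, timestamp).
# `seq` is a HYPOTHETICAL ordering column; the seq values below are assumed.
rows = [
    (1, 0.097986, "2018-02-14 03:47:00"),
    (2, 0.097984, "2018-02-14 03:47:00"),
]


def min_price_per_timestamp(rows):
    """Mimics SELECT MIN(price), timestamp ... GROUP BY timestamp."""
    result = {}
    for _seq, price, ts in rows:
        result[ts] = min(price, result.get(ts, price))
    return result


def first_price_per_timestamp(rows):
    """Mimics taking the first element of an aggregate ordered by seq."""
    best = {}  # timestamp -> (seq, price) with the lowest seq seen so far
    for seq, price, ts in rows:
        if ts not in best or seq < best[ts][0]:
            best[ts] = (seq, price)
    return {ts: price for ts, (_seq, price) in best.items()}


print(min_price_per_timestamp(rows))    # smallest price in the group
print(first_price_per_timestamp(rows))  # price with the lowest seq
```

With this assumed ordering the two functions return different prices for the same timestamp, which is why MIN(price) is only a stand-in for "first" when no ordering column exists.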

Upvotes: 1

Mikhail Berlyant

Reputation: 172994

Below is for BigQuery Standard SQL (and assumes your ts field is of timestamp type)

SELECT 
  ARRAY_AGG(price ORDER BY ts LIMIT 1)[SAFE_OFFSET(0)] price,
  TIMESTAMP_TRUNC(ts, MINUTE) ts 
FROM `yourproject.yourdataset.yourtable`
WHERE stock = 'appl'
GROUP BY 2
ORDER BY 2  

Note: I use ts instead of timestamp as I prefer not using keywords as column names
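For readers who want to check the logic outside BigQuery, the following Python sketch mimics what the query above does: truncate each timestamp to its minute, then keep the price of the earliest row in that minute. The function name and the sample rows are made up for illustration (the 03:47:30 row does not appear in the question).

```python
from datetime import datetime


def first_price_per_minute(rows):
    """rows: iterable of (price, datetime).

    Returns a list of (minute, price) sorted by minute, where price
    belongs to the earliest row that falls inside that minute.
    """
    earliest = {}  # minute -> (timestamp, price) of the earliest row seen
    for price, ts in rows:
        # Same idea as TIMESTAMP_TRUNC(ts, MINUTE).
        minute = ts.replace(second=0, microsecond=0)
        if minute not in earliest or ts < earliest[minute][0]:
            earliest[minute] = (ts, price)
    return [(minute, earliest[minute][1]) for minute in sorted(earliest)]


# Sample rows; the mid-minute row is invented to show the truncation.
rows = [
    (0.097990, datetime(2018, 2, 14, 3, 47, 30)),
    (0.097984, datetime(2018, 2, 14, 3, 47, 0)),
    (0.097899, datetime(2018, 2, 14, 3, 48, 0)),
]

for minute, price in first_price_per_minute(rows):
    print(minute, price)
```

Two things to keep in mind: the query has no filter on the seconds value, so a minute whose first row arrives mid-minute still produces a result, and when two rows carry exactly the same timestamp, ordering by ts alone cannot decide which one is "first".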

Upvotes: 1

Enesxg

Reputation: 127

SELECT MIN(price), timestamp
FROM [database]
WHERE stock = 'appl' AND second(timestamp) = 0
GROUP BY timestamp
ORDER BY timestamp

Upvotes: 0
