Reputation: 1321
For a recent project, I want to draw a graph of movement of sensors. It is quite straightforward, my data is a timestamp, and a value in mm.
Now, this value can fluctuate a bit so I already got helped here by getting a rolling median of these values, so the graph smoothes out.
Now, if I want to draw one year of movement, it consists of 100.000 datapoints. This is too much for a browser to load or draw, and it is over resolution. In a persons screen, I should be able to draw 250 points BUT give the characteristics of the graph.
Now, there are several options: first, I guess the median is always the starting point.
Makes sense. But how can I let BigQuery return say one ts and an average from each block of 100 rows?
But, if in these 100 rows, we have an obvious spike, it will be leveled out and thus not visible (interesting point of data)
So, maybe I want to see the max or min value of these 100 rows, and return that, so the client can draw a peak on that block of data.
I am a programmer, not a statistical guy, but maybe these kind of questions are not uncommon :)
Upvotes: 0
Views: 57
Reputation: 59165
Group data points by hour (or daily if too many points), chart 3 lines:
For example:
#standardSQL
SELECT SensorName, TIMESTAMP_TRUNC(Timestamp, hour) hour
, COUNT(*) c, AVG(Data) avg, MAX(Data) max, MIN(Data) min
FROM `data-sensing-lab.io_sensor_data.moscone_io13`
WHERE SensorName='XBee_40670F5F/light'
GROUP BY 1,2
ORDER BY 2
This is easy when visualizing one sensor at a time - but are you looking for a chart that can show the same for X sensors at a time?
If you add the VARIANCE
to look for times with many changes:
You might notice that it's pretty normal to find high variance spots - especially every morning in this case. So you don't really need to care about high variance, but compare with a normal trend line. And that would be the start of an interesting new question.
Upvotes: 2