busheriff
busheriff

Reputation: 165

Histogram and distribution in Google Data Studio

I have a large number of records showing call duration from various areas and exchange. the table has 3 key fields such as below:

Calling_number area call_duration(s)

I need to plot call distribution based duration in bucket of 1 second or 5 second with data studio.

Although there is no specific function in Data studio to create histogram with large number of bins I have seen someone had done it before.

Any help or suggestion on how to do this would be appreciated.

Upvotes: 4

Views: 4392

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

You can use RANGE_BUCKET mathematical function available in BigQuery. This function scans through a sorted array and returns the 0-based position of a given point's upper bound. This function can be useful if you need to group your data to build partitions, histograms, business-defined rules, and more.

Very simple example for age histogram below

WITH students AS
(
  SELECT 9 AS age UNION ALL
  SELECT 20 AS age UNION ALL
  SELECT 25 AS age UNION ALL
  SELECT 31 AS age UNION ALL
  SELECT 32 AS age UNION ALL
  SELECT 33 AS age
)
SELECT RANGE_BUCKET(age, [10, 20, 30]) AS age_group, COUNT(*) AS count
FROM students
GROUP BY 1

with output

+--------------+-------+
| age_group    | count |
+--------------+-------+
| 0            | 1     |
| 2            | 2     |
| 3            | 3     |
+--------------+-------+

Upvotes: 6

Related Questions