user1837608
user1837608

Reputation: 960

How to get average and standard deviation of grouped data?

I'm trying to work with a database using Google Sheets.

I am able to access the database using the importJson script from here:https://github.com/bradjasper/ImportJSON

using the query: =importJson("https://mywebsite.com/json-data");

It populates my Google Sheets just fine like this mockup:

+-----+--------+---------+
| Age | Weight | Country |
+-----+--------+---------+
| 10  | 120    | US      |
+-----+--------+---------+
| 10  | 140    | US      |
+-----+--------+---------+
| 10  | 160    | US      |
+-----+--------+---------+
| 11  | 180    | CA      |
+-----+--------+---------+
| 12  | 190    | GBR     |
+-----+--------+---------+
| 12  | 200    | GBR     |
+-----+--------+---------+

I'd like to be able to show the data grouped by age, with an average weight for that age, standard deviation for the weights in that age, like so...

+-----+----------------+----------+
| Age | Average weight | Std. Dev |
+-----+----------------+----------+
| 10  | 140            | 20       |
+-----+----------------+----------+
| 11  | 180            | 0        |
+-----+----------------+----------+
| 12  | 195            | 5        |
+-----+----------------+----------+

I've tried to play around with creating a pivot table in sheets, but I can't figure out how to get the average of all values for each age in a column or the std deviation.

When I try to add a column with weight values by average, I get a DIV/0 error.

Is this doable in Sheets?

Upvotes: 1

Views: 1779

Answers (1)

player0
player0

Reputation: 1

0

paste in E2 cell:

=QUERY(A2:B, 
 "select A,avg(B) 
  where A is not null 
  group by A 
  label avg(B)''", 0)

paste in G2 cell and drag down:

=IFERROR(STDEV(FILTER(B:B, A:A=E2)), 0)

paste in H2 cell and drag down:

=IFERROR(STDEVP(FILTER(B:B, A:A=E2)), 0)

UPDATE:

STDEV and STDEVP are now supported under array processing

=BYROW(E2:E4, LAMBDA(xx, IFERROR(STDEV(FILTER(B:B, A:A=xx)), 0)))

enter image description here

all-in-one formula solution:

={QUERY(A2:B, 
 "select A,avg(B) where A is not null 
  group by A label avg(B)''", 0), 
 BYROW(UNIQUE(FILTER(A2:A, A2:A<>"")), 
 LAMBDA(xx, IFERROR(STDEV(  FILTER(B:B, A:A=xx)), 0))), 
 BYROW(UNIQUE(FILTER(A2:A, A2:A<>"")), 
 LAMBDA(xx, IFERROR(STDEV.P(FILTER(B:B, A:A=xx)), 0)))}

enter image description here

Upvotes: 0

Related Questions