Reputation: 141
I have a large list of dates across multiple years. What I want to be able to do is use QUERY to find the percentage distribution across each individual year.
I have no issue getting the counts in the query structured exactly as I want, but I haven't been able to find a way to get the Query to calculate the percentage.
=QUERY(ARRAYFORMULA({YEAR($A$2:$A),MONTH($A$2:$A)}),"select Col1,count(Col1) where Col1 is not null group by Col1 pivot Col2",0)
This produces a table that looks like this:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2021 | 128 | 215 | 100 | 215 | 26 | 15 | 281 | 182 | 115 | 210 | 111 | 103 |
2022 | 215 | 476 | 815 | 121 | 621 | 82 | 121 | 635 | 418 | 574 | 922 | 41 |
What I would like to see is this:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2021 | 7.5% | 12.6% | 5.9% | 12.6% | 1.5% | .9% | 16.5% | 10.7% | 6.8% | 12.4% | 6.5% | 6.1% |
2022 | 4.3% | 9.4% | 16.2% | 2.4% | 12.3% | 1.6% | 2.4% | 12.6% | 8.3% | 11.4% | 18.3% | .8% |
Each row (year) sums to 100%. I cannot find a way to either include something within the data section of the query or within the query section to divide those individual counts by the total count for the year. I've tried nesting some queries but that either doesn't work, or I haven't been able to find the correct combination of queries.
Upvotes: 1
Views: 345
Reputation: 333
Use the following formula and format the appropriate month columns as percent:
=QUERY({
QUERY(A2:A,"select year(A), count(A) where A is not null group by year(A) pivot month(A)",0),
QUERY(A2:A,"select count(A) where A is not null group by year(A)",0)},
"select Col1, Col2/Col14, Col3/Col14, Col4/Col14, Col5/Col14, Col6/Col14, Col7/Col14, Col8/Col14, Col9/Col14, Col10/Col14, Col11/Col14, Col12/Col14, Col13/Col14
label Col1 'year', Col2/Col14 '1', Col3/Col14 '2', Col4/Col14 '3', Col5/Col14 '4', Col6/Col14 '5', Col7/Col14 '6', Col8/Col14 '7', Col9/Col14 '8', Col10/Col14 '9', Col11/Col14 '10', Col12/Col14 '11', Col13/Col14 '12'",1)
The first query is a simplified version of what you did, it returns identical data. Query supports builtin scalar functions year()
and month()
, so no need for the ARRAYFORMULA({YEAR(),MONTH()})
that you used.
The second query simply creates a column with total counts in each year.
The third query takes the result of the two previous queries as input, and divides counts for each month (Col2
– Col13
) by the total count for the year (Col14
).
Upvotes: 1