Reputation: 2785
I have a three-column sheet: Name (ID), Length (Timespan), and Category (arbitrary, could be a sequence). I would like to automatically fill the Category column with values such that each category has the same sum of the Length column. Currently I am splitting the Category column evenly using the formula =TRANSPOSE(SPLIT(JOIN(",", ARRAYFORMULA(REPT(G2:G9&",", H2))), ",")), which I copied from this site. Since the values in the Length column vary a lot, I get categories with very different totals.
This is example data from my sheet:
Chapter Length Due Date (Category)
Chapter 2 00:23:43 07/06/2020
Chapter 3 00:19:01 07/06/2020
Chapter 4 00:13:29 07/06/2020
Chapter 5 00:13:00 07/06/2020
Chapter 6 00:07:56 07/06/2020
Chapter 7 00:12:38 08/06/2020
Chapter 8 00:15:20 08/06/2020
Chapter 9 00:23:51 08/06/2020
Chapter 10 00:29:40 08/06/2020
Chapter 11 00:23:37 08/06/2020
Chapter 12 00:15:39 09/06/2020
Chapter 13 00:27:07 09/06/2020
Chapter 14 00:09:18 09/06/2020
Chapter 15 00:21:52 09/06/2020
Chapter 16 00:31:35 09/06/2020
Chapter 17 00:21:17 10/06/2020
Chapter 18 00:57:07 10/06/2020
Chapter 19 00:24:42 10/06/2020
Chapter 20 00:20:24 10/06/2020
Chapter 21 00:32:28 10/06/2020
Chapter 22 00:35:17 11/06/2020
Chapter 23 00:25:54 11/06/2020
Chapter 24 00:26:35 11/06/2020
Chapter 25 00:21:25 11/06/2020
Chapter 26 00:37:04 11/06/2020
Chapter 27 00:24:27 12/06/2020
Chapter 28 00:05:15 12/06/2020
Chapter 29 00:07:29 12/06/2020
Chapter 30 00:41:52 12/06/2020
Chapter 31 00:43:30 12/06/2020
Chapter 32 00:34:31 13/06/2020
Chapter 33 00:45:24 13/06/2020
Chapter 34 00:20:02 13/06/2020
Chapter 35 00:14:43 13/06/2020
Chapter 36 00:23:56 13/06/2020
And this is the result of the query =QUERY(A2:D56,"select D, count(D),sum(C) where D is not null group by D"), which groups by the category (Due Date) and prints the sum of times:
Due Date sum
07/06/2020 1:17:09
08/06/2020 1:45:06
09/06/2020 1:45:31
10/06/2020 2:35:58
11/06/2020 2:26:15
12/06/2020 2:02:33
13/06/2020 2:18:36
I would like the sums in this table to be as close to equal as possible, like this:
Due Date sum
07/06/2020 ~2:01:35
08/06/2020 ~2:01:35
09/06/2020 ~2:01:35
10/06/2020 ~2:01:35
11/06/2020 ~2:01:35
12/06/2020 ~2:01:35
13/06/2020 ~2:01:35
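For reference, the ~2:01:35 target is just the total of the seven daily sums above divided by the number of days. A minimal Python sketch of that arithmetic, outside the sheet (the helper names `to_seconds` and `to_hms` are made up for illustration):

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to a whole number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def to_hms(seconds):
    """Format a whole number of seconds as 'H:MM:SS'."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"

# Daily sums copied from the QUERY result above
daily_sums = ["1:17:09", "1:45:06", "1:45:31", "2:35:58",
              "2:26:15", "2:02:33", "2:18:36"]

total = sum(to_seconds(t) for t in daily_sums)
target = total // len(daily_sums)  # rounded down to whole seconds

print(to_hms(total), to_hms(target))  # 14:11:08 2:01:35
```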
Upvotes: 1
Views: 1522
Reputation: 34230
You may be able to improve on this, but as a first cut I would just divide the cumulative elapsed time by the average time per day (about 2 hours, as you found) and use that to look up the corresponding date:
=ArrayFormula(to_date(vlookup(sumif(row(B2:B36),"<="&row(B2:B36),B2:B36)/(sum(B2:B36)/countunique(C2:C36)),
{sequence(countunique(C2:C36),1,0),unique(C2:C36)},2)))
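For anyone who finds the formula hard to read, here is a rough Python sketch of the same idea, not a translation of the sheet itself: take the running total of Length (column B here), divide by the average per day, and round down to get a day index. The function name `assign_buckets` is made up, and lengths are assumed to be whole seconds.

```python
from itertools import accumulate

def assign_buckets(lengths, n_buckets):
    """Map each item to a bucket index by its cumulative share of the total.

    lengths: whole numbers (e.g. seconds); n_buckets: number of days.
    """
    total = sum(lengths)
    indices = []
    for running in accumulate(lengths):
        # Same as floor(running / (total / n_buckets)), in exact integer math
        idx = running * n_buckets // total
        # An approximate-match VLOOKUP falls back to the last key for values
        # past the end of the table, so clamp to the last bucket as well
        indices.append(min(idx, n_buckets - 1))
    return indices

# Six equal 30-minute items spread over three days
print(assign_buckets([1800] * 6, 3))  # [0, 1, 1, 2, 2, 2]
```

Note that the running total includes the current item, so an item that ends exactly on a day boundary is pushed into the next day. That is one reason the split is not perfectly even, even with equal lengths.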
I have put the cumulative time elapsed next to the date so you can see how well (or badly) the time divides equally per day.
The UK-style dates have slightly messed up my formatting, which is ironic since I am in the UK but my sheet defaults to US :-(
EDIT
I think you can improve the fit by adding half the average time per chapter to the lookup:
=ArrayFormula(to_date(vlookup((sumif(row(B2:B36),"<="&row(B2:B36),B2:B36)+average(B2:B36)/2)/(sum(B2:B36)/countunique(C2:C36)),
{sequence(countunique(C2:C36),1,0),unique(C2:C36)},2)))
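As a sketch of what the adjusted lookup does, here is the same logic in Python with the half-average offset added to the running total before dividing. The names `assign_buckets_offset` and `bucket_sums` are hypothetical; `bucket_sums` is only there so you can compare per-day totals with and without the offset on your own data.

```python
from fractions import Fraction
from itertools import accumulate
from math import floor

def assign_buckets_offset(lengths, n_buckets):
    """Bucket index from (running total + half the average length) / per-bucket share."""
    total = sum(lengths)
    per_bucket = Fraction(total, n_buckets)        # average time per day
    half_avg = Fraction(total, 2 * len(lengths))   # half the average item length
    indices = []
    for running in accumulate(lengths):
        idx = floor((running + half_avg) / per_bucket)
        # Clamp, mirroring how an approximate-match lookup stops at the last key
        indices.append(min(idx, n_buckets - 1))
    return indices

def bucket_sums(lengths, indices, n_buckets):
    """Total length assigned to each bucket."""
    sums = [0] * n_buckets
    for length, idx in zip(lengths, indices):
        sums[idx] += length
    return sums

lengths = [10, 10, 10, 10]
indices = assign_buckets_offset(lengths, 2)
print(indices, bucket_sums(lengths, indices, 2))  # [0, 1, 1, 1] [10, 30]
```

Whether the offset helps depends on the data, so it is worth checking the per-day sums both ways before settling on one version.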
You can use full-column ranges if you want to:
=ArrayFormula(filter(to_date(vlookup((sumif(row(B2:B),"<="&row(B2:B),B2:B)+average(B2:B)/2)/(sum(B2:B)/countunique(C2:C)),
{sequence(countunique(C2:C),1,0),unique(filter(C2:C,C2:C<>""))},2)),A2:A<>""))
Upvotes: 2