Xonoa
Xonoa

Reputation: 13

Is there an Excel formula that produces a dynamic spill array that displays the sorted and filtered dates from a Table?

I have a Table with columns USER, CATEGORY, POINTS, and DATE called main_data. There may be duplicate values in any of the columns, and may be rows that are entirely identical to each other.

USER CATEGORY POINTS DATE
JDOE Cat1 0.5 7/23/2023
JDOE Cat1 0.5 11/23/2022
JDOE Cat3 1 5/20/2023
JSCHMO Cat2 0.5 7/23/2023
TEST Cat2 0.5 7/23/2023
JSCHMO Cat4 1 1/14/2023
JSCHMO Cat4 3 7/23/2023
JSCHMO Cat2 0.5 7/23/2023
TEST Cat2 0.5 1/1/2022
TEST Cat4 7 1/1/2023

I have a formula to display the total points for each user for each category (cell B1 in the table below):

=SUMIFS(main_data[POINTS], main_data[USER], A2#, main_data[CATEGORY], B1#, main_data[DATE], ">" & TODAY() - 365)

where A2 contains a spill range for USERs and B1 contains a spill range for CATEGORYs.

I'd like to have a formula that will spill downward and show a sorted list of the dates for the USER in that row. So row 2 would show all the dates in descending order for the USER in cell A2, row 3 would show for USER in A3, etc.

A B C D E F G H I
1 USER Cat1 Cat2 Cat3 Cat4
2 JDOE 1 0 1 0 11/23/2022 5/20/2023 7/23/2023
3 JSCHMO 0 1 0 4 1/14/2023 7/23/2023 7/23/2023 7/23/2023
4 TUSER 0 0.5 0 7 1/1/2023 7/23/2023

I can do this with a manually drag copied formula:

=TRANSPOSE( SORT( FILTER(main_data[DATE], ( (TODAY()-main_data[DATE]) < 365 ) * ( main_data[USER] = A2 ) ),, 1 ) )

and I can produce an array that has each row in main_data with the date broken out into a column based on USER:

=IF(TRANSPOSE(main_data[USER]) = UNIQUE(main_data[USER]), TRANSPOSE(main_data[DATE])*SEQUENCE(ROWS(UNIQUE(main_data[USER])),1,1,0),"")

but I don't know how to condense that down. If I could sort each row individually that would work, but my version of Excel doesn't have the latest round of updated formulas, so I can't use BYROW, LAMBDA, TEXTSPLIT, HSTACK, VSTACK, TOROW, REDUCE, etc. I have Version 2202 Build 14931.21040

Without condensation (and sorting) the number of columns will expand as the number of USERs does, and it will be full of blank cells.

Is it possible to do it with a dynamic array so that I wouldn't have to periodically drag-copy the formula downward as the number of USERs expands? I cannot use VBA as the workbook has to be stored on Sharepoint which does not allow VBA to run in online mode.

Upvotes: 0

Views: 712

Answers (3)

Xonoa
Xonoa

Reputation: 13

This solution will work mostly, though it is slightly incomplete. It is also very long and convoluted. It does have one benefit over the solution given by P.b: their solution will make an array that is as wide as all the dates in the records, so if there are more than 16384 records then we might have issues. Mine only makes an array that is 56 columns wide, to account for the max of 14 records in each of the 4 categories.

=LET(
    num_users, COUNTA(UNIQUE(main_data[USER])),
    users_count_seq, SEQUENCE(num_users),
    user_date_lengths, COUNTIF(main_data[USER], UNIQUE(main_data[USER])),
    counting_sequencer_index_offsets, MMULT(--(users_count_seq>=TRANSPOSE(users_count_seq)),user_date_lengths)-user_date_lengths,
    rept_string, TEXTJOIN("", FALSE, REPT(TEXT(counting_sequencer_index_offsets,"0")&",",user_date_lengths)),
    num_dates, SUM(user_date_lengths),
    num_dates_seq, SEQUENCE(num_dates),
    rept_seq_start, "<nums><n>",
    rept_seq_end, "/nums>",
    subbed_csv_str, SUBSTITUTE(rept_string,",","</n><n>"),
    trimmed_csv_str, LEFT(subbed_csv_str,LEN(subbed_csv_str)-2),
    offsets_xml, rept_seq_start & trimmed_csv_str & rept_seq_end,
    parsed_offsets_xml, FILTERXML(offsets_xml,"//n"),
    stacked_seq, num_dates_seq-parsed_offsets_xml,
    sorted_dates, SORT(SORT(main_data, 4), 1),
    sequenced_dates, INDEX(sorted_dates, SEQUENCE(ROWS(sorted_dates)),4),
    three_each_num_dates_seq, FLOOR.MATH(SEQUENCE(num_dates*3,,1.333,0.333)),
    three_each_length, COUNTA(three_each_num_dates_seq),
    letter_switch, MOD(SEQUENCE(three_each_length),3)=2,
    interleave, IF(letter_switch,
        INDEX(sequenced_dates, three_each_num_dates_seq),
        INDEX(stacked_seq, three_each_num_dates_seq)),
    date_add_start, "<date",
    date_add_end, ">",
    date_add_statics, {">","</date","><date"},
    dates_add_xml, date_add_start & TEXTJOIN(date_add_statics, FALSE, interleave) & date_add_end,
    pre_user_xml_subbed, SUBSTITUTE(SUBSTITUTE(dates_add_xml, "<", "+"), ">", "^"),
    pre_user_xml_splits, SUBSTITUTE(pre_user_xml_subbed, "+date1^", "</split><split>"),
    pre_user_xml_trimmed, RIGHT(pre_user_xml_splits, LEN(pre_user_xml_splits)-15),
    splitting_xml, "<o><split>" & pre_user_xml_trimmed & "</split></o>",
    pre_user_splits, FILTERXML(splitting_xml,"//split"),
    user_add_delim, "</user" & users_count_seq & "><user" & users_count_seq+1 & "><date1>",
    user_addition_start, "<w><user1><date1>",
    user_addition_end, "</user" & MAX(users_count_seq) & "></w>",
    users_add_xml, user_addition_start & TEXTJOIN(user_add_delim, FALSE, pre_user_splits) & user_addition_end,
    reconstructed_xml, SUBSTITUTE(SUBSTITUTE(users_add_xml, "+", "<"), "^", ">"),
    final_table, IFERROR(TRANSPOSE(FILTERXML(reconstructed_xml,"//user" & SEQUENCE(1,num_users) & "/date"&SEQUENCE(56))),""),
    final_table
)

The most interesting part of this is using FILTERXML and TEXTJOIN to sort of hack an equivalent to TEXTSPLIT, which I don't have.

The counting_sequencer_index_offsets is the number of records for the users above the user in each row. In the test data it evals to {0,3,7}

rept_string is a comma separated sequence of those values repeated a number of times equal to the number of records each user has:" "0,0,0,3,3,3,3,7,7,7,"

subbed_csv_str replaces each comma with "</n><n>" to give us "0</n><n>0</n><n>0</n><n>3</n><n>3</n><n>3</n><n>3</n><n>7</n><n>7</n><n>7</n><n>"

we have to wrap that in an outer XML tag, and that requires stripping some bits.

All of this to get an array that has the count of all the dates for the records above each date in the records: {0,0,0,3,3,3,3,7,7,7}

we subtract those from a sequence of just the record indices to get {1,2,3,1,2,3,4,1,2,3}

sorted_dates is the data sorted by user and then date to get the dates in the right order.

three_each_num_dates_seq is an array with each number up to the total count of dates repeated 3 times. This is to index the placement of the XML tag indices and the dates themselves. Each needs to be present 3 times for the opening tag, the date, and the closing tag.

The interleave variable contains this indexed data. each date is surrounded by its ordinal within its user's records. So the oldest date is flanked by 1's, the next oldest by 2's, etc. {1, 44888, 1, 2, 45066, 2, 3, ...}

we then TEXTJOIN with some XML strings to add the rest of each tag to those numbers. "<date1>44888</date1><date2>45066</date2><date3>..."

Then we replace the < and > with placeholders, because I realized that I needed to include user numbers or it would crowd all the dates to the top of the array, and any column that had blanks would be thrown off.

The rest is more applications of the same. adding in some XML tags to separate that string into parts based on where the </date1> tag was, adding in new XML tags to define the user numbers, and then returning the < and > to the string so the whole thing can be parsed by FILTERXML a value concatenated from the number of users and the max number of records.

Upvotes: 1

P.b
P.b

Reputation: 11653

=LET(a,SEQUENCE(ROWS(UNIQUE(main_data[USER]))),
     b,IF(TRANSPOSE(main_data[USER])=UNIQUE(main_data[USER]),TRANSPOSE(main_data[DATE])*a^0,""),
     c,SEQUENCE(ROWS(b)),
     s,SORT(b,c,SIGN(c),1),
     m,MMULT(--(TRANSPOSE(main_data[USER])=UNIQUE(main_data[USER])),ROW(main_data[USER])^0),
     r,SIGN(SEQUENCE(ROWS(s),COLUMNS(s)))+SEQUENCE(rows(s),,0),
     mc,MMULT((TRANSPOSE(m)+(SIGN(m)-1))*(SEQUENCE(,ROWS(m))>=SEQUENCE(ROWS(m))),SEQUENCE(ROWS(m),,,0)),
     i,MOD(SEQUENCE(rows(s),columns(s),columns(s))-mc,columns(s))+1,
INDEX(s,SEQUENCE(rows(s)),i))

I hope this formula contains only formula's you have present in your version. Without REDUCE this is quite challenging.

This formula uses your outcome b of dates matching the user ID's and sorts them s.

this results in a cascaded spill, where the first row is sorted from the beginning, the second row of the spill starts where the previous row ended etc.: enter image description here

m checks the number of dates per ID, row wise: enter image description here

mc is to check the number of dates greater than the smallest date for it's ID. enter image description here

these are used to create a sequence i to index the start column at the position of the first date of it's id to be used when indexing the sorted array s:

enter image description here

It may not be elegant, but it works.

Upvotes: 1

Ike
Ike

Reputation: 13064

You can use this formula - which will return all the information at once:

=LET(data,main_data,
users,INDEX(data,,1),categories,INDEX(data,,2),points,INDEX(data,,3),dates,INDEX(data,,4),
uniqueUsers,UNIQUE(users),
hCategories, TOROW(UNIQUE(categories)),
cntCategories,VSTACK(hCategories,SUMIFS(points,users,uniqueUsers,categories,hCategories,dates,">" & TODAY() - 365)),
hDates,IFNA(REDUCE("",uniqueUsers,LAMBDA(result,rUser,
               VSTACK(result,TOROW(TEXT(FILTER(dates,(users=rUser)*(dates>TODAY()-365),""),"dd.MM.yyyy"))))),""),
HSTACK(VSTACK("User",uniqueUsers),cntCategories,hDates)
)

Most likely you will have to update the date format according to your regional settings. Or you remove the TEXT-part and do the formatting on the sheet.

enter image description here

Upvotes: 2

Related Questions