Reputation: 111
I have a table of data with many data repeating. I have to sort the rows by random, however, without having identical names next to each other, like shown here:
How can I do that in Excel?
Upvotes: 0
Views: 687
Reputation: 111
Thank you for all the answers and suggestions.
Unfortunately I could not adapt any of the proposed solutions to work with my real data, and I had to develop a solution. The solution is in Python because the Excel file is used as a database for a Python application.
You can see and comment the developed solution here:
Upvotes: 0
Reputation: 34180
Not sure if it's worth posting this, but I might as well share the results of my research such as it is. The problem is similar to that of re-arranging the characters in a string so that no same characters are adjacent The method is just to insert whichever one of the remaining characters (names) has the highest frequency at this point and is not the same as the previous character, then reduce its frequency once it has been used. It's fairly easy to implement this in Excel, even in Excel 2019. So if the initial frequencies are in D2:D8 for convenience using Countif:
=COUNTIF(A$2:A$8,A2)
You can use this formula in (say) F2 and pull it down:
=INDEX(A$2:A$8,MATCH(MAX((D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1)),(D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1),0))
and similarly in G2 to get the ages:
=INDEX(B$2:B$8,MATCH(MAX((D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1)),(D$2:D$8-COUNTIF(F$1:F1,A$2:A$8))*(A$2:A$8<>F1),0))
I'm fairly sure this will always produce a correct result if one is possible.
HOWEVER there is no randomness built in to this method. You can see if I extend it to more data that in the first several rows the most common name simply alternates with the other two names:
Having said that, this is a bit of a worst case scenario (a lot of duplication) and it may not look too bad with real data, so it may be worth considering this approach along with the other two methods.
Upvotes: 1
Reputation: 6749
This is just an attempt because the question might need clarification or more sample data to understand the actual scenario. The main idea is to generate a random list from the input, then distribute it evenly by names. This ensures no repetition of consecutive names, but this is not the only possible way of sorting (this problem may have multiple valid combinations), but this is a valid one. The solution is volatile (every time Excel recalculates, a new output is generated) because RANDARRAY
is volatile function.
In cell D2
, you can use the following formula:
=LET(rng, A2:B8, m, ROWS(rng), seq, SEQUENCE(m),
idx, SORTBY(seq, RANDARRAY(m,,1,m, TRUE)), rRng, INDEX(rng, idx,{1,2}),
names, INDEX(rRng,,1), nCnts, MAP(seq, LAMBDA(s, ROWS(FILTER(names,
(names=INDEX(names,s)) * (seq<=s))))), SORTBY(rRng, nCnts))
Looking at @JosWoolley approach. The generation of the random sorting can be simplified so that the resulting formula could be:
=LET(rng, A2:B8, m, ROWS(rng), seq, SEQUENCE(m), rRng,SORTBY(rng, RANDARRAY(m)),
names, TAKE(rRng,,1), nCnts, MAP(seq, LAMBDA(s, ROWS(FILTER(names,
(names=INDEX(names,s)) * (seq<=s))))), SORTBY(rRng, nCnts))
LET
function is used for easy reading and composition. The name idx
represents a random sequence of the input index positions. The name rRng
, represents the input rng
, but sorted by random. This sorting doesn't ensure consecutive names
are distinct.
In order to ensure consecutive names are not repeated, we enumerate (nCnts
) repeated names
. We use a MAP
for that. This is a similar idea provided by @cybernetic.nomad in the comment section, but adapted for an array version (we cannot use COUNTIF
because it requires a range). Finally, we use SORTBY
with input argument by_array
, the map result (nCnts
), to ensure names are evenly distributed so no consecutive names will be the same. Every time Excel recalculate you will get an output with the names distributed evenly in a different way.
Upvotes: 2
Reputation: 9052
Perfect case for a recursive LAMBDA
.
In Name Manager, define RandomSort as
=LAMBDA(ζ,
LET(
ξ, SORTBY(ζ, RANDARRAY(ROWS(ζ))),
λ, TAKE(ξ, , 1),
κ, SUMPRODUCT(N(DROP(λ, -1) = DROP(λ, 1))),
IF(κ = 0, ξ, RandomSort(ζ))
)
)
then enter
=RandomSort(A2:B8)
within the worksheet somewhere. Replace A2:B8
- which should be your data excluding the headers - as required.
If no solution is possible then you will receive a #NUM!
error. I didn't get round to adding a clause to determine whether a certain combination of names has a solution or not.
Upvotes: 3