jswtraveler
jswtraveler

Reputation: 355

How to check same combination of values regardless of their order

I've been given a long list of combinations of different equipment that customers might have. It looks like what the person who made the list did was just concatenate all the columns that contain equipment product numbers. The problem is that their is no real order in how the columns are filled out. So a customer who got equipment B first and then equipment A would now have a value of B/A while another customer could also have a value of A/B. I'm trying to use either excel or SAS to figure out how many distinct combinations there are REGARDLESS of the order so in this case A/B=b/A and their would only be 1, not 2 combinations. Is there a way to figure this out using EXCEL or SAS because right now I'm having to eyeball it. As always thank you so so much for your help!!

Sandy

Upvotes: 1

Views: 3472

Answers (5)

catpol
catpol

Reputation: 68

This is quick way of providing all unique combinations of equipment in Excel without using VBA.

Next to your combinations text insert columns for each of the equipment pieces - one column per piece of equipment - see the draft below. Each column returns TRUE if the equipment combination contains this column's equipment and FALSE otherwise.

EQUIPMENT     CONTAINS_A  CONTAINS_B  CONTAINS_C  CONTAINS_ D   Sort_Column
COMBINATION
-------------------------------------------------------------------------------------------
A/B/C         TRUE       TRUE        TRUE          FALSE        TRUETRUETRUEFALSE
B/A/D/C       TRUE       TRUE        TRUE          TRUE         TRUETRUETRUETRUE
A/B           TRUE       TRUE        FALSE         FALSE        TRUETRUEFALSEFALSE
C             FALSE      FALSE       FALSE         TRUE         FALSEFALSETRUEFALSE

The last column is combining all the TRUE/FALSE values in a text. This text can be filtered and/or sorted to easily identify the unique combinations. If you apply Auto Filter the entries in the AutoFilter will show unique combinations.

These are the formulas I have used behind the values (showing formulas in row2):

CONTAINS A: =NOT(ISERROR(SEARCH("A",F2,1)))
CONTAINS B: =NOT(ISERROR(SEARCH("B",F2,1)))
CONTAINS C: =NOT(ISERROR(SEARCH("C",F2,1)))
CONTAINS D: =NOT(ISERROR(SEARCH("D",F2,1)))

Sort Column: =TEXT(H2,"#") & TEXT(I2,"#") & TEXT(J2,"#") & TEXT(K2,"#")

Sorry about the format of this but I am not allowed to post images :-( Hope it helps!

Upvotes: 1

Jim
Jim

Reputation: 138

I've had time to put this together for you as an Excel formula. See my original comments at the foot of this posting for links to my sources.

Steps to resolve this:

User Defined Function to sort a cell

Create a new module in Excel VBA (let me know if you need instructions on this, it's only a couple of clicks)

Copy and paste all the code below.

Option Explicit

Const c_Separator = "/"
' User Defined Function to split a list within a cell and then sort it 
' before recreating a sorted list
Public Function CellSort(strString As String) As String
    Dim i As Integer
    Dim arr As Variant
    Dim strRet As String
    arr = Split(strString, c_Separator)

    ' trim values so sort will work properly
    For i = LBound(arr) To UBound(arr)
        arr(i) = Trim(arr(i))
    Next i

    ' sort
    QuickSort arr, LBound(arr), UBound(arr)

    ' construct ordered list to return
    For i = LBound(arr) To UBound(arr) - 1
        strRet = strRet & CStr(arr(i)) & c_Separator
    Next i

    ' Attach the last item separately to avoid adding an unecessary separator
    CellSort = strRet & CStr(arr(i))

End Function


' Quick Sort function found here: 
' https://stackoverflow.com/questions/3399823/excel-how-do-i-sort-within-a-cell
Public Sub QuickSort(vArray As Variant, inLow As Long, inHi As Long)

  Dim pivot   As Variant
  Dim tmpSwap As Variant
  Dim tmpLow  As Long
  Dim tmpHi   As Long

  tmpLow = inLow
  tmpHi = inHi

  pivot = vArray((inLow + inHi) \ 2)

  While (tmpLow <= tmpHi)

     While (vArray(tmpLow) < pivot And tmpLow < inHi)
        tmpLow = tmpLow + 1
     Wend

     While (pivot < vArray(tmpHi) And tmpHi > inLow)
        tmpHi = tmpHi - 1
     Wend

     If (tmpLow <= tmpHi) Then
        tmpSwap = vArray(tmpLow)
        vArray(tmpLow) = vArray(tmpHi)
        vArray(tmpHi) = tmpSwap
        tmpLow = tmpLow + 1
        tmpHi = tmpHi - 1
     End If

  Wend

  If (inLow < tmpHi) Then QuickSort vArray, inLow, tmpHi
  If (tmpLow < inHi) Then QuickSort vArray, tmpLow, inHi

End Sub

Close the VBA editor (no more coding required).

Excel Formulas to Calculate number of Unique combinations In Excel you create three columns alongside your original column of data. I have shown these formulas in the screenshot below in the columns with coloured backgrounds. Each is explained beneath the image.

Excel formulas

Blue column: Uses the VBA function above to sort the contents of each cell in the original list, this gives a consistent list that you can count unique items on. If your original list has some instances in lowercase and others in uppercase and you need to treat these as the same then modify the formula in this column to be =CellSort(UPPER(A2))

Green column: Simple COUNTIF function (works in all recent Excel versions) that identifies the first instance of each sorted cell.

Red cell: Counts the number of times TRUE appears in the green column. This gives a count of unique entries.

And below is an example of the completed work.

Example of formulas in action


Original comments

Afraid I don't have time to test this right now but this may help you.

You might want to look at the VBA given in this answer (I haven't tried it myself).

VBA QuickSort

Then, if you need to do this in formulas, you could create a User Defined Function from this VBA to sort the values in the cell. Change the line arr = Split(ActiveCell.Text, ",") to have "/" so it will split your list.

Next use your formula in a column alongside the original data and then use something like the formulas on this page: Count occurrences of values or unique values in a data range to count unique.

Let me know if you need more help with any of the above and I will try to do it tomorrow.

Upvotes: 0

Joe
Joe

Reputation: 63424

In SAS, split them into component values and then sort the variables.

data have;
length value $50;
input value $;
datalines;
GummyBears/Skittles
Skittles/GummyBears
MandMs/Skittles
GummyBears/MandMs
Skittles/MandMs
GummyBears/MandMs
;;;;
run;

data want;
set have;
length vals1 vals2 $20;
array vals[2] $;
do _t = 1 to dim(vals);
 vals[_t]=scan(value,_t,'/');
end;
call sortc(of vals[*]);
run;

proc freq data=want;
tables vals1*vals2/list;
run;

You could also recombine them afterwards if you prefer.

Upvotes: 3

Bob Bryan
Bob Bryan

Reputation: 3847

You might try writing some VBA code which can be used in Excel to change the way that A and B are displayed in the column. For example, for B/A, it would put it in alphabetical order and change it to A/B. For A/B, since it is already in alphabetical order, it would leave it as it is. After this is done, then sort the sheet by that column. You should then see data like A/B and the next row down could also have A/B displayed in it. Next, depending on what you need done, you can add some more VBA code to combine the rows by that type of column and write that data out to a new sheet. This would provide you with a unique count of all unique sets of equipment types owned by all customers. If you wanted unique counts of customers who might own A/B/C and combine the A/B portion with other customers who own just A/B, then that would be a little more work.

Upvotes: 0

wilsjd
wilsjd

Reputation: 2258

You could use the conditional formatting to highlight duplicates option. It would still be eye balling it, but you could potentially then sort by color to see matched/unmatched equipment.

An example on how to do this can be found here: http://www.techrepublic.com/blog/window-on-windows/how-to-find-duplicates-in-excel/7347

Upvotes: 0

Related Questions