Reputation: 11
Hi there so I have a list of items as follows - where in the Sample 1 column, Blue (I guess parent) is followed by Red (child) and the account numbers in Acct 1 should all be the same as the preceding Blue in that section (until the next Blue is listed with another acct number).
In the example below the 3th and 7th records need to be tagged/identified. Is there any easy way to do this with a formula? I have about 100K line items with this information. Thank you much!!
Sample 1 Acct
Blue 1234
Red 1234
Red 2458
Red 1234
Blue 5768
Red 5768
Red 2589
Red 5768
Upvotes: 1
Views: 198
Reputation: 34315
This is my suggestion. It looks before and after the current cell in column A to find the first cells which don't match in both directions, then takes the range between these cells. If there are at least as many cells in column B that don't match the current acct as do match it in the current range, then it is assumed that the current cell is a mistake (in the case where there are two children and their acct's don't match, then they would both be flagged).
=LET(range,INDEX(A:A,XMATCH(TRUE,A$1:A2<>A2,0,-1)+ROW(A$1)):INDEX(B:B,XMATCH(TRUE,A2:A$10<>A2,0)+ROW()-2),
sample,INDEX(range,0,1),
acct,INDEX(range,0,2),
equal,COUNTIF(acct,B2),
unequal,COUNTIF(acct,"<>"&B2),
unequal>=equal)
Unfortunately this formula doesn't work in conditional formatting, so it would mean putting it in a separate column, dragging it down and filtering on the TRUE values.
I will have a look at modifying the formula to work in CF.
This works in CF but is slow - will need to remove full-column ranges.
=LET(start,XMATCH(TRUE,A$1:A2<>A2,0,-1)+ROW(A$1),end,XMATCH(TRUE,A2:A$10<>A2,0)+ROW()-2,
sample,FILTER(A:A,(ROW(A:A)>=start)*(ROW(A:A)<=end)),
acct,FILTER(B:B,(ROW(B:B)>=start)*(ROW(B:B)<=end)),
equal,SUM(--(acct=B2)),
unequal,SUM(--(acct<>B2)),
unequal>=equal)
This I think is OK
=LET(start,XMATCH(TRUE,A$1:A2<>A2,0,-1)+1,end,XMATCH(TRUE,A2:A$10<>A2,0)+ROW()-ROW(A$1)-1,
seq,SEQUENCE(ROWS(A$1:A$10)),
sample,FILTER(A$1:A$10,(seq>=start)*(seq<=end)),
acct,FILTER(B$1:B$10,(seq>=start)*(seq<=end)),
equal,SUM(--(acct=B2)),
unequal,SUM(--(acct<>B2)),
unequal>=equal)
If I have misunderstood the whole thing and the parent and child is literally labelled as Blue or Red for the whole of sample 1, then you would just need this in conditional formatting:
=AND(A2="Red",B2<>XLOOKUP("Blue",A$1:A2,B$1:B2,,0,-1))
and this to mark each block with an increasing count
=IF(A2="Blue",C1+1,C1)
Upvotes: 1
Reputation: 280
It's more efficient to do this with a spill formula that calculates the entire column at once.
I took a screenshot to explain the formula, but all that is required is one formula in cell C2
.
Here is the unoptimised formula to create a spill column. It creates a bunch of arrays the height of your data and does operations on them to find the incorrect accounts.
=LET(
MultiLevelID, A2:A9,
Account, B2:B9,
sParent, "Blue",
sChild, "Red",
ArrayIndex, SEQUENCE(ROWS(MultiLevelID)),
ArrayBoolParent, ISNUMBER(XMATCH(MultiLevelID, sParent)),
ArrayParentIndex, IF(ArrayBoolParent=TRUE,ArrayIndex,0),
ArrayAllIndexes, XLOOKUP(ArrayIndex,ArrayParentIndex,ArrayParentIndex,,-1),
ArrayCorrectAccount, XLOOKUP(ArrayAllIndexes,ArrayIndex,Account),
ArrayBoolIncorrect, IF(Account=ArrayCorrectAccount,FALSE,TRUE),
ArrayBoolIncorrect
)
ArrayIndex
is an array that starts at 1 and keeps counting up for the height of your data. SEQUENCE
is for this exact purpose. It will be used as a 'row' reference. Note, these are rows relative to the selected data, not actual excel row numbers.
ArrayBoolParent
is an array with TRUE
for every parent and FALSE
for every child. XMATCH
is searching for the parent name, and returns the index if it is found and and error if not. ISNUMBER
makes the output TRUE
if found and FALSE
if not. Using XMATCH
lets you search for multiple criteria. For example, if "Light Blue" was also a parent, you could set sParent
equal to `{"Blue", "Light Blue"}.
ArrayParentIndex
is an array with the index for every parent and 0 for every child. I'm using an interesting application of IF
to do this. When IF
is given an array as the condition, it does an IF
statement for each item in the array. When a return value is also an array, it returns the item relating to the condition instead of the whole array.
ArrayCorrectedIndexes
is an array with all children's indexes pointing to their parent. Similar to IF
, using an array with XLOOKUP
performs the search on all items in the array separately. I'm also using one of the advanced functions of XLOOKUP
, which says if a value isn't found, return the next largest value. This means when I search for index 2, it finds nothing and returns the next smallest value, 1.
ArrayCorrectAccount
is an array with the parent account automatically assigned to each child. I used XLOOKUP
for clarity, but you could also use INDEX(Account,ArrayCorrectedIndexes)
.
ArrayBoolIncorrect
is an array that returns FALSE
if the account is correct, and TRUE
if not.
If you are looking for a summary of what needs to be fixed, only return the incorrect indexes:
FirstRow, MIN(ROW(MultiLevelID)) - 1,
FILTER(ArrayIndex + FirstRow, ArrayBoolIncorrect <> FALSE)
Lastly, if you want a dynamic table to summarize the errors, use this:
=LET(
MultiLevelID, A2:A9,
Account, B2:B9,
sParent, "Blue",
sChild, "Red",
FirstRow, MIN(ROW(MultiLevelID)) - 1,
ArrayIndex, SEQUENCE(ROWS(MultiLevelID)),
ArrayBoolParent, ISNUMBER(XMATCH(MultiLevelID, sParent)),
ArrayParentIndex, IF(ArrayBoolParent=TRUE,ArrayIndex,0),
ArrayCorrectedIndexes, XLOOKUP(ArrayIndex,ArrayParentIndex,ArrayParentIndex,,-1),
ArrayCorrectAccount, XLOOKUP(ArrayCorrectedIndexes,ArrayIndex,Account),
ArrayBoolIncorrect, IF(Account=ArrayCorrectAccount,FALSE,TRUE),
ArrayInccorrectIndexes, FILTER(ArrayIndex, ArrayBoolIncorrect <> FALSE),
ArrayInccorrectRows, ArrayInccorrectIndexes + FirstRow,
Categories, ArrayInccorrectIndexes,
Body_Instance, SEQUENCE(ROWS(Categories)),
Body_Parent, INDEX(MultiLevelID,INDEX(ArrayCorrectedIndexes, Categories)),
Body_Child, INDEX(MultiLevelID, Categories),
Body_IncorrectAcct, INDEX(Account, Categories),
Body_CorrectAcct, INDEX(ArrayCorrectAccount, Categories),
Body_IncorrectRows, ArrayInccorrectRows,
Total_Rows, COUNT(Categories),
Array_Seq, {1,2,3,4,5,6},
Array_Header, CHOOSE( Array_Seq, "Instance", "Row Location", "Parent", "Child", "Correct Acct", "Incorrect Acct", ),
Array_Body, CHOOSE( Array_Seq, Body_Instance, Body_IncorrectRows, Body_Parent, Body_Child, Body_CorrectAcct, Body_IncorrectAcct, ),
Array_Total, CHOOSE( Array_Seq, "", "", "", "", "Count", Total_Rows),
Range1,Array_Header,
Range2,Array_Body,
Range3,Array_Total,
Rows1,ROWS(Range1), Rows2,ROWS(Range2), Rows3,ROWS(Range3), Cols1,COLUMNS(Range1),
RowIndex, SEQUENCE(Rows1 + Rows2 + Rows3), ColIndex,SEQUENCE(1, Cols1),
RangeTable,IF(
RowIndex<=Rows1,
INDEX(Range1,RowIndex,ColIndex),
IF(RowIndex<=Rows1+Rows2,
INDEX(Range2,RowIndex-Rows1,ColIndex),
INDEX(Range3,RowIndex-Rows1-Rows2,ColIndex)
)),
Return, RangeTable,
Return
)
Upvotes: 0