pcct2020
pcct2020

Reputation: 71

How to find non matching records from two columns while accounting for duplicate values in Excel

I have two large columns.

Column A contains 100,000 different numbers/rows. Column B contains 100,210 numbers/rows. They have the same numbers except column B has 210 extra rows. I need to be able get the values of that extra 210 rows.

The issue im having is that the numbers in these rows are not unique. For example,

Column A contains the following numbers: 2,1,3,4,5,5,6,7

Column B contains the following numbers: 1,2,3,4,5,5,5,5,6,6,7,8

I want the outcome result to be: 5,5,6,8

I can't seem to wrap my head around a way to do this.

I have the two columns in a text file that im importing into excel. If there are better ways to do it outside of excel, I am open to it too.

Upvotes: 1

Views: 3955

Answers (3)

cool Quazi
cool Quazi

Reputation: 248

=IF(COUNTIF($B:$B, $A2)=0, "No match in B", "")

Upvotes: 0

Scott Craner
Scott Craner

Reputation: 152660

With the Dynamic Array formula Filter:

=FILTER(B1:B12,COUNTIF(OFFSET(B1,0,,SEQUENCE(ROWS(B1:B12))),B1:B12)>COUNTIF(A:A,B1:B12))

enter image description here


Without FILTER:

Put this in the first cell and copy down:

=IFERROR(INDEX(B:B,AGGREGATE(15,7,ROW(B1:B12)/(COUNTIF(OFFSET(B1,0,,ROW(INDEX($ZZ:$ZZ,1):INDEX($ZZ:$ZZ,ROWS(B1:B12)))),B1:B12)>COUNTIF(A:A,B1:B12)),ROW($ZZ1))),"")

enter image description here

Upvotes: 1

Panos
Panos

Reputation: 609

Try to follow these steps, supposing that Column A has less values than the Column B and the rows start at 1:

A. Create Column C.

  • In the cell C1 place the function: =COUNTIF(A:A;B1)
  • Copy this function to the rest of cells, for all items of Column B. So, cell C2 will have the function =COUNTIF(A:A;B2) and so on.

B. Create column D.

  • In the cell D1 place the function: =COUNTIF($B1:$B1;B1)
  • Copy this function to the rest of cells, for all items of Column B. So, cell D2 will have the function =COUNTIF($B$1:$B2;B2) and so on.

C. Create column E.

  • In the cell E1 place the function: =IF(D1<=C1,"Exists","Missing")
  • Copy this function to the rest of cells, for all items of Column B. So, cell E2 will have the function =IF(D2<=C2,"Exists","Missing") and so on.

D. Filter to show only the rows that Column E values are "Missing".

Of course you can combine all above 3 columns to one (e.g. in Column F), so these cells will have the functions:

  • F1: =IF(COUNTIF($B$1:$B1,B1)<=COUNTIF(A:A,B1),"Exists","Missing")
  • F2: =IF(COUNTIF($B$1:$B2,B2)<=COUNTIF(A:A,B2),"Exists","Missing")
  • and so on

Explanation:

  • In column C we count how many times the value of the respective cell of Column B exist in the whole Column A.
  • In Column D we count how many times we have "met" this value in Column B so far.
  • In Column E we check if we have "met" the value more times that it exists in Column A. If indeed we have "met" it more times, then we mark the cell as "missing"

Tested with the example you provided and works okay.

I hope it helps! Good luck!

EDIT - Addition of Screenshot

enter image description here

Upvotes: 0

Related Questions