Reputation: 31
I have two columns each containing a list of URLs (over 50,000), some of which are duplicates. A screenshot with simplified URLs: http://tinypic.com/r/nevmzn/8.
Note that the first row has two URLs that are slightly different but each contain the same domain: www.apple.com
and http://www.apple.com
direct to the same site, therefore the third column should reflect that.
Is there a way where I can set it up where the third column (both Columns) will automatically sort which URLs are duplicates and which aren't?
Is there a formula I could use, or an Excel feature that will help me automatically do this?
Upvotes: 2
Views: 69
Reputation: 35990
If you just want to strip the http:// part from the comparison, you can use
=IF(SUBSTITUTE(A1,"http://","")=SUBSTITUTE(B1,"http://",""),SUBSTITUTE(B1,"http://",""),"no")
Upvotes: 2
Reputation: 8670
You can use =IF(RIGHT(A1,LEN(A1)-FIND(".",A1))=RIGHT(B1,LEN(B1)-FIND(".",B1)),RIGHT(A1,LEN(A1)-FIND(".",A1)),"No")
=RIGHT(A1,LEN(A1)-FIND(".",A1))
will return website.com
from a cell even if it starts with http://
or http://www.
or www.
The above formula just compares the website from each column and if it is a match, it will return the website, otherwise return "No"
Upvotes: 0