Chris L

Reputation: 103

Remove duplicates from list using criteria

I want to remove duplicate filenames from a list that contains:

http://www.test.com/download/imagename_A.jpg
http://www.test.com/download/imagename_B.jpg
http://www.test.com/download/imagename_C.jpg
http://fc07.test.net/fs49/f/2009/216/6/f/imagename_A.jpg
http://fc09.test.net/fs49/f/2009/195/d/8/imagename_B.jpg

When two URLs share the same filename, I want to keep the one on the .net domain and drop the one on the .com domain. URLs whose filename appears only once should be kept as they are. That should result in this final list:

http://fc07.test.net/fs49/f/2009/216/6/f/imagename_A.jpg
http://fc09.test.net/fs49/f/2009/195/d/8/imagename_B.jpg
http://www.test.com/download/imagename_C.jpg 

I suspect this can be done with LINQ (I found this article - Find Duplicate in list but with criteria), but I don't know enough about LINQ to make it work for me.

Upvotes: 0

Views: 394

Answers (2)

Wheater

Reputation: 58

You can use string.Split('/') to split each URL (after converting it to a string, if it is a Uri) and compare file names by checking the last element of the resulting array. The host is the third element of that array (index 2), because the "//" after "http:" produces an empty element. Split the host with string.Split('.') and check whether its last element is "net" or "com".
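The splitting approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: every URL is absolute (it starts with a scheme such as http://), has no port or query string, and the names UrlDedupSketch, FileNameOf, TopLevelDomainOf and PreferNet are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UrlDedupSketch
{
    // Hypothetical helper: last '/'-separated segment, e.g. "imagename_A.jpg".
    public static string FileNameOf(string url)
    {
        string[] parts = url.Split('/');
        return parts[parts.Length - 1];
    }

    // Hypothetical helper: last '.'-separated label of the host, e.g. "net".
    // "http://host/path".Split('/') yields "http:", "", "host", "path",
    // so the host sits at index 2 (assumes an absolute URL without a port).
    public static string TopLevelDomainOf(string url)
    {
        string[] parts = url.Split('/');
        string[] labels = parts[2].Split('.');
        return labels[labels.Length - 1];
    }

    // Keeps one URL per file name, replacing a non-.net URL with a .net one
    // when both share the same file name. File names keep first-seen order.
    public static List<string> PreferNet(IEnumerable<string> urls)
    {
        var order = new List<string>();                 // file names in first-seen order
        var chosen = new Dictionary<string, string>();  // file name -> URL kept so far

        foreach (string url in urls)
        {
            string name = FileNameOf(url);
            string current;
            if (!chosen.TryGetValue(name, out current))
            {
                order.Add(name);
                chosen[name] = url;
            }
            else if (TopLevelDomainOf(url) == "net" && TopLevelDomainOf(current) != "net")
            {
                chosen[name] = url;
            }
        }

        return order.Select(n => chosen[n]).ToList();
    }
}
```

Calling UrlDedupSketch.PreferNet(urls) on the five URLs from the question should keep the two .net URLs for imagename_A.jpg and imagename_B.jpg and the .com URL for imagename_C.jpg. Parsing with System.Uri is likely more robust than manual splitting if the URLs can contain ports or query strings.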

Upvotes: 0

I4V

Reputation: 35353

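// Requires: using System; using System.IO; using System.Linq;
// Group the URLs by file name, then within each group prefer a host ending in ".net".
var result = urls.GroupBy(url => Path.GetFileName(url))
                 .Select(g => g.OrderByDescending(u => new Uri(u).DnsSafeHost.EndsWith(".net")).First())
                 .ToList();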

Upvotes: 2
