Reputation: 103
I want to remove duplicate filenames from a list that contains:
http://www.test.com/download/imagename_A.jpg
http://www.test.com/download/imagename_B.jpg
http://www.test.com/download/imagename_C.jpg
http://fc07.test.net/fs49/f/2009/216/6/f/imagename_A.jpg
http://fc09.test.net/fs49/f/2009/195/d/8/imagename_B.jpg
When two URLs share the SAME filename, I want to keep the one on the .net domain and drop the one on the .com domain, resulting in this final list:
http://fc07.test.net/fs49/f/2009/216/6/f/imagename_A.jpg
http://fc09.test.net/fs49/f/2009/195/d/8/imagename_B.jpg
http://www.test.com/download/imagename_C.jpg
I suspect this can be done with LINQ (I found this article - Find Duplicate in list but with criteria), but I don't know enough about LINQ to make it work for me.
Upvotes: 0
Views: 394
Reputation: 58
You can use string.Split('/') to split the URL (after converting it to a string) on "/", then compare file names by looking at the last element of the resulting array. To check the domain, split the third element of that array (index 2, the host name) with string.Split('.') and check whether its last element is "net" or "com".
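The splitting approach described here could look roughly like the sketch below. The class and method names (UrlDeduper, PreferNet) are made up for illustration, and the code assumes every entry is an absolute URL with a scheme, so that the host name always lands at index 2 after splitting on "/".

```csharp
using System;
using System.Collections.Generic;

public static class UrlDeduper
{
    // Hypothetical helper: keeps one URL per file name, preferring hosts ending in ".net".
    public static List<string> PreferNet(IEnumerable<string> urls)
    {
        var chosen = new Dictionary<string, string>(); // file name -> URL kept so far
        var order = new List<string>();                // file names in first-seen order

        foreach (var url in urls)
        {
            // "http://www.test.com/download/a.jpg" splits into
            // ["http:", "", "www.test.com", "download", "a.jpg"]
            string[] parts = url.Split('/');
            string fileName = parts[parts.Length - 1];

            // Assumption: absolute URL, so parts[2] is the host name.
            string[] hostParts = parts[2].Split('.');
            bool isNet = hostParts[hostParts.Length - 1] == "net";

            if (!chosen.ContainsKey(fileName))
            {
                chosen[fileName] = url;
                order.Add(fileName);
            }
            else if (isNet)
            {
                chosen[fileName] = url; // a .net URL replaces whatever was kept
            }
        }

        var result = new List<string>();
        foreach (var name in order)
        {
            result.Add(chosen[name]);
        }
        return result;
    }
}
```

Calling UrlDeduper.PreferNet on the five URLs from the question should keep the two .net entries for imagename_A.jpg and imagename_B.jpg and the .com entry for imagename_C.jpg. If two .net URLs share a file name, this sketch keeps the later one.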
Upvotes: 0
Reputation: 35353
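// Requires: using System; using System.IO; using System.Linq;
// Group the URLs by file name, then within each group take a .net URL first if there is one.
var result = urls.GroupBy(url => Path.GetFileName(url))
                 .Select(g => g.OrderByDescending(u => new Uri(u).DnsSafeHost.EndsWith(".net")).First())
                 .ToList();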
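One possible variation on the query above, in case some URLs carry a query string: Path.GetFileName on the raw string would include the "?..." part in the file name, so this sketch parses each URL with Uri first and takes the file name from AbsolutePath. The names UrlFilter and PreferNetByFileName are hypothetical, and the case-insensitive host check is an added assumption, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class UrlFilter
{
    // Hypothetical wrapper: one URL per file name, .net hosts preferred.
    public static List<string> PreferNetByFileName(IEnumerable<string> urls)
    {
        return urls
            .Select(u => new Uri(u))
            // AbsolutePath excludes the query string and fragment.
            .GroupBy(uri => Path.GetFileName(uri.AbsolutePath))
            // true sorts after false, so descending order puts .net hosts first.
            .Select(g => g
                .OrderByDescending(uri => uri.Host.EndsWith(".net", StringComparison.OrdinalIgnoreCase))
                .First()
                .OriginalString)
            .ToList();
    }
}
```

GroupBy yields groups in the order their keys first appear, so the result follows the file-name order of the input list.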
Upvotes: 2