Reputation: 1610
I have a list of Country
objects that have this design:
public class Country
{
public string Name { get; set; }
public string ISO { get; set; }
}
I need to check if in a particular list a Country
exists or not, so I did:
foreach (var date in dates)
{
var newCountries = GetCountryByDate(date);
foreach (var country in newCountries)
{
if (!countries.Exists(c => c.ISO == country.iso))
countries.Add(new Country
{
Id = country.id,
Name = country.name,
ISO = country.iso,
Link = country.id.ToString()
});
}
}
where newCountries
contains the list of Country
which need to be iterated, and countries
is a List<Country>
filtered by date, infact each date
contains a different list of Countries
.
The problem's that this mechanism is really slow
Upvotes: 0
Views: 82
Reputation: 32455
You are checking for new countries in countries
list by ISO
property.
Instead of looping the list for every new country (which is O(n2)), generate HashSet<string>
for ISO
property which will turn search algorithm into O(n).
var existedISOCodes = countries.Select(country => country.ISO).ToHashSet();
foreach (var date in dates)
{
var newCountries = GetCountryByDate(date);
foreach (var country in newCountries)
{
if (existedISOCodes.Add(country.iso))
{
countries.Add(new Country
{
Id = country.id,
Name = country.name,
ISO = country.iso,
Link = country.id.ToString()
});
}
}
}
HashSet<T>.Add
method will try to add given value to the set and return false
if value already exists.
If you are a fan of LINQ:
var existedISOCodes = countries.Select(country => country.ISO).ToHashSet();
var newCountries =
dates.SelectMany(date => GetCountryByDate(date))
.Where(country => existedISOCodes.Add(country.ISO))
.Select(country =>
{
return new Country
{
Id = country.id,
Name = country.name,
ISO = country.iso,
Link = country.id.ToString()
};
});
countries.AddRange(newCountries);
But I assume actual performance bottleneck is GetCountryByDate
method.
If this method accessing some external resources (database, webservices) and you are not able to get countries for all date in one requests, you probably can turn GetCountryByDate
into asynchronous function then you will be able to get countries for all dates almost simultaneously
var newCountryTasks = dates.Select(date => GetCountryByDateAsync(date));
await Task.WhenAll(newCountryTasks);
var newCountries =
newCountryTasks.SelectMany(task => task.Result)
.Where(country => existedISOCodes.Add(country.ISO))
.Select(country =>
{
return new Country
{
Id = country.id,
Name = country.name,
ISO = country.iso,
Link = country.id.ToString()
};
});
countries.AddRange(newCountries);
Upvotes: 3