Collections manipulation, need help optimizing this code from a report generator

Question

I'm creating a report-generating tool that uses custom data types from different sources in our system. The user can create a report schema and, depending on what is asked, the data gets associated based on different index keys, times, time ranges, etc. The project is NOT running queries against a relational database; it's pure C# code working on in-memory collections.

I have a huge performance issue. I've been looking at my code for a few days and am struggling to optimize it.

I stripped the code down to a minimal example of what the profiler points to as the problematic algorithm; the real version is a bit more complex, with more conditions and date handling.

In short, this function returns the subset of "values" that satisfy the conditions, based on the keys of the values that were selected from the "index rows".

private List GetAssociatedValues(IReadOnlyCollection> indexRows, List values)
{
    var checkContainers = ((ValueColumn.LinkKeys & ReportLinkKeys.ContainerId) > 0 &&
                           values.Any(t => t.ContainerId.HasValue));

    var checkEnterpriseId = ((ValueColumn.LinkKeys & ReportLinkKeys.EnterpriseId) > 0 &&
                             values.Any(t => t.EnterpriseId.HasValue));

    var ret = new List();
    foreach (var value in values)
    {
        var valid = true;

        foreach (var index in indexRows)
        {
            // ContainerId
            var indexConservedSource = index.AsEnumerable();
            if (checkContainers && index.CheckContainer && value.ContainerId.HasValue)
            {
                indexConservedSource = indexConservedSource.Where(t => t.ContainerId.HasValue && t.ContainerId.Value == value.ContainerId.Value);
                if (!indexConservedSource.Any())
                {
                    valid = false;
                    break;
                }
            }

            //EnterpriseId
            if (checkEnterpriseId && index.CheckEnterpriseId && value.EnterpriseId.HasValue)
            {
                indexConservedSource = indexConservedSource.Where(t => t.EnterpriseId.HasValue && t.EnterpriseId.Value == value.EnterpriseId.Value);
                if (!indexConservedSource.Any())
                {
                    valid = false;
                    break;
                }
            }
        }

        if (valid)
            ret.Add(value);
    }

    return ret;
}

This works for small samples, but as soon as I have thousands of values and 2-3 index rows with a few dozen values each, the report can take hours to generate.

As you can see, I try to break as soon as an index condition fails and move on to the next value.

I could probably do everything in a single "values.Where(####).ToList()", but that condition gets complex fast.

I tried building an IQueryable around indexConservedSource, but it was even worse. I also tried a Parallel.ForEach with a ConcurrentBag for "ret", and that was slower too.

What else can be done?
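
One direction that may be worth trying, sketched here under assumptions: build hash-based lookups once per index row, so each value is checked with a few O(1) set lookups instead of re-filtering the index row with LINQ on every iteration. The types Item, IndexRow and IndexLookup, the Items property, and the linkOnContainer / linkOnEnterprise parameters are hypothetical stand-ins, since the real types and ValueColumn.LinkKeys are not shown in the post. The pair set preserves the chained behaviour of the original loop, where the EnterpriseId filter is applied to the rows that already matched on ContainerId.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real value/index item type (not shown in the post).
public record Item(int? ContainerId, int? EnterpriseId);

// Hypothetical stand-in for one "index row": its flags plus the items it holds.
public class IndexRow
{
    public bool CheckContainer { get; init; }
    public bool CheckEnterpriseId { get; init; }
    public List<Item> Items { get; init; } = new();
}

// Lookup sets built once per index row.
public class IndexLookup
{
    private readonly bool _checkContainer;
    private readonly bool _checkEnterpriseId;
    private readonly HashSet<int> _containers = new();
    private readonly HashSet<int> _enterprises = new();
    private readonly HashSet<(int, int)> _pairs = new();

    public IndexLookup(IndexRow row)
    {
        _checkContainer = row.CheckContainer;
        _checkEnterpriseId = row.CheckEnterpriseId;
        foreach (var item in row.Items)
        {
            if (item.ContainerId.HasValue) _containers.Add(item.ContainerId.Value);
            if (item.EnterpriseId.HasValue) _enterprises.Add(item.EnterpriseId.Value);
            if (item.ContainerId.HasValue && item.EnterpriseId.HasValue)
                _pairs.Add((item.ContainerId.Value, item.EnterpriseId.Value));
        }
    }

    // Same decision as the nested loop body in the post, without re-scanning the row.
    public bool Matches(Item value, bool checkContainers, bool checkEnterpriseId)
    {
        bool byContainer = checkContainers && _checkContainer && value.ContainerId.HasValue;
        bool byEnterprise = checkEnterpriseId && _checkEnterpriseId && value.EnterpriseId.HasValue;

        if (byContainer && byEnterprise)
            return _pairs.Contains((value.ContainerId.Value, value.EnterpriseId.Value));
        if (byContainer) return _containers.Contains(value.ContainerId.Value);
        if (byEnterprise) return _enterprises.Contains(value.EnterpriseId.Value);
        return true; // nothing to check against this index row
    }
}

public static class AssociatedValues
{
    // linkOnContainer / linkOnEnterprise stand in for the ValueColumn.LinkKeys flag tests.
    public static List<Item> Filter(IReadOnlyCollection<IndexRow> indexRows, List<Item> values,
                                    bool linkOnContainer, bool linkOnEnterprise)
    {
        bool checkContainers = linkOnContainer && values.Any(v => v.ContainerId.HasValue);
        bool checkEnterpriseId = linkOnEnterprise && values.Any(v => v.EnterpriseId.HasValue);

        // Each index row is enumerated exactly once here.
        var lookups = indexRows.Select(r => new IndexLookup(r)).ToList();

        return values
            .Where(v => lookups.All(l => l.Matches(v, checkContainers, checkEnterpriseId)))
            .ToList();
    }
}
```

Because every index row is enumerated only once while the lookups are built, this would also avoid repeated work if the index rows happen to be lazily evaluated sequences that re-run on each Any() call. Whether that is the case in the real code is an assumption that the profiler would have to confirm.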
