Bejasc

Reputation: 960

Grouping a List of objects by property, split into multiple lists.

Current Structure (Cannot change due to requirements of the system)

class Statement
{
    string section;
    string category;
    string statement;
}

Example Statement list

section      category     statement
1            a            apple
1            a            banana
1            b            potato
2            c            car
2            c            bus
2            d            plane

Problem

I start with a List<Statement> and need to split it by section, and then by category, into the following (or similar) structure:

struct SectionCollection
{
    string sectionName {get{return categories[0].statements[0].section;}}
    List<CategoryCollection> categories;
}

struct CategoryCollection
{
    string categoryName {get{return statements[0].category;}}
    List<Statement> statements;
}

So, from the List<Statement>, I should end up with a List<SectionCollection>, each containing a List<CategoryCollection>, each of which contains a List<Statement>.

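So in the above data example, I would have:

SectionCollection 1
    CategoryCollection a: apple, banana
    CategoryCollection b: potato
SectionCollection 2
    CategoryCollection c: car, bus
    CategoryCollection d: plane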

Notes

It's possible for the same statement or category to appear in more than one section. These still need to belong to different SectionCollections.

Attempt

My current attempt works until a NullReferenceException is thrown in the inner foreach loop. This is one of those problems I've been staring at for a while now, so I know how confused this 'solution' may seem.

var sections = statements.GroupBy(x => x.section).Select(y => y.ToList()).ToList();
foreach(var section in sections)
{
    SectionCollection thisSection = new SectionCollection();
    var categories = section.GroupBy(x => x.category).Select(y => y.ToList()).ToList();

    foreach(var category in categories)
    {
        thisSection.categories.Add(new CategoryCollection { statements = category });
    }
}

Upvotes: 5

Views: 510

Answers (2)

RagtimeWilly

Reputation: 5445

You're getting a null reference error because the categories list in SectionCollection is never initialized.

Changing:

SectionCollection thisSection = new SectionCollection();

To:

SectionCollection thisSection = new SectionCollection()
{
    categories = new List<CategoryCollection>()
};

Will fix the error. You're also not capturing the result anywhere. If you update your code to the below, it should work:

var sections = statements.GroupBy(x => x.section).Select(y => y.ToList()).ToList();

// Collects one SectionCollection per section.
var result = new List<SectionCollection>();

foreach (var section in sections)
{
    // Initialize the list up front so that Add does not hit a null reference.
    SectionCollection thisSection = new SectionCollection() { categories = new List<CategoryCollection>() };

    var categories = section.GroupBy(x => x.category).Select(y => y.ToList()).ToList();

    foreach (var category in categories)
    {
        thisSection.categories.Add(new CategoryCollection { statements = category });
    }

    result.Add(thisSection);
}

But it might be a bit cleaner to give the classes proper constructors and properties, and to move some of the logic there:

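using System.Collections.Generic;
using System.Linq;

internal class Program
{
    static void Main(string[] args)
    {
        // Sample data matching the example table in the question.
        var statements = new List<Statement>()
        {
            new Statement(1, "a", "apple"),
            new Statement(1, "a", "banana"),
            new Statement(1, "b", "potato"),
            new Statement(2, "c", "car"),
            new Statement(2, "c", "bus"),
            new Statement(2, "d", "plane")
        };

        // One SectionCollection per distinct section. The constructor filters
        // the full list down to that section and groups it by category.
        var sectionCollections = statements
            .GroupBy(s => s.Section)
            .Select(group => new SectionCollection(group.Key, statements))
            .ToList();
    }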

    public class Statement
    {
        public Statement(int section, string category, string statementName)
        {
            Section = section;
            Category = category;
            StatementName = statementName;
        }

        public int Section { get; }

        public string Category { get; }

        public string StatementName { get; }
    }

    public class SectionCollection
    {
        public SectionCollection(int sectionName, List<Statement> statements)
        {
            SectionName = sectionName;

            Categories = statements
                .Where(s => s.Section == sectionName)
                .GroupBy(s => s.Category)
                .Select(group => new CategoryCollection(group.Key, group.ToList()))
                .ToList();
        }

        public int SectionName { get; }

        public List<CategoryCollection> Categories { get; }
    }

    public class CategoryCollection
    {
        public CategoryCollection(string categoryName, List<Statement> statements)
        {
            CategoryName = categoryName;
            Statements = statements;
        }

        public string CategoryName { get; }

        public List<Statement> Statements { get; }
    }
}

You'll end up with the following structure:

[Image: output structure]
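
If the original types really cannot change, the same grouping can also be written as one nested LINQ query. This is only a minimal sketch: it assumes the fields from the question are public, and the StatementGrouping class and GroupBySectionThenCategory method are hypothetical names made up for illustration. Because the inner GroupBy runs on each section's group, a category that appears in two sections still ends up in two separate SectionCollections.

```csharp
using System.Collections.Generic;
using System.Linq;

// Mirrors the question's types; public fields are an assumption.
public class Statement
{
    public string section;
    public string category;
    public string statement;
}

public struct CategoryCollection
{
    public List<Statement> statements;
    public string categoryName { get { return statements[0].category; } }
}

public struct SectionCollection
{
    public List<CategoryCollection> categories;
    public string sectionName { get { return categories[0].statements[0].section; } }
}

// Hypothetical helper, not part of the original code.
public static class StatementGrouping
{
    public static List<SectionCollection> GroupBySectionThenCategory(IEnumerable<Statement> statements)
    {
        return statements
            // Outer grouping: one group per section.
            .GroupBy(s => s.section)
            .Select(sectionGroup => new SectionCollection
            {
                // Inner grouping: only the statements of this section,
                // split by category.
                categories = sectionGroup
                    .GroupBy(s => s.category)
                    .Select(categoryGroup => new CategoryCollection
                    {
                        statements = categoryGroup.ToList()
                    })
                    .ToList()
            })
            .ToList();
    }
}
```

Calling StatementGrouping.GroupBySectionThenCategory(statements) on the example list should give two sections with two categories each. Both lists are always assigned in the initializers, so nothing is left null.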

Upvotes: 3

MFisherKDX

Reputation: 2866

You create a new SectionCollection object with:

SectionCollection thisSection = new SectionCollection();

But you never initialize thisSection.categories using new -- neither in a constructor nor explicitly afterwards.

So when you call thisSection.categories.Add in your inner loop, categories is still null and a NullReferenceException is thrown.
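
To make this concrete, here is a small sketch of the difference. It uses simplified stand-ins for the question's structs with public fields (an assumption), and InitializationDemo is a hypothetical name used only for illustration. A struct created with new SectionCollection() has all its fields zeroed, so a reference-type field such as a List starts out as null until something assigns it.

```csharp
using System.Collections.Generic;

// Simplified stand-ins for the question's structs; public fields are an assumption.
public struct CategoryCollection
{
    public List<string> statements;
}

public struct SectionCollection
{
    public List<CategoryCollection> categories;
}

// Hypothetical helper class, for illustration only.
public static class InitializationDemo
{
    // The struct's fields are zero-initialized, so the list reference is null.
    public static bool CategoriesIsNullByDefault()
    {
        var section = new SectionCollection();
        return section.categories == null;
    }

    // Assigning the list in an object initializer makes Add safe to call.
    public static SectionCollection CreateInitialized()
    {
        return new SectionCollection
        {
            categories = new List<CategoryCollection>()
        };
    }
}
```

With CreateInitialized(), calling categories.Add(...) on the returned value works, whereas the same call on a bare new SectionCollection() would throw.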

Upvotes: 1
