Igorek

Lazy loading in complex RavenDb objects

Looks like we're dealing with a poorly thought-out object design that is now causing significant performance and memory problems.

We have thousands of root aggregate objects stored in a RavenDb database. For certain large customers, these objects are becoming too large for web operations (opening pages, saving data, etc.) to perform well.

Structure is as follows: the Account object is the aggregate root. Underneath it, there is a plethora of smaller objects and collections that are all "fine" in size, except one collection called Resources, which can grow very large and push the root object to multiple megabytes. This makes basic CRUD operations on Account and its internal data very slow.

Objects in the Resources collection are not huge themselves, but they have children of their own, and those drag the size way up. Each Resource object has Metrics, Actions, Alerts, Scaling, and other collections that are "heavy".

Our codebase is super complex, with hundreds of thousands of lines of code. Hundreds, if not thousands, of those lines reference the Resources collection and inspect the Resource objects within it, but access to the underlying child collections of each Resource object appears to be infrequent and mostly done one resource at a time.

Question: How do we load the Account object, all of its miscellaneous children and objects, and only the first level of Resource objects, and then lazy-load sub-children of Resources? (there are like 7 specific collections that can be lazy-loaded)

We have a single Repository that is responsible for loading and saving the data.

Upvotes: 2

Views: 326

Answers (2)

Judah Gabriel Himango

How do we load the Account object, all of its miscellaneous children and objects, and only the first level of Resource objects, and then lazy-load sub-children of Resources? (there are like 7 specific collections that can be lazy-loaded)

OK, my other answer is the recommended way to break up huge objects; just make them their own independent objects.

But, since you said you don't want to do the work to break them up, there's another way you can do this, and that's using a transformer. A transformer won't save Raven from reading the big Account document and all its children on the server, but because the transformer runs on the server, only the trimmed-down result is sent over the network to your web server.

using System.Linq;
using Raven.Client.Indexes;

// Server-side projection: returns the Account with only first-level Resource data.
public class AccountWithFirstLevelResourcesTransformer : AbstractTransformerCreationTask<Account>
{
    public AccountWithFirstLevelResourcesTransformer()
    {
        TransformResults = accs => from acc in accs
                                   select new Account
                                   {
                                       ...
                                       Resources = acc.Resources.Select(fullResource => new Resource
                                       {
                                           // Only the properties we want loaded here.
                                           // Leave out the heavy child collections.
                                           Name = fullResource.Name,
                                           ...
                                       }),
                                       ...
                                   };
    }
}

You'll install this transformer during startup:

new AccountWithFirstLevelResourcesTransformer().Execute(RavenStore); // RavenStore is your IDocumentStore singleton.

Then your .Load calls will look like:

// This account will have only the first level resources.
var account = dbSession.Load<AccountWithFirstLevelResourcesTransformer, Account>("accounts/1");

Upvotes: 0

Judah Gabriel Himango

How do we load the Account object, all of its miscellaneous children and objects, and only the first level of Resource objects, and then lazy-load sub-children of Resources? (there are like 7 specific collections that can be lazy-loaded)

It's pretty simple to do load-on-demand with Raven. Make the things you want lazy loaded their own documents, then keep just a collection of their IDs on the parent Resource.

Before:

class Resource
{
   public List<Foo> Foos { get; set; }
   public List<Bar> Bars { get; set; }
   // ... etc
}

After:

class Resource
{
   // These are the things we need to lazy load.
   public List<string> FooIds { get; set; }
   public List<string> BarIds { get; set; }
}

As for your Foo and Bar objects (the lazy loaded children of Resource), you'll need to .Store them as their own documents.

Once you do this, loading a Resource won't load all its child objects, giving you perf gains when reading and writing.

But what about when you need to load those children? Use .Include:

// Query for Resource and include the children in a single remote call.
var resourcesWithChildren = docSession
   .Query<Resource>()
   .Include(r => r.FooIds) // Include the related Foos
   .Include(r => r.BarIds) // Include the related Bars
   .Where(...)
   .ToList();


foreach (var resource in resourcesWithChildren)
{
    // Grab the children; they're already loaded, so this won't induce a remote call.
    var foos = docSession.Load<Foo>(resource.FooIds);
    var bars = docSession.Load<Bar>(resource.BarIds);
}
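
Since you said the child collections are mostly touched one resource at a time, you could also resolve the ID lists on first access instead of including them up front. Here's a minimal sketch of that idea. Note that `LazyChildren<T>` and the `loadByIds` delegate are hypothetical names of my own, not part of RavenDB; the delegate stands in for whatever your Repository uses to fetch documents by ID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of RavenDB: turns a list of document IDs into
// child objects the first time Value is read, then caches the result.
public sealed class LazyChildren<T>
{
    private readonly Lazy<IReadOnlyList<T>> _children;

    public LazyChildren(
        IEnumerable<string> ids,
        Func<IReadOnlyList<string>, IEnumerable<T>> loadByIds)
    {
        // Copy the IDs so later changes to the source list don't affect this instance.
        var idList = ids.ToList();

        // Nothing is fetched here; the delegate only runs on first access to Value.
        _children = new Lazy<IReadOnlyList<T>>(() => loadByIds(idList).ToList());
    }

    // True once the children have been fetched.
    public bool IsLoaded => _children.IsValueCreated;

    // Fetches the children on first access; later reads return the cached list.
    public IReadOnlyList<T> Value => _children.Value;
}
```

Assuming the session is still open when the children are first read, usage could look like `new LazyChildren<Foo>(resource.FooIds, ids => docSession.Load<Foo>(ids))`. The remote call then happens only for the resources whose children you actually touch.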

Upvotes: 3
