webo80

Reputation: 3393

Performance improvement in Core Data relationship

I have two pre-populated Core Data entities (around 50k records each) that share a relationship and its inverse, and I need to populate that relationship. It's almost a 1:1 relation: the two entities have an attribute in common, and two objects should be related whenever that attribute is equal.

I'm trying to do it in a brute-force way, and I'm running into a lot of memory issues (it quickly escalates to memory warnings).

@autoreleasepool {        
    NSFetchRequest *e2sRequest = [[NSFetchRequest alloc] initWithEntityName:@"Entity2"];
    e2sRequest.includesPropertyValues = NO;
    e2sRequest.includesSubentities = NO;
    NSArray *e2s = [self.fatherMOC executeFetchRequest:e2sRequest error:nil];

    if(e2s.count > 0) {
        NSFetchRequest *e1sRequest = [[NSFetchRequest alloc] initWithEntityName:@"Entity1"];
        e1sRequest.includesPropertyValues = NO;
        e1sRequest.includesSubentities = NO;
        NSArray *e1s = [self.fatherMOC executeFetchRequest:e1sRequest error:nil];

        for(Entity1 *e1 in e1s) {
            NSString *attributeInCommon = e1.attributeInCommon;
            NSPredicate *predicate = [NSPredicate predicateWithFormat:@"attributeInCommon = %@", attributeInCommon];
            Entity2 *e2matching = (Entity2 *)[e2s filteredArrayUsingPredicate:predicate].lastObject;
            if(e2matching) {
                e1.e2 = e2matching;
            }
        }
    }
}

I've tried holding the common attribute and the objectID in memory in an NSDictionary, with no luck. I've tried a couple of other approaches, some terribly slow, others terrible memory hogs.

I know that I should check the errors, and I know I could do this in fewer lines of code, but think of it as quick debug code; it'll be fixed later.

Thanks in advance

Upvotes: 1

Views: 267

Answers (2)

Mundi

Reputation: 80265

I suppose this operation (matching 50,000 entities against 50,000 other entities based on a common string attribute that acts as a unique key) is not something you want to repeat on users' devices. Rather, it seems you need to do it once, when preparing the seed data.

Therefore there is actually no need to optimize, because time and (on the simulator) memory won't be an issue.

So just perform this in batches, e.g. as follows:

  • fetch 1000 e1
  • fetch 1000 corresponding e2 with a predicate
  • link
  • save
  • drain memory
  • repeat

Some hints:

To get distinct chunks of 1,000 records, add a sort descriptor and use fetchOffset and fetchLimit.

The predicate for getting the records would be something like this.

NSArray *attributes = [e1Results valueForKeyPath:@"attributeInCommon"];
request.predicate =
    [NSPredicate predicateWithFormat:@"attributeInCommon IN %@", attributes];
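Putting the batching steps together, a minimal sketch could look like the following. It assumes the entity and attribute names from the question; `moc` is a hypothetical managed object context, and error handling is omitted for brevity:

```objc
// One-off seeding pass: link Entity1 to Entity2 in chunks of 1,000.
// Assumes entity/attribute names from the question; "moc" is hypothetical.
static const NSUInteger kChunkSize = 1000;
NSUInteger offset = 0;

while (YES) {
    @autoreleasepool {
        NSFetchRequest *e1Request = [[NSFetchRequest alloc] initWithEntityName:@"Entity1"];
        e1Request.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"attributeInCommon"
                                                                    ascending:YES]];
        e1Request.fetchOffset = offset;
        e1Request.fetchLimit = kChunkSize;
        NSArray *e1Results = [moc executeFetchRequest:e1Request error:NULL];
        if (e1Results.count == 0) break; // all chunks processed

        // Fetch only the Entity2 rows that can match this chunk.
        NSArray *attributes = [e1Results valueForKeyPath:@"attributeInCommon"];
        NSFetchRequest *e2Request = [[NSFetchRequest alloc] initWithEntityName:@"Entity2"];
        e2Request.predicate = [NSPredicate predicateWithFormat:@"attributeInCommon IN %@",
                               attributes];
        NSArray *e2Results = [moc executeFetchRequest:e2Request error:NULL];

        // Index this chunk's Entity2 objects by the common attribute for O(1) lookup.
        NSDictionary *e2ByAttribute =
            [NSDictionary dictionaryWithObjects:e2Results
                                        forKeys:[e2Results valueForKeyPath:@"attributeInCommon"]];
        for (Entity1 *e1 in e1Results) {
            Entity2 *match = e2ByAttribute[e1.attributeInCommon];
            if (match) e1.e2 = match;
        }

        [moc save:NULL];        // persist the batch…
        offset += kChunkSize;   // …then let the pool drain it
    }
}
```

Paginating with fetchOffset stays stable here because the loop only sets a relationship and never modifies the sort key.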

Upvotes: 1

Wain

Reputation: 119031

You're trying to load 100,000 items all at the same time, so it's no wonder you have memory issues.

You need to batch, and if you create an autorelease pool you need to let it drain periodically (so it needs to wrap each batch).

So, set a fetchBatchSize on the first fetch request. Then iterate over the results, taking fetchBatchSize items at a time. This is where the pool should go, so it's released after each batch. Start with a batch size of 100 and see how it goes.

Each batch then makes the second query with a predicate to limit to the set of values that can actually match with the current batch.

Then run your matching logic.
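A rough sketch of that structure, reusing the names from the question (the batch size and the `moc` context are assumptions, not part of the answer):

```objc
// Sketch: iterate Entity1 with fetchBatchSize, wrapping each batch in its own pool.
static const NSUInteger kBatchSize = 100; // starting point; tune as needed

NSFetchRequest *e1Request = [[NSFetchRequest alloc] initWithEntityName:@"Entity1"];
e1Request.fetchBatchSize = kBatchSize; // rows are faulted in kBatchSize at a time
NSArray *e1s = [moc executeFetchRequest:e1Request error:NULL];

for (NSUInteger start = 0; start < e1s.count; start += kBatchSize) {
    @autoreleasepool { // drained after each batch
        NSRange range = NSMakeRange(start, MIN(kBatchSize, e1s.count - start));
        NSArray *batch = [e1s subarrayWithRange:range];

        // Second query, limited to values that can actually match this batch.
        NSFetchRequest *e2Request = [[NSFetchRequest alloc] initWithEntityName:@"Entity2"];
        e2Request.predicate = [NSPredicate predicateWithFormat:@"attributeInCommon IN %@",
                               [batch valueForKeyPath:@"attributeInCommon"]];
        NSArray *e2s = [moc executeFetchRequest:e2Request error:NULL];

        // Matching logic as in the question, restricted to the current batch.
        for (Entity1 *e1 in batch) {
            NSPredicate *p = [NSPredicate predicateWithFormat:@"attributeInCommon = %@",
                              e1.attributeInCommon];
            Entity2 *match = [[e2s filteredArrayUsingPredicate:p] lastObject];
            if (match) e1.e2 = match;
        }
        [moc save:NULL]; // persist before the pool drains the batch
    }
}
```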

Consider also using the Core Data tool in Instruments to check what's happening, how many requests you make to the data store and how long it all takes.

Upvotes: 1
