little_endian
little_endian

Reputation: 69

Trying to optimize some looping code

I have to NSArrays:

~~~~~

EDIT2 - added:

both my arrays get filled by parsing separate csv files, so initially I have no clue about any relationship between customers and orders.

~~~~~

For every customer I try to determine its orders, more specifically its last order (date).

I estimate that more than half of my customers do not have any orders, some have several, others a lot.

At the very beginning I had 2 nested loops, the outer one iterating over all customers, the inner one iterating over all orders. Which ended up in more than (3000 * 8000) comparisons (confer the attached code).

After some musing I realized that I have only valid orders, i.e. every order has a customer id and for every customer id i have an existing customer with the same id. To reduce the overhead of the inner loop I ordered both my arrays according to their customer ids.

That implies that the first order(s) correspond to my first customers. E.g:

Each corresponding order is then collected in an array until I reach an order, whose customer id does not correspond to my customer's id. I then exit (break) my loop, remove my collected orders from the array containing all the orders (_bestellungenMutArr) and proceed with the next customer.

Removal of the objects from the array is quite fast, as the objects are ALL at the beginning of the big array. (also see graphs stating the performance of the different array operations here at ridiculousfish.

Inspecting the Instrument's time profiler data, I found out that over 99% of the time is spent on the removal of the objects. Instruments output: Instrument's output for my "removeObjectsInArray"-approach

I then came up with the idea to take advantage of the enumerateObjectsUsingBlock's index. Instead of using fast enumeration for the inner loop, I'm using block enumerator. To achieve the same reduction of overhead in my inner loop (i.e. never process an order twice I keep track of the index which I later use to have an offset for the next iteration (for the next customer). This way I circumvent the removal of objects from an array, which I thought might quite a nifty idea.

Inspecting the time profiler output it turned out it wasn't: enter image description here

So using the variant removing objects from array by using removeObjectsInArray method (for about 1500 times) is about 8 times faster, than simply keeping track of an index?

Is this to be expected or am I missing something?

The array removal/fast enumeration variant:

- (void) determineLastOrders
{
    for (Kunde * kunde in _kundenArray)
    {
        NSMutableArray *bestellungenToRemove = [[NSMutableArray alloc] init];

        /* go through all (remaining) orders (after the loop the matching will be removed) and determine the next ones to remove */
        for (Bestellung * bestellung in _bestellungenMutArr)
        {
            if ([[bestellung bestKdNr] isEqualToString:kunde.kdnr])
            {
                if ( kunde.lastOrder == nil)   
                {
                    kunde.lastOrder = _orangeDate;  //"init"
                }
                else if ([kunde.lastOrder compare:[bestellung bestDatum]] == NSOrderedAscending)
                {
                    kunde.lastOrder = [bestellung bestDatum];
                }
                //As this Bestellung already has had a date comparison (equal by kdnr)
                //we won't need to visit it again by our next customer
                [bestellungenToRemove addObject:bestellung];
            }
            else
            {   //as all orders are ordered by the customer id we can abort iteration
                //after we went past the current id
                break;
            }
        }
        [_bestellungenMutArr removeObjectsInArray: bestellungenToRemove];
    }
}

and the checking index / block enumeration variant:

- (void) determineLastOrders
{
    __block NSUInteger bestIndex = 0;
    for (Kunde * __block kunde in _kundenArray)
    {
        /* go through all (remaining) orders (after the loop the matching will be removed) and determine the next ones to remove */
        [_bestellungenMutArr enumerateObjectsUsingBlock: ^(Bestellung * bestellung, NSUInteger idx, BOOL *stop)
         {
            if (idx >= (bestIndex))
            {
                if ([[bestellung bestKdNr] isEqualToString:kunde.kdnr])
                {
                    if ( kunde.lastOrder == nil)
                    {
                        kunde.lastOrder = _orangeDate;  //"init"
                    }
                    else if ([kunde.lastOrder compare:[bestellung bestDatum]] == NSOrderedAscending)
                    {
                        kunde.lastOrder = [bestellung bestDatum];
                    }                
                }
                else
                {   //as all orders are ordered by the customer id we can abort iteration
                    //after we went past the current id
                    bestIndex = idx+1;
                    *stop = YES;
                }
            }
         }];
    }
}

Thanks in advance!

EDIT: One other question just came to my mind. Currently - in my first code snippet, I always call the removeObjectsInArray method after each inner loop. If a customer has no orders, I remove an empty array (i.e. trying to remove nil?). My guess is, that the method exits is instruction to do removal if an empty array is passed, so this is more efficient than checking my small array for contents every loop. Or am I wrong?

Upvotes: 1

Views: 75

Answers (2)

maddy
maddy

Reputation: 4111

one more level of optization:

int count = [_bestellungenMutArr count];
for (NSUInteger i = bestIndex; i < count; i++)

why?

now it will not loop through [_bestellungenMutArr count] every time.

Upvotes: 0

Jesse Rusak
Jesse Rusak

Reputation: 57168

Your second example is better, but you're still enumerating through more orders than you need to for each customer, since enumerateObjectsUsingBlock:... starts at the beginning each time. (Unlike in your first code sample, where the array of orders shrinks for each customer.) Try using enumerateObjectsAtIndexes:... instead, passing an index set made with an NSRange beginning with bestIndex.

Or, you could just use a normal for loop: for (NSUInteger i = bestIndex; i < [_bestellungenMutArr count]; i++) which would probably be faster.

Upvotes: 1

Related Questions