Matthew Morey

I'm an engineer, developer, author, hacker, creator, tinkerer, traveler, snowboarder, surfer, and husband.

I create iOS apps professionally and independently.

Core Data performance is a balance

17 November 2013

This is the first in a series of articles on Core Data performance. The material is based on the talk High Performance Core Data, which I gave at CocoaConf in Atlanta on November 15th, 2013. A one-hour talk is not enough time to cover such an advanced topic, so I will continue the conversation here.

Like most things, a fast Core Data implementation is a balance. The more objects you keep in memory, the faster your app will be, but the more memory it will use. You can minimize memory usage, but then your app will be slower.

If you buy speed with too much memory, your app will receive low-memory warnings, and if it has to, the system will terminate the offending app. You can reduce memory usage by keeping more objects on disk instead, but then your app will be slower because of frequent, slow disk access.

Core Data performance is a balance between memory and speed

An OS X system has much more memory at its disposal. With more objects in memory, slow disk access is less frequent, which means quicker operations. Currently the cheapest MacBook Air and MacBook Pro come with at least 4 GB of RAM. For most apps, 4 GB of RAM is a limit that should never be hit. Of course there are exceptions: audio-, photo-, and video-heavy apps come to mind.

On iOS, where memory is constrained, you're forced to load less into memory, which means slower operations due to frequent disk access. If you're still supporting iOS 6, the available memory on the iPhone 3GS and iPod touch 4th generation is only 256 MB. On iOS 7 the limiting devices are the iPhone 4, iPod touch 5th generation, and iPad 2, each with only 512 MB of RAM.

Core Data makes it easy to load everything into memory with fetch requests, but you probably shouldn't.

NSFetchRequest *fetchRequest = 
    [NSFetchRequest fetchRequestWithEntityName:@"EntityName"];

If your table view is backed by a fetched results controller, you get batching for free: just set a batch size on the fetch request. Twice the number of objects that are visible on screen at the same time is a good starting point.

NSFetchRequest *fetchRequest = 
    [NSFetchRequest fetchRequestWithEntityName:@"EntityName"];

[fetchRequest setFetchBatchSize:20];
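
As a rough illustration of the "twice what is on screen" rule of thumb, the batch size can be derived from the visible height and the row height. `SuggestedFetchBatchSize` is a hypothetical helper, not part of Core Data, and the 568 pt / 44 pt figures in the usage line are only assumed example values.

```objc
#import <Foundation/Foundation.h>
#include <math.h>

// Hypothetical helper: roughly twice the number of rows that fit on screen.
static NSUInteger SuggestedFetchBatchSize(double visibleHeight, double rowHeight) {
    if (visibleHeight <= 0.0 || rowHeight <= 0.0) {
        // A fetchBatchSize of 0 is the default and disables batching.
        return 0;
    }
    // Count partially visible rows as visible.
    NSUInteger visibleRows = (NSUInteger)ceil(visibleHeight / rowHeight);
    return visibleRows * 2;
}
```

For example, `[fetchRequest setFetchBatchSize:SuggestedFetchBatchSize(568.0, 44.0)];` sets a batch size of 26 (13 visible rows, doubled). Treat the result as a starting point and measure with Instruments before settling on a value.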

In some situations you have to load many or all objects into memory. But once you're done with an object you should release it. Releasing an object, or turning it into a fault, is as easy as calling refreshObject:mergeChanges: with a parameter of NO. Faulting an object clears its in-memory property values thereby reducing its memory footprint.

[context refreshObject:managedObject mergeChanges:NO];
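
One thing to watch for: passing NO for mergeChanges discards any unsaved changes on the object. A minimal sketch of a "use it, then fault it" loop that skips objects with pending changes might look like this. `ProcessThenFault` and its `work` block are hypothetical names introduced only for illustration.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: run `work` on each object, then turn it back into a fault.
static void ProcessThenFault(NSManagedObjectContext *context,
                             NSArray *objects,
                             void (^work)(NSManagedObject *object)) {
    for (NSManagedObject *object in objects) {
        work(object);

        // mergeChanges:NO throws away unsaved edits, so leave objects
        // with pending changes alone until they have been saved.
        if (![object hasChanges]) {
            [context refreshObject:object mergeChanges:NO];
        }
    }
}
```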

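If you have many objects that need to be released, just perform a reset on the context. This clears the context's entire object graph, leaving the context as if you had just created it. Any unsaved changes are lost, and managed objects you fetched earlier from that context are no longer valid.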

[context reset];
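
Because a reset throws away unsaved changes, it is usually paired with a save. The following is a minimal sketch of that pairing, assuming you call it at a point where nothing else still holds managed objects from the context. `SaveAndResetContext` is a hypothetical helper name.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: persist pending changes, then clear the context.
static BOOL SaveAndResetContext(NSManagedObjectContext *context, NSError **error) {
    if ([context hasChanges] && ![context save:error]) {
        // Leave the context untouched so the caller can inspect what failed.
        return NO;
    }
    // After this call, managed objects previously fetched from the
    // context must not be used again; refetch them if needed.
    [context reset];
    return YES;
}
```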

When importing large amounts of data, you should use an efficient find-or-create algorithm. Naive implementations load all previously persisted objects into memory before enumerating over both sets of data. A better approach is to perform the import in batches and purge the memory in between each batch.

It doesn't have to be a complicated process. Here is an example of importing data with an efficient find-or-create algorithm and batching.

// Create array with just the unique keys
// (sortedArray is assumed to be newThingsArray sorted by GUID)
NSArray *jsonGUIDArray = [sortedArray valueForKey:@"GUID"];
NSUInteger totalNewThings = [jsonGUIDArray count];

for (NSUInteger location = 0; location < totalNewThings; location += BATCH_SIZE_IMPORT) {

    // Create batch range based on batch size, clamped so the final
    // partial batch does not run past the end of the array
    NSUInteger length = MIN((NSUInteger)BATCH_SIZE_IMPORT, totalNewThings - location);
    NSRange range = NSMakeRange(location, length);
    NSArray *jsonBatchGUIDArray = [jsonGUIDArray subarrayWithRange:range];

    // Grab sorted persisted managed objects, based on the batch
    NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] initWithEntityName:@"EntityName"];
    NSPredicate *fetchPredicate = [NSPredicate predicateWithFormat:@"%K IN %@",
                                                              @"GUID", jsonBatchGUIDArray];
    [fetchRequest setPredicate:fetchPredicate];
    // ...

    while (NewThingsDictionary) {
        // ...
        // Efficient find-or-create algorithm
        // ...
    }
}
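
The find-or-create step is elided above. One common way to do it, once both the incoming keys and the fetched keys are sorted the same way, is to walk the two lists in parallel instead of searching for each key. The sketch below works on plain GUID strings so it stands alone; it is an illustration under those assumptions, not the code from the talk. `GUIDsToCreate` is a hypothetical helper, and both arrays are assumed to be sorted ascending with `compare:`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the incoming GUIDs that have no persisted match.
// Both arrays must contain NSStrings sorted ascending with -compare:.
static NSArray *GUIDsToCreate(NSArray *sortedIncoming, NSArray *sortedExisting) {
    NSMutableArray *missing = [NSMutableArray array];
    NSUInteger existingIndex = 0;
    NSUInteger existingCount = [sortedExisting count];

    for (NSString *guid in sortedIncoming) {
        // Skip persisted GUIDs that sort before the incoming one.
        while (existingIndex < existingCount) {
            NSString *existing = sortedExisting[existingIndex];
            if ([existing compare:guid] != NSOrderedAscending) {
                break;
            }
            existingIndex++;
        }

        BOOL found = NO;
        if (existingIndex < existingCount) {
            NSString *existing = sortedExisting[existingIndex];
            found = [existing isEqualToString:guid];
        }

        if (!found) {
            // No persisted match: this is where a new object would be created.
            [missing addObject:guid];
        }
        // Otherwise a match exists: this is where it would be updated.
    }
    return missing;
}
```

Each list is traversed once, so the work per batch grows linearly with the batch size rather than with the product of the two list lengths.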

When working with Core Data, be mindful that every choice you make is really a trade-off between memory and speed. Don't fetch more than you need, but fetch enough so you don't keep going back to disk. There is no single solution to this balancing problem, as it depends on your app and data model. Fortunately, we have tools such as Instruments that tell us where we are on the memory and speed continuum.



If you would like to receive an email when the series is done or if I make significant changes to HighPerformanceCoreData.com please subscribe.