Building a Better Ant Hill

Recently, I was tasked with answering the following question (actually two questions, but we’ll get to the second one at the end):

Is this:

@import Ant.Ant000.h;

going to compile faster than this:

@import Ant;

Restated more verbosely: in this era of modules, is it faster to import only the individual files you need from a module umbrella header, or does it make no difference, so you can rely on the simplicity of always importing the entire module in all your files?

I had always assumed the latter, but now I was being asked to prove it.

To do that, I made a new GitHub project, Import-Ant. Inside of it, you’ll find five Xcode projects: four test projects and a test builder project.

You may ask: why bother with a builder project? What do you need to build to conduct these sorts of tests?

Turns out, about 40,800 files.

I didn’t want differences between the two techniques listed above to get lost in the noise of a normal build, so I decided that my Ant framework (the thing to be imported) would have 100 header files, with a corresponding 100 source files, and my Hill iOS app (the thing doing the importing) would have quite a few more: 5,000 source files (each with its own header), each of which would import one Ant header file.

To avoid having to make either those 200 header/source files, or those 10,000 header/source files, by hand, I wrote some code to do it for me, which resulted in the Builder project. There’s the AntBuilder class to make the Ant framework files, and the HillBuilder class to make the Hill app files.
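To give a flavor of what such a generator can look like, here is a minimal sketch. It is not the actual AntBuilder code from the project: the function names `makeAntFiles` and `writeFiles` and the contents of the generated files are assumptions made for illustration. Only the Ant000-style naming comes from the examples above.

```swift
import Foundation

/// Hypothetical stand-in for the post's AntBuilder.
/// Returns a dictionary mapping file names to file contents.
func makeAntFiles(count: Int) -> [String: String] {
    var files: [String: String] = [:]
    for index in 0..<count {
        // Zero-pad to three digits so names follow the Ant000 pattern.
        let name = String(format: "Ant%03d", index)

        // Assumed header contents: an empty Objective-C class declaration.
        files["\(name).h"] = """
        #import <Foundation/Foundation.h>

        @interface \(name) : NSObject
        @end

        """

        // Assumed source contents: the matching empty implementation.
        files["\(name).m"] = """
        #import "\(name).h"

        @implementation \(name)
        @end

        """
    }
    return files
}

/// Writes the generated files into `directory`, creating it if needed.
func writeFiles(_ files: [String: String], to directory: URL) throws {
    try FileManager.default.createDirectory(at: directory,
                                            withIntermediateDirectories: true)
    for (fileName, contents) in files {
        let fileURL = directory.appendingPathComponent(fileName)
        try contents.write(to: fileURL, atomically: true, encoding: .utf8)
    }
}
```

Calling `makeAntFiles(count: 100)` yields the 200 header/source files for one Ant framework. Generating the strings separately from writing them keeps the naming logic easy to check without touching the disk.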

Currently, there are four test projects that the Builder project will make files for:

  • 01 Import By Module
  • 02 Import Individually
  • 03 Import by Framework
  • 04 Import by File

The first two test projects address the problem described at the beginning of this post.

The last two test projects go more old school, swapping the new module syntax for plain preprocessor #import directives. Individual files:

#import <Ant/Ant000.h>

versus the umbrella header:

#import <Ant/Ant.h>

So instead of just one Ant and Hill project pair, there are four of them.

To test build times, I would reboot each time, open all four projects, wait for them to finish indexing, and then build one of them. After writing down that build time, I would clean that project’s build folder, then go back and start the cycle over again….

I built for Debug to keep it simple and I used the default Simulator target that came up when opening the project, either the iPhone 8 or the iPhone 8 Plus.

This command-line invocation helped:

defaults write com.apple.dt.Xcode ShowBuildOperationDuration -bool YES

It makes Xcode show the most recent build time in its user interface, like so:

Screenshot of a portion of the Xcode main window showing build result 'Succeeded' with extra section '120.774s'

Here are average times:

01 Import By Module: 124.796s
02 Import Individually: 121.823s
03 Import by Framework: 126.342s
04 Import by File: 122.121s
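As a quick check on how far apart these averages are, the following sketch expresses each one as a percentage above the fastest. Using the fastest average as the baseline is my assumption here, and `percentSlower` is a hypothetical helper name. Other baselines would shift the numbers slightly.

```swift
import Foundation

// Average build times in seconds, copied from the list above.
let averages = [
    (project: "01 Import By Module", seconds: 124.796),
    (project: "02 Import Individually", seconds: 121.823),
    (project: "03 Import by Framework", seconds: 126.342),
    (project: "04 Import by File", seconds: 122.121),
]

/// Percentage by which `value` exceeds `baseline`.
func percentSlower(_ value: Double, than baseline: Double) -> Double {
    return (value - baseline) / baseline * 100
}

// Use the fastest average as the baseline (an assumption, see above).
let fastest = averages.map { $0.seconds }.min()!

for entry in averages {
    let delta = percentSlower(entry.seconds, than: fastest)
    print(entry.project, String(format: "+%.1f%%", delta))
}
```

With that baseline, the gaps run from roughly 0.2% (04 Import by File) to roughly 3.7% (03 Import by Framework).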

The differences were between 0% and 4%, which I don’t find to be all that significant, for two reasons.

For one, I only built each project 3 times, and each test series had its own outliers. I suspect if I’d had the patience to build them 10 times, the differences would have smoothed out more. I’ve also since realized Xcode may take up significant amounts of CPU time even after its UI indicates that indexing has finished, lending more randomness to the proceedings.

For two, I actually built the first two projects three times each separately before building all four projects for this post, and in that case, 01 Import By Module was faster than 02 Import Individually by 2%.

If you’re not convinced, you can certainly run them for yourself.

But for me, I think this proves there isn’t a significant penalty for using full module imports instead of trying to pick out individual module files to import.

The second question was whether this syntax influenced which files would be rebuilt if an Ant framework header was modified. Now, every individual Ant class is used by 50 Hill classes. If only, say, Ant000.h was modified, and only 50 Hill source files referenced it directly, would only those source files be rebuilt?

Turns out no. In all four test cases, two of which involved only references to specific Ant headers, the entirety of the Hill project was rebuilt if even only one Ant header was modified. Rebuild the module (in this case, the Ant framework), and everything that relies on any part of that module is also rebuilt by the current version of Xcode.

Sound right? I consider myself far from an expert in this area, so if anyone has any more information, feel free to leave a comment or ping me on Twitter. Thanks!

Restoring Transience

While doing some Core Data research, I came across my old GitHub project (from this post) demonstrating transient attributes.

I decided to update my project to current coding and Core Data practices, as an exercise, and I discovered a few interesting, if minor, points.

1. Managed Object Context Uses Weak References

The whole purpose of the project was that, if I tried to fetch the same objects in two different Core Data contexts, the transient attributes wouldn’t be preserved.

But now, I found that even doing the same fetch in the same context would return different Objective-C objects, and thus would not preserve the transient attributes for any objects that I had made previously. What had changed? What was going on?

Transient app window showing three rows, with two having nil name attributes, and only the third having a non-nil name

What had changed, as far as I can see, is that Core Data is far more aggressive about deallocating in-memory objects that have no references to them except from the context. Since my original project was doing a fetch every time it wanted the list of objects, and keeping no permanent reference to them, that meant that every object except the most recent one was going away and being recreated, and thus their transient attributes were not being preserved.

I’ve changed the project to keep its own list of the objects it has created so far, so they’ll stick around until I click the “Refresh” button.

This also means that I don’t need multiple contexts. I can just nil out my own list (and call reset on the context to be sure), and I’ll get new model object instances for my next fetch. This means that I can update my code to use the new NSPersistentContainer class and its main-thread-only viewContext for all my work, without worrying about maintaining multiple main-thread contexts myself.
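The idea of holding on to the objects and then letting go of them on refresh can be sketched roughly as follows. This is not the project's actual code: the class name `ObjectList` and its methods are hypothetical, and the context passed in is assumed to be something like a container's viewContext.

```swift
import CoreData

/// Hypothetical helper that keeps strong references to managed objects
/// so the context's weak references are not the only thing keeping them alive.
final class ObjectList {
    private let context: NSManagedObjectContext

    /// Strong references: objects here keep their in-memory transient values.
    private(set) var objects: [NSManagedObject] = []

    init(context: NSManagedObjectContext) {
        self.context = context
    }

    /// Inserts a new object of the given entity and remembers it.
    @discardableResult
    func add(entityName: String) -> NSManagedObject {
        let object = NSEntityDescription.insertNewObject(forEntityName: entityName,
                                                         into: context)
        objects.append(object)
        return object
    }

    /// Drops our references and resets the context, so the next fetch
    /// produces fresh model object instances.
    func refresh() {
        objects.removeAll()
        context.reset()
    }
}
```

Note that `reset()` also discards any unsaved changes in the context, so a real app would save first if it wants non-transient values to survive the refresh.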

2. There’s a Trick to Editing a Managed Object Model at Runtime

In my original project, the model was set to not use a transient attribute. If you wanted to test transient attributes for yourself, you had to go in and manually change the model file in my project, rebuild, and run it again.

This time around, I decided to do better.

So while I still left that attribute as non-transient on disk, I added some code to edit the model in memory before it is used, and tied that value to a new checkbox in the user interface. This, the comments in NSManagedObjectModel assure me, is totally allowed and supported.

Transient app window showing a new checkbox on the right labeled 'Transient'

Now, if you toggle that checkbox (which deletes the current list contents), you’ll change the behavior to either use a transient name attribute (so that refreshes will nil out the names) or a non-transient name attribute (so that refreshes won’t nil out the names).

The trick? The instance of the model you load from disk can’t be edited at all, even before its use in a persistent container. You have to make a copy of it.
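A minimal sketch of that copy-then-edit step might look like this. The function name `transientVariant` and its parameters are hypothetical, and the entity name is whatever your model uses; only the `name` attribute comes from this project.

```swift
import CoreData

/// Returns an editable copy of `model` in which one attribute's
/// transient flag has been set, or nil if the entity or attribute
/// cannot be found. The original model is left alone.
func transientVariant(of model: NSManagedObjectModel,
                      entityName: String,
                      attributeName: String,
                      transient: Bool) -> NSManagedObjectModel? {
    // Edit a copy rather than the instance that was loaded from disk.
    guard let editable = model.copy() as? NSManagedObjectModel,
          let entity = editable.entitiesByName[entityName],
          let attribute = entity.attributesByName[attributeName]
    else {
        return nil
    }

    // Must happen before the model is handed to a container or coordinator.
    attribute.isTransient = transient
    return editable
}
```

In use, you would load the model with `NSManagedObjectModel(contentsOf:)`, pass it through a function like this one, and hand the returned copy to the persistent container instead of the loaded original.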

3. In-Memory Stores Can’t Be Transferred

My original project used an on-disk persistent data store, but deleted it every time the app started up.

This time around, instead, I used an in-memory persistent data store, which resets itself on every restart with no muss, no fuss. (This is also very useful for unit tests.)
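Setting that up with NSPersistentContainer takes only a few lines. The sketch below is an illustration rather than the project's code, and the function name `makeInMemoryContainer` is hypothetical.

```swift
import CoreData

/// Builds a container whose only persistent store lives in memory,
/// so its contents vanish when the container goes away.
func makeInMemoryContainer(name: String,
                           model: NSManagedObjectModel) -> NSPersistentContainer {
    let container = NSPersistentContainer(name: name, managedObjectModel: model)

    // Replace the default on-disk store description with an in-memory one.
    let description = NSPersistentStoreDescription()
    description.type = NSInMemoryStoreType
    container.persistentStoreDescriptions = [description]

    // By default the store is added synchronously, so the completion
    // handler runs before this call returns.
    container.loadPersistentStores { _, error in
        if let error = error {
            fatalError("Could not load in-memory store: \(error)")
        }
    }
    return container
}
```

The same function works in unit tests: each test can build its own container and start from an empty store.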

Now above, I said that if you toggle the “Transient” checkbox, all the current database contents are deleted, right? That’s because I have to throw away the current model, and make a new one with the transient attribute handled in a different way.

If I were using an on-disk persistent store, I could just reload the contents from disk using that new model.

But since I’m using an in-memory persistent store, there’s no on-disk backup to turn to.

And the APIs that Apple provides in NSPersistentStoreCoordinator, as far as I can see, do not allow you to detach an existing store from a coordinator and re-attach it to a new coordinator. They always assume you can reload the store contents from a file on disk, which makes a new store object.

Apple tends to say Core Data is an object management framework independent of storage mechanism, but I’ve long believed that’s just hogwash. No company I’ve ever worked at uses Core Data for anything serious without backing it with a SQLite database, and all of Core Data’s heavy-duty features are geared towards that configuration.

Here, as we can see, even their APIs favor one kind of store over another. Which is as it should be! But I wish they’d stop pretending.