Scary Sharks and Custom CodingKeys

I had to handle a fun little challenge with Codable and unorthodox JSON recently (as you do).

Apple’s Codable API has been around for a while, and it’s an example of the best kind of API: it makes the easy things easy, and the hard things possible.

For example, let’s say I’d like to load in some JSON data about sharks in movies:

[
    {
        "type": "Great White Shark",
        "movie": "Jaws",
        "length": 15
    },
    {
        "type": "Megalodon",
        "movie": "The Meg",
        "length": 50
    },
    {
        "type": "Mako Shark",
        "movie": "Deep Blue Sea",
        "length": 14
    }
]

Make a struct with the same fields, let loose your decoder, and you’re done!

struct MovieShark: Codable {
    let type: String
    let movie: String
    let length: Float
}

let sharks = try JSONDecoder().decode([MovieShark].self, from: data)

(Note: there are a few more steps to get running code, but you get the gist.)
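For what it’s worth, here is a minimal sketch of what those extra steps might look like. The inline JSON string and the decodeSharks(from:) helper are my own stand-ins for illustration; real data would more likely come from a file or a network response.

```swift
import Foundation

struct MovieShark: Codable {
    let type: String
    let movie: String
    let length: Float
}

// Hypothetical inline sample; stands in for data loaded from elsewhere.
let sampleJSON = """
[
    { "type": "Great White Shark", "movie": "Jaws", "length": 15 },
    { "type": "Megalodon", "movie": "The Meg", "length": 50 }
]
"""

// Hypothetical helper: turns a JSON string into Data and decodes it.
func decodeSharks(from json: String) throws -> [MovieShark] {
    let data = Data(json.utf8)
    return try JSONDecoder().decode([MovieShark].self, from: data)
}

do {
    let sharks = try decodeSharks(from: sampleJSON)
    for shark in sharks {
        print("\(shark.type) appears in \(shark.movie)")
    }
} catch {
    // Decoding errors (missing keys, wrong types) end up here.
    print("Decoding failed: \(error)")
}
```

The do/catch is the main "extra step": decode(_:from:) throws, so the call has to sit somewhere that can handle the error.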

If you need to fiddle with the keys a bit, say, because your data uses shark_type instead of just type, you add a CodingKeys entry:

    enum CodingKeys: String, CodingKey {
        case type = "shark_type"
        case movie
        case length
    }
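Put together, the struct with its nested CodingKeys might look like the sketch below. The sample JSON and the decodeShark(_:) helper are assumptions of mine, added only to show the renamed key in action.

```swift
import Foundation

struct MovieShark: Codable {
    let type: String
    let movie: String
    let length: Float

    // Only `type` is renamed; the other cases keep their property names as keys.
    enum CodingKeys: String, CodingKey {
        case type = "shark_type"
        case movie
        case length
    }
}

// Hypothetical sample using the renamed key.
let renamedJSON = """
{ "shark_type": "Mako Shark", "movie": "Deep Blue Sea", "length": 14 }
"""

// Hypothetical helper for decoding a single shark from a JSON string.
func decodeShark(_ json: String) throws -> MovieShark {
    return try JSONDecoder().decode(MovieShark.self, from: Data(json.utf8))
}

do {
    let mako = try decodeShark(renamedJSON)
    print(mako.type)
} catch {
    print("Decoding failed: \(error)")
}
```

The same CodingKeys are used in both directions, so encoding this value writes shark_type back out as well.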

I’ve tried (admittedly not that hard) to figure out how CodingKeys works.

  • How can defining a nested enum of a particular name cause this behavior?
  • How does the Swift enum type automatically conform to CodingKey?

Googling didn’t turn up any details on this. And, I mean, it doesn’t matter, right?

Well, sometimes it does.

The challenge I was facing was that my data wasn’t in the format shown above, but rather in this format:

[
    {
        "Great White Shark" : {
            "movie": "Jaws",
            "length": 15
        }
    },
    {
        "Megalodon": {
            "movie": "The Meg",
            "length": 50
        }
    },
    {
        "Mako Shark": {
            "movie": "Deep Blue Sea",
            "length": 14
        }
    }
]

The simple Codable approach can’t handle top-level dynamic keys like this. My question was, can I use Codable to do this at all?

And the answer is yes.

Turns out, CodingKeys doesn’t have to be an enum.

You can implement the CodingKey protocol with a struct that meets all its requirements: a string property and initializer, and an integer property and initializer.

Once you’ve done that, of course, you don’t have the static keys you still need, so I stashed those in a separate, unrelated enum called OtherCodingKeys.

struct MovieShark {
    let type: String
    let movie: String
    let length: Float
    
    struct CodingKeys: CodingKey {
        var stringValue: String
        init(stringValue: String) {
            self.stringValue = stringValue
        }

        var intValue: Int? { return nil }
        init?(intValue: Int) { return nil }
    }

    enum OtherCodingKeys {
        case movie
        case length
    }
}

You’ll notice MovieShark no longer declares itself as implementing Codable here. That’s because I need to implement custom versions of Encodable and Decodable separately.

First, Encodable:

private struct MovieSharkContents: Codable {
    let movie: String
    let length: Float
}

extension MovieShark: Encodable {
    func encode(to coder: Encoder) throws {
        var container = coder.container(keyedBy: CodingKeys.self)
        try container.encode(MovieSharkContents(movie: movie, length: length), forKey: CodingKeys(stringValue: type))
    }
}

There are two steps:

  • Get the top-level container of the encoder with coder.container(keyedBy: CodingKeys.self). This is the standard way to start a custom encoding.
  • Specify a dynamic key by using the CodingKeys string initializer, CodingKeys(stringValue: type). You can’t just specify a random string, because that won’t be of the correct type.

Note, because the values of the subsequent dictionary are heterogeneous, and Swift can’t serialize a [String: Any] type, I had to make an intermediate type, MovieSharkContents, to represent it.

Next, Decodable:

enum MovieSharkError: Error {
    case unableToDecode
}

extension MovieShark: Decodable {
    init(from coder: Decoder) throws {
        let container = try coder.container(keyedBy: CodingKeys.self)
        for key in container.allKeys {
            type = key.stringValue
            let contents = try container.decode(MovieSharkContents.self, forKey: key)
            movie = contents.movie
            length = contents.length
            return
        }
        throw MovieSharkError.unableToDecode
    }
}

Here, since the top-level key has an unknown name, I iterate through all the top-level keys, and pick the first.

I transfer that top-level key to the type property, and the dictionary values inside it to the other properties of MovieShark. If I don’t find any top-level key at all, I throw an error.

In this way, I can keep a straightforward MovieShark struct with all the properties I expect, but also handle both loading and saving its custom JSON.
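To check that loading and saving really do line up, here is a condensed round-trip sketch of the pieces above. It makes a few changes of my own for brevity: it uses guard with allKeys.first in place of the for loop, adds Equatable so the results can be compared, and leaves out OtherCodingKeys. The sample JSON and the decodeNestedSharks(_:) helper are also just for illustration.

```swift
import Foundation

struct MovieShark: Equatable {
    let type: String
    let movie: String
    let length: Float

    // A CodingKey that can wrap any string, so the shark type can act as the key.
    struct CodingKeys: CodingKey {
        var stringValue: String
        init(stringValue: String) { self.stringValue = stringValue }
        var intValue: Int? { return nil }
        init?(intValue: Int) { return nil }
    }
}

// The fixed-key object nested under the dynamic key.
private struct MovieSharkContents: Codable {
    let movie: String
    let length: Float
}

enum MovieSharkError: Error {
    case unableToDecode
}

extension MovieShark: Codable {
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(MovieSharkContents(movie: movie, length: length),
                             forKey: CodingKeys(stringValue: type))
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Take the first (and presumably only) top-level key as the shark type.
        guard let key = container.allKeys.first else {
            throw MovieSharkError.unableToDecode
        }
        let contents = try container.decode(MovieSharkContents.self, forKey: key)
        type = key.stringValue
        movie = contents.movie
        length = contents.length
    }
}

// Hypothetical sample in the nested, dynamic-key format.
let nestedJSON = """
[
    { "Great White Shark": { "movie": "Jaws", "length": 15 } },
    { "Megalodon": { "movie": "The Meg", "length": 50 } }
]
"""

// Hypothetical helper for decoding an array of sharks from a JSON string.
func decodeNestedSharks(_ json: String) throws -> [MovieShark] {
    return try JSONDecoder().decode([MovieShark].self, from: Data(json.utf8))
}

do {
    let sharks = try decodeNestedSharks(nestedJSON)
    let encoded = try JSONEncoder().encode(sharks)
    let roundTripped = try JSONDecoder().decode([MovieShark].self, from: encoded)
    print(sharks.map { $0.type })
    print(roundTripped == sharks)
} catch {
    print("Round trip failed: \(error)")
}
```

If an object ever had more than one top-level key, this sketch (like the loop version) would silently take whichever key allKeys happens to list first, so it only suits data where each object has exactly one.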

I figured out how to do this, by the way, from the very helpful Flight School Guide to Swift Codable. I still don’t get all the Swift magic behind CodingKeys, but I know a little more about how to use it!

Point of No Return

I found an interesting (to me) aspect of Swift/Objective-C interactions this week.

Take this Objective-C method:

+ (nullable NSData *)dataWithString:(nullable NSString *)string error:(NSError **)error

It uses the standard Apple pattern of having both a return value and an error. (I left out the error’s nullability annotations for brevity, as Apple always assumes them.)

In theory — and, if I’m remembering correctly, according to Apple guidelines — first, you’re supposed to check if the return value is invalid. Only once you’ve verified that it’s invalid should you check to see if there’s an error.

And as far as I’ve been aware, there’s never been any assumption that you’ll get an error. That’s why, throughout your Objective-C code, you always have to check the return value and treat that as gospel.

If you use this method in Swift, the auto-generated Swift signature is:

func data(with string: String?) throws -> Data

Notice something?

I mean, besides the fact that Apple’s compiler/runtime magic smoothly converts between Objective-C’s last-parameter-is-an-error-pointer pattern and Swift’s “throws” pattern.

The return type doesn’t allow for nil anymore.

You can’t check for an invalid value, if “invalid” means nil.

Instead, you can only assume that the original Objective-C implementation will “throw” an error if there is a problem.
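On the Swift side, the call site ends up looking something like the sketch below. Since I can’t include the Objective-C class here, data(with:) is a pure-Swift stand-in with the same signature, and ConversionError and byteCount(of:) are names I made up for the example.

```swift
import Foundation

// Hypothetical error type for the stand-in below.
enum ConversionError: Error {
    case missingString
}

// Pure-Swift stand-in with the same shape as the imported signature.
// The real method would be implemented in Objective-C.
func data(with string: String?) throws -> Data {
    guard let string = string else {
        throw ConversionError.missingString
    }
    return Data(string.utf8)
}

// Hypothetical caller: there is no nil to check, only success or a thrown error.
func byteCount(of string: String?) -> Int? {
    do {
        let result = try data(with: string) // non-optional Data on success
        return result.count
    } catch {
        print("Conversion failed: \(error)")
        return nil
    }
}

print(byteCount(of: "Jaws") as Any)
print(byteCount(of: nil) as Any)
```

The only way for the caller to learn about a failure is the catch branch, which is why it matters what happens when the Objective-C side returns nil without setting an error.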

Now, go back to your original Objective-C method. What if you return nil but don’t set the error? What does Swift do?

It does something clever.

In my testing, even when you haven’t set an error, the Swift translation layer throws an error anyway.

If you log it, it’s called nilError.

It’s got a domain of Foundation._GenericObjCError and a code of 0.

Feels a bit like a hack, doesn’t it?

But it does prevent the problem of old Objective-C code not indicating the desired result under Swift.

Translating Objective-C to Swift in Xcode 9.0 Beta 2

I’m putting together a post comparing Mac drag and drop APIs and iOS drag and drop APIs.

To prepare, I took the Xcode CocoaDragAndDrop sample project (here, last modified in 2011 with note “Updated for Xcode 4”) and converted it to Swift (here) using the second beta of Xcode 9.0.[1]

Since I haven’t internalized the pattern between Objective-C and Swift method conversions, I was often frustrated by how to translate Objective-C method calls to Swift method calls.

While I was working on the project, it seemed that 4 times out of 5, when I tried to go to a class or protocol’s declaration in Apple’s headers and see its Swift-ified methods, Xcode would take me to the Objective-C header instead, even though I was starting off in a Swift file.

Of course, now that I’m trying to reproduce it to file a Radar, it doesn’t happen. I wonder if that’s because the final project has no Objective-C files in it at all.

It doesn’t help that the translations changed between Swift 3 and Swift 4.

For example, NSPasteboardTypeTIFF in Swift 3 is now NSPasteboard.PasteboardType.tiff in Swift 4, with a similar pattern for all its friends.

register(forDraggedTypes newTypes: [String]) is now registerForDraggedTypes(_ newTypes: [NSPasteboard.PasteboardType]).

Etc.

It’ll be nice to be working exclusively in Swift for the rest of this effort.


1. Feedback welcome!

Boilerplate in C++ and Swift

Moving from C++ to Objective-C was a revelation to me.

In C++, dynamic lookup was a chore. Because the language was relatively static, if you wanted to go from an arbitrary key to code, you had to write your own custom lookup table.

I remember writing a lot of registration code, a lot of boilerplate. For each class or method I wanted to look up, I would put an entry in the lookup table. Maybe it was part of an explicit factory method, maybe it was a C macro, maybe it was some sort of template metaprogramming magic. But there had to be something, and you had to write it every time.

Boilerplate, boilerplate, boilerplate. Over and over again.

In Objective-C, the dynamic lookup mechanism was built into the language: dynamic dispatch. Look up any class, any method, with just a string.

I remember reading somewhere (I wish I remembered where) a post where someone pointed this out, that the C++ technique and the Objective-C technique both required lookup tables, but in the latter case, it was maintained for you by the Objective-C runtime. Objective-C didn’t reduce the inherent complexity, it just hid it, made it uniform.

The LLVM team, on the other hand, has been trying to kill dynamic dispatch for a long time.

Since ARC, calling arbitrary methods by string, the core of dynamic dispatch, by default triggers a warning.

And of course, in pure Swift, this kind of string-based dynamic lookup is completely absent. Everything must be known ahead of time by the compiler.

I understand why. They want to make it safer.

Does that mean we’re seeing the reinvention of the custom lookup table in Swift?

Swift enumerations, for example, make this relatively easy, since you can pair methods, i.e. arbitrary code, with each enumeration case.

If I have to write a new enum case for every new class, though, then I consider that unnecessary boilerplate, a throwback to C++ techniques. Boilerplate.
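To make the complaint concrete, here is a rough sketch of the kind of enum-based lookup table I mean. Shape, Circle, Square, ShapeKind and makeShape(named:) are all hypothetical names invented for the example.

```swift
// Hypothetical protocol and types to look up by name.
protocol Shape {
    var name: String { get }
}

struct Circle: Shape { let name = "circle" }
struct Square: Shape { let name = "square" }

// The hand-maintained lookup table: one case per type.
enum ShapeKind: String {
    case circle
    case square

    // Code paired with each case; every new type needs a new branch here.
    func makeShape() -> Shape {
        switch self {
        case .circle: return Circle()
        case .square: return Square()
        }
    }
}

// Go from an arbitrary string key to an instance, or nil if the key is unknown.
func makeShape(named key: String) -> Shape? {
    return ShapeKind(rawValue: key)?.makeShape()
}

print(makeShape(named: "circle")?.name ?? "unknown")
print(makeShape(named: "triangle")?.name ?? "unknown")
```

Adding a Triangle type means touching three places: the new struct, the new case, and the new switch branch. The compiler will at least flag the missing branch, but somebody still has to write it.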

And I wish we didn’t have to go back down that road.

Swift on the Server, Part 1

I’m not convinced Swift is going to be a long-term hit in server software.

The big push I’ve heard about is from IBM. In this recent talk, Chris Bailey gives some reasons to use Swift on the server:

  • It’s faster and uses less memory than some other technologies.
  • It has the potential to reduce communication errors when used for both the client and the server. Chris mentions the Mars Climate Orbiter as an example of such an error.

I personally don’t find these arguments compelling.

First of all, plenty of extremely popular technologies are not the most performant technologies. You choose them because they’re easier to develop in, easier to maintain, easier to keep up and running. If we wanted the very fastest, we’d still be writing server software in C.

Second, most current server software is written in a different language, and with different libraries, than the client software it talks to. People know how to solve this problem. Hint: switching to a new language isn’t necessary.

Third, native iPhone and Mac apps are an important but not overwhelming subset of the clients a server has to talk to. The Swift advantage vanishes if we’re talking about Android or Windows or web clients.

So is Swift going to be easier to develop in, easier to maintain, and easier to keep up and running than its competitors on the server?

Making it those things for server software is certainly not Apple’s priority. Their goal is to make it work for them, which means low-level OS software, frameworks, and native application development.

IBM can try to do this work. Chris’s talk is all about the extra steps they’ve taken, the extra projects they’ve written, to do just that.

But at some point, as part of their effort, IBM is going to want something from Apple, something from the Swift development effort, which clashes with what Apple thinks is important.

Who’s going to win that clash?

Boom Boom Enum

To quote a fairly awful movie: “We were so, so wrong.” — me

Remember I said you couldn’t use Swift value types for a linked list? The real reason is that you can’t have references to other value instances, just copies (thus making bi-directional linked lists a recursion nightmare), but the compiler error was “a value type can’t refer to itself”.

Turns out, one Swift value type can refer to itself: enumerations. By adding the indirect keyword, you can use Swift enums to, for example, represent a binary tree structure:

indirect enum Tree {
	case node(Int, left: Tree?, right: Tree?)
}

And it works! But is it a good idea? Hell, no!

Why not? Because accessing the “properties” of your data structure is a pain in the ass:

switch node {
case let .node(data, left, right):
	// Do something with data, left, and right
	print(data, left as Any, right as Any)
}

As far as I know, you need an otherwise extraneous switch statement, a case statement, and an extra set of local variables just to access the values. (Whereas for a class all you need is dot syntax.) And you need to do that everywhere you want to access them, every method.

And the compiler will complain if you bind associated values you don’t use, so you have to remember to use _ for those:

switch node {
case let .node(_, left, _):
	// Do something with left
	print(left as Any)
}

I tried writing a full tree traversal implementation in Swift with enum-based trees and it was an unholy mess that I would not repeat.
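For a taste of it, here is a small sketch of an in-order traversal over that enum. This isn’t my original implementation, just an illustration; inOrder(), leftChild and the sample tree are names I’ve made up here.

```swift
indirect enum Tree {
    case node(Int, left: Tree?, right: Tree?)

    // In-order traversal: left subtree, then this node's value, then right subtree.
    func inOrder() -> [Int] {
        switch self {
        case let .node(value, left, right):
            return (left?.inOrder() ?? []) + [value] + (right?.inOrder() ?? [])
        }
    }

    // Even a simple accessor needs its own switch and pattern match.
    var leftChild: Tree? {
        switch self {
        case let .node(_, left, _):
            return left
        }
    }
}

// Hypothetical sample tree with 2 at the root, 1 on the left, 3 on the right.
let sampleTree = Tree.node(2,
                           left: Tree.node(1, left: nil, right: nil),
                           right: Tree.node(3, left: nil, right: nil))

print(sampleTree.inOrder()) // [1, 2, 3]
```

It’s readable enough at this size, but every additional method or accessor repeats the same switch-and-bind dance.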

Learn from me. Don’t use cool language features at the expense of maintainability.

Link or Swim

I learned today that you can’t make a linked list in Swift using value types. The reason why ties into the pointers issue I was recently discussing.

Here’s how a linked list struct might look in C:

struct LinkedList {
	struct LinkedList *next;
	int data;
};

And here’s how you would do it in Swift:

struct LinkedList {
	var next: LinkedList?
	var data: Int
}

The difference is, in C, you can have a reference to a struct without making a copy, without it being the thing itself, by adding an asterisk and making it a pointer.

In Swift, you can’t do that. So a reference to another struct might as well be that other struct, even if it isn’t always, under the hood.

If I try to compile that Swift, I get the error “value type ‘LinkedList’ cannot have a stored property that references itself”.

If I change the declaration from struct to class, then it compiles fine, because the property representing the “next” instance is now a reference to it.
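Here’s a minimal sketch of the class version. The initializer and the values(from:) helper are additions of mine, since a class (unlike a struct) doesn’t get a memberwise initializer for free.

```swift
final class LinkedList {
    var next: LinkedList?
    var data: Int

    init(data: Int, next: LinkedList? = nil) {
        self.data = data
        self.next = next
    }
}

// Hypothetical helper: walk the list from the head and collect each node's data.
func values(from head: LinkedList?) -> [Int] {
    var result: [Int] = []
    var current = head
    while let node = current {
        result.append(node.data)
        current = node.next
    }
    return result
}

let list = LinkedList(data: 1, next: LinkedList(data: 2, next: LinkedList(data: 3)))
print(values(from: list)) // [1, 2, 3]

// `second` refers to the same node that `list.next` does, not to a copy.
let second = list.next
second?.data = 20
print(values(from: list)) // [1, 20, 3]
```

The last few lines are the whole point: changing the node through second is visible through list, which is exactly what a struct-based version couldn’t give me.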

I wasn’t expecting this issue to come up again so soon in my work.

Entirely Missing the Pointer

Whenever I read about Swift, I read about the distinction between reference types and value types.

In the C-based languages I used (and still use), I never thought about it like that. Instead, I thought in terms of pointers[1] and everything else, which here I’ll call non-pointers.

You could have a pointer to anything, but in Objective-C they are used especially for class instances. And you could have a “non-pointer” to anything, including scalars, structs, and (in C++) class instances.

And it’s always been visually easy to distinguish between the two: one has an asterisk, and one doesn’t.[2]

Check it out. The ones with the asterisks have reference-based semantics, and the ones without the asterisks have value-based semantics:

Reference:

@interface Foo : NSObject
@property int bar;
@end

@implementation Foo
@end

struct Bar {
	int foo;
};

void referenceTest() {
	int intStorage = 0;
	int *myInt1 = &intStorage;
	int *myInt2 = myInt1;
	*myInt1 = 10;
	printf("%d\n", *myInt2); // Result: 10, same as myInt1
	
	struct Bar barStorage = { 0 };
	struct Bar *bar1 = &barStorage;
	struct Bar *bar2 = bar1;
	bar1->foo = 10;
	printf("%d\n", bar2->foo); // Result: 10, same as bar1
	
	Foo *foo1 = [Foo new];
	Foo *foo2 = foo1;
	foo1.bar = 10;
	printf("%d\n", foo2.bar); // Result: 10, same as foo1
}

Value:

// Same declarations/definitions as above

void valueTest() {
	int myInt1 = 0;
	int myInt2 = myInt1;
	myInt1 = 10;
	printf("%d\n", myInt2); // Result: 0, not same as myInt1
	
	struct Bar bar1 = { 0 };
	struct Bar bar2 = bar1;
	bar1.foo = 10;
	printf("%d\n", bar2.foo); // Result: 0, not same as bar1
}

Note in Objective-C we can’t have a value-based version of the class. (Though we could in C++.)

Swift has a completely different philosophy. Reference vs. value isn’t syntax-based, it’s type-based. The exact same syntax will produce different results depending on the original definition of what you’re working on.

Reference:

class Foo {
	var bar: Int = 0
}

var foo1 = Foo()
var foo2 = foo1

foo1.bar = 10
foo2.bar // Result: 10, same as foo1

Value:

struct Foo {
	var bar: Int = 0
}

var foo1 = Foo()
var foo2 = foo1

foo1.bar = 10
foo2.bar // Result: 0, not same as foo1

The only difference in the two samples above is the class vs. struct keyword.

Because that’s such a stark philosophical gap, and for me, an unexamined one, it was quite hard for me to get my mind around it at first.

Notes:

  • For me, Java was the first mainstream language that removed the asterisk for reference types and stopped calling them “pointers”. Though amusingly, they still have a java.lang.NullPointerException, which I’ve always assumed has to be confusing to newbies!
  • Swift further muddies my concept of reference-as-pointer and value-as-non-pointer by allowing multiple value type instances to actually point to the same memory as long as you don’t modify their contents, which C-based value types never did. So Swift value types can now actually be implemented by C-style pointers under the hood.
  • For Objective-C users, our first taste of this kind of thing was with blocks. A newly-created block is kind of a “value” type, created on the stack like all other non-pointer C types. But then you copy it, and it becomes a kind of “reference” type you can pass around outside of the function scope. Same syntax for either type, just like Swift.

1. Here, I define “pointer” as an explicit C reference to memory, detached from its management. Non-pointers are still referencing memory locations, but the runtime manages their creation and destruction as part of something larger: the stack, a class or struct, etc.
2. C++ muddied this distinction a bit by introducing references, though they had the decency to give them a different punctuation mark.