kelan.io

Synchronized Wrapper in Swift

This is an idea I've been tinkering with recently, to help enforce thread-safe access to properties in a class instance. It still needs some more refinement, but I wanted to write up my thoughts and progress so far, to start to get feedback.

Background

I recently noticed a pattern in my code where I use a dispatch semaphore, or serial queue, to achieve thread-safe access to some critical properties in a class. This works fine, but isn't enforceable at all. A simple headerdoc comment like "/// - note: Only access these properties while holding the above lock" is way too easy to overlook, even with the best of intentions. So, I was wondering if there was a way to have the Swift compiler help enforce this.
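To make the problem concrete, here's a minimal sketch of that comment-only pattern. The DownloadTracker class and its members are hypothetical names made up for illustration; they aren't from any real codebase.

```swift
import Dispatch

// Hypothetical class, for illustration only: the lock and the property it
// guards are connected by nothing more than a comment.
final class DownloadTracker {
    private let lock = DispatchSemaphore(value: 1)

    /// - note: Only access this property while holding `lock`.
    private var activeDownloads: [String] = []

    // Follows the rule: takes the lock before touching the array.
    func add(_ name: String) {
        lock.wait()
        defer { lock.signal() }
        activeDownloads.append(name)
    }

    // Breaks the rule: reads the array without taking the lock.
    // The compiler accepts this without any complaint.
    func unsafeCount() -> Int {
        return activeDownloads.count
    }
}
```

Nothing stops unsafeCount() from compiling, which is exactly the gap I'd like the compiler to close.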

General Approach

The basic idea is to wrap the critical property in something that keeps the value private, and forces you to access it only through specific methods that do the locking/synchronization for you.

Here's the basic definition. We're using a simple DispatchSemaphore to do the locking.
import Dispatch

class Synchronized<T> {
    private var value: T
    private let lock = DispatchSemaphore(value: 1)

    init(_ value: T) {
        self.value = value
    }
}
It only holds a single value, but you can put multiple things in there by putting them all in a struct. And, then you know they will always be updated together.
struct CriticalState {
    let title: String
    var things: [String] = []
    var subtleDerivedValue: Int = 0  // for example, this is the length of the things array
}
let criticalState = Synchronized(CriticalState(title: "example"))
Notice that value is private, so you can't access it directly.
let criticalString = Synchronized("test")
// This doesn't work:
print(criticalString.value)

error: 'value' is inaccessible due to 'private' protection level

So, you have to go through accessor methods.

Here is one that gives you "read access" to the wrapped value. It waits for the semaphore first, and signals it after it's done.

As a side note, defer is nice here for two reasons. First, it lets us return the output of the block without having to store it in a temporary variable. Second, it still signals the semaphore if we exit this scope because the block throws.

/// The unwrapped value is passed in to the given closure, as a read-only value.  And, you can
/// then calculate some value from it (the `R`), and return it from the closure, and
/// it's returned from the `.use()` method.
func use<R>(block: (T) throws -> R) rethrows -> R {
    lock.wait()
    defer { lock.signal() }
    return try block(value)
}
Here is how it looks to use it.
let criticalString = Synchronized("test")
let uppercasedString = criticalString.use { string in
  return string.uppercased()
}
print(uppercasedString)


TEST
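Here's a minimal sketch of the second point about defer: the semaphore is signaled even when the closure throws. The EmptyStringError type and the lengthAfterThrowingRead() function are hypothetical, made up for this example, and the wrapper is repeated so the snippet stands on its own.

```swift
import Dispatch

final class Synchronized<T> {
    private var value: T
    private let lock = DispatchSemaphore(value: 1)

    init(_ value: T) { self.value = value }

    func use<R>(block: (T) throws -> R) rethrows -> R {
        lock.wait()
        defer { lock.signal() }  // runs on normal return and on throw
        return try block(value)
    }
}

// Hypothetical error type, for illustration only.
struct EmptyStringError: Error {}

// Hypothetical helper: does one read that throws, then one that doesn't.
func lengthAfterThrowingRead() -> Int {
    let criticalString = Synchronized("")
    do {
        _ = try criticalString.use { string -> Int in
            guard !string.isEmpty else { throw EmptyStringError() }
            return string.count
        }
    } catch {
        print("closure threw: \(error)")
    }
    // If the semaphore had not been signaled on the way out, this second
    // call would wait forever.
    return criticalString.use { $0.count }
}

print(lengthAfterThrowingRead())
```

The second read goes through and returns 0, so the lock was released when the first closure threw.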
What about modifying the value? Let's add another method for that.
/// This method lets you pass a closure that takes the old value as an `inout` argument, so
/// you can use that when determining the new value (which you set by just mutating that
/// closure parameter).
/// - note: The lock is held during the whole execution of the closure.
func update(block: (inout T) throws -> Void) rethrows {
    lock.wait()
    defer { lock.signal() }
    try block(&value)
}
Which we use like:
let criticalString = Synchronized("test")
criticalString.update { string in
  string = "new value"
}
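Putting the pieces together, here's a sketch of the whole wrapper with a small concurrency check. The incrementConcurrently(iterations:) helper is hypothetical; I'm only using it to hammer on update() from several threads at once.

```swift
import Dispatch

final class Synchronized<T> {
    private var value: T
    private let lock = DispatchSemaphore(value: 1)

    init(_ value: T) { self.value = value }

    // Read access: the closure gets a read-only copy of the value.
    func use<R>(block: (T) throws -> R) rethrows -> R {
        lock.wait()
        defer { lock.signal() }
        return try block(value)
    }

    // Write access: the closure mutates the value in place.
    func update(block: (inout T) throws -> Void) rethrows {
        lock.wait()
        defer { lock.signal() }
        try block(&value)
    }
}

// Hypothetical helper, for illustration only: increments a shared counter
// from many threads and returns the final value.
func incrementConcurrently(iterations: Int) -> Int {
    let counter = Synchronized(0)
    DispatchQueue.concurrentPerform(iterations: iterations) { _ in
        counter.update { $0 += 1 }
    }
    return counter.use { $0 }
}

print(incrementConcurrently(iterations: 1_000))
```

Because each increment happens entirely inside update(), no increments are lost and the final count matches the number of iterations.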

So, that's the basics. It's pretty simple, really. But, there are some subtle details and limitations to explore a bit more.

Limitation with Reference Types

Astute readers might notice that this whole approach has a fundamental flaw if the wrapped value is a reference type. A caller could keep a reference to the wrapped value, and mess with it outside of the closure.

Unfortunately, I don't know of a great solution for this. If you have any ideas, please let me know! But, even so, I still think that this whole concept has some value, because it's at least one step closer to encouraging proper synchronization. You have to actively subvert it to break things, rather than simply forget to hold a lock at the write time (pun intended, sorry).

A simple example of this potential for abuse would be to just return the value from the .use() block.
class CriticalState { func mutate() { /* changes some internal state */ } }
let criticalState = Synchronized(CriticalState())
let unsafeReference = criticalState.use { $0 }
unsafeReference.mutate()  // <-- This is bad!
So, I actually think it's worth addressing this head-on, and making a method for this, but naming it with an appropriate warning.
/// - note: If the wrapped type is a reference type, you shouldn't use the return
/// value to modify it, because that won't be synchronized after this method returns.
func unsafeGet() -> T {
    lock.wait()
    defer { lock.signal() }
    return value
}

However, this operation is actually "safe" for value types, because the caller just gets a copy of the value, so it can't then affect the thing inside the wrapper. So, I would like to make two different "flavors" of this Synchronized wrapper: one for value types, and one for reference types. But I'm not sure if it's actually possible to express that with Swift's generics. Again, if you have thoughts/feedback, let me know!
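Here's a small sketch of the value-type case, to show why the copy is harmless. The copyDemo() function is hypothetical, and the wrapper is trimmed down to just the parts this example needs. It doesn't attempt to answer the generics question.

```swift
import Dispatch

final class Synchronized<T> {
    private var value: T
    private let lock = DispatchSemaphore(value: 1)

    init(_ value: T) { self.value = value }

    func unsafeGet() -> T {
        lock.wait()
        defer { lock.signal() }
        return value
    }
}

// Hypothetical helper, for illustration only.
func copyDemo() -> (copyCount: Int, wrappedCount: Int) {
    let things = Synchronized(["a", "b"])

    // Array is a value type, so this is an independent copy.
    var copy = things.unsafeGet()
    copy.append("c")  // only changes the local copy

    // The wrapped array still has its original two elements.
    return (copy.count, things.unsafeGet().count)
}

let result = copyDemo()
print(result.copyCount, result.wrappedCount)
```

The local copy grows to three elements while the wrapped array stays at two, so for value types the returned value can't be used to reach back into the wrapper.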

Further Questions and Issues

Locking

Using a DispatchSemaphore as the lock is not necessarily the best choice (but was a simple choice to illustrate the general pattern here). For further reading, start with this thread on Twitter. The major issue is priority inversion (like OSSpinLock) because, as Steve Weller points out, "Queues execute the waiters in order. Locks/semaphores use an unknown/unpredictable method to pick next", suggesting that a serial DispatchQueue would be better.
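As a rough sketch of what the serial-queue version might look like, here's the same wrapper with the semaphore swapped for a DispatchQueue. The class name QueueSynchronized, the queue label, and the queueIncrementDemo() helper are all made up for this example; this is just one way it could be done.

```swift
import Dispatch

// Hypothetical name: a variant of the wrapper backed by a serial queue.
final class QueueSynchronized<T> {
    private var value: T
    // The label is arbitrary; it only shows up in debugging tools.
    private let queue = DispatchQueue(label: "example.synchronized")

    init(_ value: T) { self.value = value }

    // sync() runs the closure on the serial queue and returns its result.
    func use<R>(block: (T) throws -> R) rethrows -> R {
        return try queue.sync { try block(value) }
    }

    func update(block: (inout T) throws -> Void) rethrows {
        try queue.sync { try block(&value) }
    }
}

// Hypothetical helper, for illustration only.
func queueIncrementDemo() -> Int {
    let counter = QueueSynchronized(0)
    DispatchQueue.concurrentPerform(iterations: 1_000) { _ in
        counter.update { $0 += 1 }
    }
    return counter.use { $0 }
}

print(queueIncrementDemo())
```

One thing to watch: calling use() or update() from inside one of these closures would call sync on the queue it's already running on, which deadlocks. The semaphore version has the same re-entrancy problem.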

But both of those lock all access, even preventing two concurrent readers, which seems like it should be supported. So, something like a pthread_rwlock_t can allow multiple readers as long as nothing is writing.
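Here's a sketch of how a pthread_rwlock_t version might look, assuming use() takes the read lock and update() takes the write lock. The class name RWSynchronized and the readerWriterDemo() helper are hypothetical, and I haven't measured whether this is actually faster in practice.

```swift
import Foundation

// Hypothetical name: a reader-writer variant of the wrapper.
final class RWSynchronized<T> {
    private var value: T
    // Heap-allocated so the lock keeps a stable address for the pthread calls.
    private let rwlock: UnsafeMutablePointer<pthread_rwlock_t>

    init(_ value: T) {
        self.value = value
        rwlock = UnsafeMutablePointer<pthread_rwlock_t>.allocate(capacity: 1)
        rwlock.initialize(to: pthread_rwlock_t())
        let status = pthread_rwlock_init(rwlock, nil)
        precondition(status == 0, "pthread_rwlock_init failed")
    }

    deinit {
        pthread_rwlock_destroy(rwlock)
        rwlock.deinitialize(count: 1)
        rwlock.deallocate()
    }

    // Readers share the lock, so several can be inside use() at once.
    func use<R>(block: (T) throws -> R) rethrows -> R {
        pthread_rwlock_rdlock(rwlock)
        defer { pthread_rwlock_unlock(rwlock) }
        return try block(value)
    }

    // Writers get exclusive access.
    func update(block: (inout T) throws -> Void) rethrows {
        pthread_rwlock_wrlock(rwlock)
        defer { pthread_rwlock_unlock(rwlock) }
        try block(&value)
    }
}

// Hypothetical helper, for illustration only: mixes writes and reads.
func readerWriterDemo() -> Int {
    let things = RWSynchronized([Int]())
    DispatchQueue.concurrentPerform(iterations: 1_000) { i in
        things.update { $0.append(i) }  // exclusive write lock
        _ = things.use { $0.count }     // shared read lock
    }
    return things.use { $0.count }
}

print(readerWriterDemo())
```

The call sites look the same as before; only the locking inside the wrapper changes.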

In a future post, I'll explore different locking implementations, with a way to easily swap them out on a case-by-case basis.

Naming Options

I'm not sure what name is best (for both the whole class and the accessor methods). Some ideas: for the whole class, class ThreadSafe, class Critical, or class Protected; for the use() method, func read() or func with(); and for the update() method, func write() or func mutate().

Since it's most important to consider how they look/feel at the call site, here are a few examples:

Here's the setup:
struct Item {}  // placeholder type for the example
struct State {
    var statusMessage: String = ""
    var items: [Item] = []
}
let state = Synchronized<State>(State())
Accessing with "read" and "write"
let message = state.read { $0.statusMessage }
state.write { state in
    state.items.append(Item())
}
Accessing with "use" and "update"
let message = state.use { $0.statusMessage }
state.update { state in
    state.items.append(Item())
}

Again, I'm open to suggestions, so let me know what you think!

Thanks

Thanks to Jacob, James, and the rest of the folks at Upthere for giving me early feedback about this idea during a discussion over lunch. And preemptively to anybody who has any feedback!