Atomic Attributes in Local-First Sync

When we set out to build device-sync in Muse, we had a very strict requirement at the outset: it should be local-first.

Most apps work by storing all of your data on the server, sending only small pieces of it to each device as you use it. Then, if a document is modified on two devices at once, causing a conflict, the server acts as the mediator and chooses the winner.

With local-first sync, the world gets much more complicated. Every device you use contains a full copy of your data. If you edit the same document from multiple devices, there is no server to mediate which device wins. Instead, both devices get a full copy of each other’s changes, and they need to independently resolve to the same state.

The Sync Protocol

When building the sync framework for Muse, we not only want to support Muse’s current feature set, but also any features we may build in the future. Ideally, implementing new features in Muse won’t require new development on the sync protocol.

Sync is hard. There’s no way around it: it’s a tough problem to solve, so an important priority for us was finding ways to simplify the problem. Our final solution is a bit counter-intuitive in this regard: we decided to build a sync protocol that is application agnostic.

It sounds like building sync for only Muse would be simpler than building a general sync engine, but the opposite is true. Thinking about the general case has forced us to develop a simple protocol based around tiny atomic changes – which we’ll get to soon.

Importantly, this lets us iterate on the sync protocol without making any changes to Muse logic, and also allows us to build new Muse features without needing to add to the sync protocol at all. This decoupling simplifies development dramatically: an engineer working on Muse features doesn’t need to know the intricacies of the sync engine, and an engineer working on sync doesn’t need to know the details of a particular Muse feature implementation. This reduces complexity and reduces bugs.


Since the server won’t mediate which device’s changes win in the case of a conflict, each device needs to resolve conflicts itself. And since all devices might not be online at the same time, it’s important that every device resolves a given conflict to the exact same result. We’ve decided to use the very simple last-write-wins method: in a conflict, whichever device made the most recent edit wins, and its edits are used.

We use a hybrid logical clock (HLC) to determine the order of events between devices. These are built from the device’s wall clock, a counter, and the device id. We also considered a vector clock, which has one benefit over an HLC: it can differentiate between edits made sequentially and edits made in parallel. In the case of a conflict, that would give us the option to ask the user to choose between the two conflicting paths.

We decided on HLCs for a few reasons: vector clocks grow in size over time, while HLCs are a fixed size. And our edits are so small that asking the user to choose between conflicts is never an interaction we’d want to build into Muse, so that benefit is marginal at best.
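To make the mechanics concrete, here is a minimal Swift sketch of a hybrid logical clock. The names and field sizes are assumptions for illustration, not Muse’s actual implementation: the clock advances on local events, merges with remote clocks on receive, and compares by (timestamp, counter, deviceId) to give every device the same total order.

```swift
import Foundation

// Minimal hybrid logical clock sketch (illustrative, not Muse's implementation).
struct HLC: Comparable {
    var timestamp: UInt64   // wall-clock milliseconds
    var counter: UInt16     // tie-breaker for events in the same millisecond
    var deviceId: UUID      // final tie-breaker, making the order total

    // Advance the clock for a local event.
    mutating func tick(now: UInt64) {
        if now > timestamp {
            timestamp = now
            counter = 0
        } else {
            counter += 1
        }
    }

    // Merge in a clock received from another device, so our next
    // event is ordered after everything we've seen.
    mutating func receive(_ remote: HLC, now: UInt64) {
        let latest = Swift.max(timestamp, remote.timestamp, now)
        if latest == timestamp && latest == remote.timestamp {
            counter = Swift.max(counter, remote.counter) + 1
        } else if latest == timestamp {
            counter += 1
        } else if latest == remote.timestamp {
            counter = remote.counter + 1
        } else {
            counter = 0
        }
        timestamp = latest
    }

    static func < (lhs: HLC, rhs: HLC) -> Bool {
        (lhs.timestamp, lhs.counter, lhs.deviceId.uuidString)
            < (rhs.timestamp, rhs.counter, rhs.deviceId.uuidString)
    }
}
```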

Another option would have been to use something like Operational Transforms. This method requires tight coupling to the application data structure itself. The number of transforms can grow enormous for large data structures, and these are significantly more complicated to reason about compared to last-write-wins.

For the above reasons, we decided to use fixed-size HLCs, small data changes, and a last-write-wins methodology. This gives us an application agnostic sync strategy that’s very easy to reason about.

Our final strategy was inspired by the work many of the Muse team did while at Ink-and-Switch, James Long’s work on the Actual Budget app, and the Datomic distributed database.

Atomic Changes

In order for the most recent write to win, we needed a definition of what actually constitutes a write. We wanted each distinct write to be as small as possible, so that any conflict affects the smallest possible piece of the data structure. We want the inevitable conflicts to be small and self-healing.

The smallest change to the data-model would be to change the value of a single attribute of a single object, so at a minimum our change tracking should show:

| Object Id | Attribute | Value | Clock |
| --- | --- | --- | --- |
| UUID object identifier | String attribute name | Int, Double, String, or Binary | Hybrid Logical Clock |

We also need to know which device sent the changes. If a device is lost or stolen, we’d like the ability to ignore that device’s changes after some given clock value. We also plan for encryption, so we’d need to know which encryption keys encode/decode a given message, as each device would have its own key-pair.

The ability to add sharing and collaboration is on our minds as well. We want to provide an easy way for a user to share some portion of their data with someone else, without needing to share their entire corpus or share every application object individually. So we introduced the concept of Scopes. A Scope is identified by a UUID, holds any number of objects, and in practice maps 1:1 to a Muse document. This also means that each object in the database is now addressed by both its Scope Id and Object Id.

So our final model for an atomic change is:

| Scope Id | Object Id | Attribute | Value | Clock | Device Id |
| --- | --- | --- | --- | --- | --- |
| UUID | UUID | String name | Int, Double, String, or Binary | HLC | UUID |

Each row in the database is called an Atom, and is the smallest atomic change that can be made in the sync engine. Each device sends its list of Atoms to other devices, and the most recent change per scopeId, objectId, and attribute defines the value for that property.
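A minimal Swift sketch of this model and the last-write-wins resolution might look like the following. The field names are illustrative, and a plain integer stands in for the hybrid logical clock:

```swift
import Foundation

// Illustrative Atom row (field names are assumptions, not Muse's schema).
struct Atom {
    let scopeId: UUID
    let objectId: UUID
    let attribute: String
    let value: Data       // encoded Int, Double, String, or binary
    let clock: UInt64     // a hybrid logical clock in practice
    let deviceId: UUID
}

// Last-write-wins resolution: for each (scope, object, attribute) key,
// the atom with the greatest clock wins. Every device running this over
// the same set of atoms reaches the same state.
func resolve(_ atoms: [Atom]) -> [String: Atom] {
    var winners: [String: Atom] = [:]
    for atom in atoms {
        let key = "\(atom.scopeId)/\(atom.objectId)/\(atom.attribute)"
        if let current = winners[key], current.clock >= atom.clock { continue }
        winners[key] = atom
    }
    return winners
}
```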

Mapping the Sync-Schema to Swift

Pre-sync, Muse is built on top of CoreData. This gave us numerous advantages, the first being that the SQLite schema used by CoreData is automatically translated to Swift classes. We don’t need to manually query SQLite and populate objects ourselves; all of that database logic is handled under the hood by CoreData.

Property wrappers in Swift let us do something similar for our own sync engine. Below is an example class definition for a sync-backed object type.

class Board {
    let scopeId: UUID
    let objectId: UUID
    @Merged var title: String
}

class Card {
    let scopeId: UUID
    let objectId: UUID
    @Merged var position: CGPoint
    @Merged var size: CGSize
    @Merged var board: Board
}

The @Merged property wrapper automatically saves and loads its property from the SQLite database. The wrapper requires that the property’s type implement a protocol to facilitate that translation – the value must convert to and from an Int, Double, String, or opaque binary data. For the String title, if device A at time 1 makes the change myBoard.title = "some value", that will generate the sync change below:

| Scope Id | Object Id | Attribute | Value | Clock | Device Id |
| --- | --- | --- | --- | --- | --- |
| scopeId | objectId | “title” | “some value” | 1 | A |
I’m using simple integers for the clock in the example, but in practice this is a hybrid logical clock.
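As a rough sketch of what such a wrapper could look like, the following stands an in-memory dictionary in for the SQLite atom table. The protocol and store names here are hypothetical; the real wrapper would also stamp each write with an HLC:

```swift
import Foundation

// Hypothetical protocol: a value that converts to/from opaque binary data.
protocol AtomValue {
    func encode() -> Data
    init?(from data: Data)
}

extension String: AtomValue {
    func encode() -> Data { Data(utf8) }
    init?(from data: Data) { self.init(data: data, encoding: .utf8) }
}

// In-memory stand-in for the SQLite atom table: attribute name -> value.
final class AtomStore {
    var rows: [String: Data] = [:]
}

// Simplified @Merged-style wrapper: assignments are persisted to the
// store, and reads decode the stored bytes back into the Swift value.
@propertyWrapper
struct Merged<Value: AtomValue> {
    let attribute: String
    let store: AtomStore

    var wrappedValue: Value {
        get { Value(from: store.rows[attribute]!)! }
        set { store.rows[attribute] = newValue.encode() }
    }
}
```

A property could then be declared as `@Merged(attribute: "title", store: store) var title: String`, keeping the persistence machinery out of the application code entirely.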

This change arrives as a row in the SQLite database. For cards, the position and size properties are more complicated than the board’s title, as they contain two Doubles instead of one String. While we could store such a structure as a single change inside a binary blob, it’s much easier and clearer to store each subpart separately.

To support such a world, some Swift types such as CGPoint and CGSize are composed of two CGFloat atomic changes. So setting myCard.size = CGSize(width: 200, height: 200) will generate the following two Atoms:

| Scope Id | Object Id | Attribute | Value | Clock | Device Id |
| --- | --- | --- | --- | --- | --- |
| scopeId | objectId | “size.width” | 200 | 2 | A |
| scopeId | objectId | “size.height” | 200 | 3 | A |

Since each Swift type can be either a single atomic row, or be composed of multiple atomic rows, this lets us define an arbitrary Swift data structure that can sync atomically between devices, and the property wrapper handles the translation between Swift and Sqlite automatically.

Those changes can even be built up by multiple layers of Swift types, with CGRect being a simple example. The following pseudocode describes how changes for complex types can be built from changes to simpler types.

// pseudocode for simplicity, as attribute names would be
// separated with '.' for clarity

// CGFloat returns a single atomic change
CGFloat: AtomicChanges {
    func changes(for prefix: String) -> [Atom] {
        return [Atom(attribute: prefix, value: self)]
    }
}

// CGPoint and CGSize return two changes – the changes from each CGFloat part
CGPoint: AtomicChanges {
    func changes(for prefix: String) -> [Atom] {
        return x.changes(for: prefix + "x") + y.changes(for: prefix + "y")
    }
}

CGSize: AtomicChanges {
    func changes(for prefix: String) -> [Atom] {
        return width.changes(for: prefix + "width") + height.changes(for: prefix + "height")
    }
}

// CGRect is built from 4 changes, 2 from each of its origin and size
CGRect: AtomicChanges {
    func changes(for prefix: String) -> [Atom] {
        return origin.changes(for: prefix + "origin") + size.changes(for: prefix + "size")
    }
}
If we had defined a Card.frame property, its CGRect type would be built from CGPoint and CGSize types, which build CGFloat Atoms.

If we had defined the Card with a single frame property:

class Card {
    let scopeId: UUID
    let objectId: UUID
    @Merged var frame: CGRect
    @Merged var board: Board
}

Then we could edit a card with myCard.frame = CGRect(x: 125, y: 145, width: 200, height: 300), which generates four Atoms:

| Scope Id | Object Id | Attribute | Value | Clock | Device Id |
| --- | --- | --- | --- | --- | --- |
| scopeId | objectId | “frame.origin.x” | 125 | 4 | A |
| scopeId | objectId | “frame.origin.y” | 145 | 5 | A |
| scopeId | objectId | “frame.size.width” | 200 | 6 | A |
| scopeId | objectId | “frame.size.height” | 300 | 7 | A |

To load an object, we can search SQLite for all rows grouped by scopeId, objectId, and attribute, taking the row with the maximum clock in each group. The property type can then decode the winning Atoms into the final Swift value recursively, mirroring the way the changes were generated.
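As a concrete sketch, the “latest atom per key” load can be expressed as a single query. The table and column names here are assumptions, not Muse’s actual schema; a correlated subquery picks the winning clock per attribute:

```swift
// Hypothetical schema: atoms(scope_id, object_id, attribute, value, clock, device_id).
// For one object, fetch each attribute's value from the row with the max clock.
let loadObjectSQL = """
SELECT attribute, value
FROM atoms AS a
WHERE scope_id = ?
  AND object_id = ?
  AND clock = (SELECT MAX(clock)
               FROM atoms
               WHERE scope_id = a.scope_id
                 AND object_id = a.object_id
                 AND attribute = a.attribute)
"""
```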

Referencing another object simply saves that object’s scope and object ids as the binary value. So setting myCard.board = myBoard will generate the following two Atoms:

| Scope Id | Object Id | Attribute | Value | Clock | Device Id |
| --- | --- | --- | --- | --- | --- |
| scopeId | objectId | “board.scopeId” | board’s scopeId data | 8 | A |
| scopeId | objectId | “board.objectId” | board’s objectId data | 9 | A |
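Following the same composition convention as the pseudocode above, a reference could emit its two atoms like this. The (attribute, value) pair shape and the byte encoding are illustrative assumptions:

```swift
import Foundation

// Minimal Board, matching the class from the earlier example.
class Board {
    let scopeId: UUID
    let objectId: UUID
    init(scopeId: UUID, objectId: UUID) {
        self.scopeId = scopeId
        self.objectId = objectId
    }
}

// Sketch: a reference serializes as two atoms, its scope and object ids,
// each encoded as the UUID's raw 16 bytes (an illustrative encoding).
extension Board {
    func changes(for prefix: String) -> [(attribute: String, value: Data)] {
        [(prefix + ".scopeId", withUnsafeBytes(of: scopeId.uuid) { Data($0) }),
         (prefix + ".objectId", withUnsafeBytes(of: objectId.uuid) { Data($0) })]
    }
}
```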

This simple SQLite schema storing Atoms in a table and translating these to Swift types with property wrappers gives us enormous flexibility in our application’s data structures while keeping the sync logic itself very simple.

The Sync Foundation

Now we have a flexible and powerful local-first sync framework. The synced data is composed of Atoms – application-agnostic atomic changes – and from these small changes each client can resolve to the exact same state as every other client, thanks to the hybrid logical clock.

From this list of changes, we can build up complex application-specific data types, and Swift property-wrappers let us hide the underlying sync implementation from the application logic.

While this post describes the structure of the synced changes into Atoms, it doesn’t describe how those Atoms are actually synced across devices. I may cover this in a future post, but the important piece is done – defining what it is that actually needs to be synced. The sync itself can happen over any number of pathways – peer to peer, a central server, flag semaphore or morse code. No matter the pathway, all synced devices will agree on the state of the application thanks to the simple sync data structure and hybrid logical clocks.

Local First at Muse

If you’ve read this far, then you’re a good fit for Muse – we’re hiring a local-first engineer!