In late 2019, the Interchain Foundation and Katalysis agreed to bring the Cosmos SDK to Swift. We aim to enable the building of a fully fledged blockchain application using the Tendermint consensus in Swift. For our use case, the specific target platform is Linux, however, our intent is also to also eventually provide support for iOS.
The porting of the Cosmos SDK to Swift has been planned in three parts, each of which builds one the previous ones.
The first milestone is all about developing the structures to store the state. This is provided by Merkle trees. In Tendermint, and by extension the Cosmos SDK, the state is stored on iAVL+ trees². iAVL+ trees are AVL trees with two additional properties: (1) they are merkelized³, and (2) the nodes are immutable allowing their full history to be kept.
The second milestone aims to implement the higher level building blocks for CosmosSwift, such as the framework connecting the various modules comprising a blockchain app (Framework, Store, Bech32) and some basic modules as well (Auth, Params).
Specific goals for the first milestone
While designing the implementation of the first milestone, we set ourselves the following goals:
- Think about the “Swiftiness” of the implementation as much as replicating a set APIs. In particular, refrain from mapping one to one the Swift implementation to the Go implementation.
- The implementation should allow (1) flexibility to choose Storage for our iAVL tree, (2) flexibility in the Encoding for serialisation/deserialisation of tree nodes and proofs between nodes, and last but not least, (3) flexibility in the keys, values and hashing functions used for our tree payloads.
- It should have strong typing and avoid optionals as much as possible, as well as provide strong immutable semantics.
Some of the design decisions deriving from these goals will likely evolve as we start using the artifacts of this first milestone to build the core parts of the framework as part of our next milestone.
Nodes and Trees
In order to provide the flexibility of storage type and behaviour on the one hand and the flexibility of encoding on the other hand, we make heavy use of Swift protocols. In fact, we define two central protocols, the `
NodeStorageProtocol and the
NodeProtocol defines how tree nodes make the tree and allow navigation through the tree. Nodes are intended to be immutable such that the integrity of the hashes is preserved by design. A root node is therefore the entry point to an immutable tree structure, equivalent to the
NodeStorageProtocol defines how a tree is built and maintains its iAVL properties of being balanced and properly secured through its hashes. In fact, custom storage implementations are not expected to implement any of the logic for adding or removing nodes from the tree, nor any balancing logic.
Implementing these two connected protocols allows us to provide a variety of storage implementations for specific use cases. As a starting point, two storage implementations are currently provided.
The first one is an in memory implementation of the tree, where sequential versions of the tree are committed. While the two core protocols
NodeProtocol comply to
Codable and could be stored on disk as a file, the
InMemoryNodeStorage and corresponding
InMemoryNode classes are not intended for persistent storage.
The second one is an SQLite backed storage, either in memory or persisted to disk, and it provides lazy access to (partial) versions of the tree stored on disk. The
SQLiteNode classes do not provide any elaborate node cache management nor any pruning strategies. However, such features are very simple to add to these classes.
NodeProtocol describes the structure of iAVL+ nodes (both leaves and inner nodes) and how nodes tie to one another conceptually. We define two levels of primitives. At the lowest level, we have the following core primitives:
We then use them to implement higher level primitives⁵ to access the intrinsic properties of leaves (hash, key, value) or inner nodes (hash, height, size, balance, left node, right node), to traverse and iterate over the tree or to find nodes in the tree.
This allows us to enforce how nodes describe the tree structure and the tree changes over its lifetime.
A node instance is expected to be immutable. Any change to the structure of the tree happens by “orphaning” a node and replacing it with another one. Given that this may require performing certain storage operations (due for instance to caching changes before storing them to disk), all operations related to constructing or updating the tree are delegated to the
While the protocol does not enforce immutability of the structures conforming to the protocol, we expect that a node’s
size are set upon creation of the node.
NodeProtocol compliant structure also benefits from the implementation of the Merkle proof for a given leaf node or a range of leg nodes.
NodeStorageProtocol lives hand in hand with the
NodeProtocol. It abstracts how the tree is built and maintained, and by implication how and when the nodes are created and stored. There are three important tree level operations which are provided for structures conforming to the
recursiveRemove() are respectively setting (adding or updating) or removing nodes.
balance() performs the balancing of the tree upon setting or removing of nodes. It is interesting to note that the balancing algorithm has been unrolled (we do not factor out
rightRotate() functions) to remove the need for creation of temporary nodes⁶.
Having these primitives as part of the
NodeStorageProtocol instead of the
NodeProtocol allows that structure to keep track of the changes to the tree between versions.
A structure complying to the
NodeStorageProtocol is the entry point to versioned iAVL+ trees. Therefore the protocol provides ways to access the root node at a specific version. The protocol also provides, as syntactic sugar, access to the underlying node primitives to search and traverse the tree (which are provided by the
Node associated type implementation).
Keys, Values and Hashers
NodeStorageProtocol and the
NodeProtocol make use of three associated types. These types are
Providing associated types gives us the flexibility to allow specialisation of the Storage and Node implementations to the specific uses cases we are working with. In particular, it gives us the capability to delay the choice of specific data structures until we have the need to optimise them.
Keys and Values
Keys to be
Comparable, so that we can decide where a new value will be added in the tree.
Compliance to the
Codable protocol for both
Values is to allow serialisation and deserialisation of the tree for the purpose of sharing the tree or parts of it across blockchain nodes.
Finally, compliance to the
DataProtocol & InitialisableProtocol is required to allow both
Values to be implemented using optimised bytes type of structures if needed. We basically want to leave the specific choice of structure for bytes to the implementer of the tree.
Hasher complies to the
HasherProtocol which encapsulates the hashing function required to calculate the Merkle hashes for each node (depending on whether the node is a leaf or an inner node).
HasherProtocol also provides a
Hash associated type. Between
Hash, this allows us to clearly express and statically check the different types used throughout the tree implementation.
A default implementation (as provided as part of the test suite) uses a SHA256 hash, however, moving to any other implementation is a trivial specialisation of the Node (and therefore Storage) using the relevant structure implementing the
Putting all of this together
To bring this all together, we will show in an upcoming post how to implement a custom Storage. For instance, one which provides a configurable caching strategy. As part of the caching strategy, we could envisage limiting the amount of memory used, or providing a sparse persistent storage of versions of the tree.
The code lives in the CosmosSwift github organisation, and is released with an Apache License.
We welcome contributions from anyone, so long as they respect the rules set forth in each repository. In the next few months, however, we expect the code to evolve rapidly⁷ and would request contributors to bear with us until we can freeze the APIs.
In the meantime, feel free to reach out to us with questions, suggestions, and feedback. We will be delighted to hear your views!
You can contact us via our dedicated Github repos, or by email at CosmosSwift [_a_t_] katalysis.io.
: The implementation is partial due to the uncertainty around Amino over the last months. In itself for a fresh implementation, we do not believe it is a problem as Amino is used for two purposes: 1. Hash generation for the tree nodes, 2. Serialization and deserialization of internal structures. The former is important for tree consistency and the later for inter node communications.
: The Tendermint implementation can be found here: [https://github.com/tendermint/iavl]
: A merkelized tree is a tree where each node contains a hash based on its direct children in the case of an inner node and its payload in the case of a leaf. (see Wikipedia/Merkletree)
: The specifics of which parts of the IBC protocol are dependent on where the IBC specification is at time of implementation.
: The current implementation mostly discourages own implementations of the primitives we provide as they are defined as static dispatch. Depending on the use cases, we are open to moving them in the main protocol requirement at a later stage.
: Furthermore, it allows us to enforce the calculation of the hash, height and size on creation of the node rather than delay until we know that the node is really part of the final tree.
: Our current target for completing milestone 2 is Q320, and for completing milestone 3, Q420.