The Squall assets sync library

One of the latest additions to the tools used by the Worldforge project is the Squall library. This small library aims to provide a one-stop solution to Worldforge for any kind of assets synchronization.

We had a couple of different use cases that we wanted to support.

Our clients should be able to provide a base set of assets. But each server should then be able to update the assets used.
We think that it should be possible to create many different games and worlds with Worldforge. And since our assets all are free we think that most worlds would reuse a lot of the same assets, while adding a couple of new ones. It would be nice if the client could reuse any assets it already had downloaded.
We’ve put a lof of efforts into allowing our game servers to be updated live, whilst running. It’s possible to add and alter rules as well as the world in general. This should also apply to any assets; if assets are added, removed of modified this should automatically be reflected in any connected client, without the client having to reconnect or restart.
If possible we would want to avoid creating a complex protocol for synchronizing assets. There’s a value in having less complexity added to the existing protocol.
There are already a slew of well-used network protocols that can be used for transferring files. For example HTTP. We would like to be able to reuse these, which also would allow for the usage of standard web servers for hosting content.
It would also be nice if the tool was usable outside the Worldforge ecosystem.
To avoid complexity we wanted to treat asset synchronization as a binary state. Either the client have all required assets, or they don’t. We don’t want to handle any intermediate states.
Listings of files should use a simple text based format, just to make it less complex.

All of these different requirements led us to focus on making a simple protocol and solution. Our goal is the transfer and synchronization of game assets, which all are expressed as files. Each file isn’t overly large.

Taking inspiration from existing solutions such as Git and BitTorrent we’ve created a system based on hashing of file contents. Each asset, defined as a file, has a unique hash generated. Synchronization of assets is a simple action of checking if the client already have the specific hashed content stored. It not the client fetches it from the remote.

Hashing also extends to directories, where the hashed value of each directory entry is used as input to the final hash value of the directory. This creates a hierarchy of hashed values, where a change in a child item is propagated up towards the top level. This very similar to how Git calculates any current hash based on the complete history.

Since the hash of each directory is dependent on all of its children we have a nice way of communicating the complete “state” of the assets by just specifying the hash of the top level directory. The clients then check if they already have this asset in their local storage. If not they need to fetch it from the server and then go through all of the assets described in the listing.

It’s perhaps easiest to show this with an example. Consider a simple file tree of assets like this one below. Directories are marked with asterisks (*).

- root*
  - fur.png
  - grass.png
  - *materials
    - wooden.material
    - leaves.material
    - *rocks
      - shale.material
      - shale_texture.png

For each file we’ll calculate a hash. In this example we’ll simplify the hash to two characters.

- root* : A1
  - fur.png : A2
  - grass.png : A3
  - *materials : A4
    - wooden.material : A5
    - leaves.material : A6
    - *rocks : A7
      - shale.material : A8
      - shale_texture.png : A9

When the client connects to the server it receives information on the top level hash, which in this case is “A1”. If this already exists on the client we know that the client already has all data it needs. If not, the client begins asking the server to transmit the data needed.

If we now make any alterations to the data, for example editing the “shale.material” file, a new hash will be generated for that file. This in turn trigger the parent directory to also update its hash, along with all directories all the way up to the root. It might look like this now.

- root* : B1
  - fur.png : A2
  - grass.png : A3
  - *materials : B4
    - wooden.material : A5
    - leaves.material : A6
    - *rocks : B7
      - shale.material : B8
      - shale_texture.png : A9

Instead of transmitting that we altered the “shale.material” file we’ll just transmit that the “root” directory now has a new hash (“B1”). This will trigger all clients to ask for new data in turn, which will make them discover that “shale.material” has been updated, and act accordingly.

Likewise, if we add or remove any file.

The end result is quite nifty, in that we now have a simple protocol for transmitting assets state.

The code is available on Github. We allow the Squall library to be built either as part of Worldforge or as a separate component. We’ve also licensed it under the MIT license to make it more available.