Rethinking Attorney Online assets

It seems fairly obvious that FanatSors was not expecting a complex level of gameplay when he released the Delphi-made Attorney Online back in 2012.

Yet AO still exists, with about 150-200 daily players who frequent the couple dozen servers, with the most popular being /aog/’s Attorney Online Vidya. OmniTroid’s Qt-based, open-source AO2 client is now the de facto client, touting “advanced” features such as color, Unicode support, and parametrized preanimations. Likewise, the open-source tsuserver3 and esoteric-but-still-open-source serverD are the two choices of hosting for AO. Today, however, the legends of FanatSors and OmniTroid have faded away.

https://i.imgur.com/V30YtuL.png
An overview of the Attorney Online family.

Most players and case-writers are regularly impacted by the technical limits and quirks of the engine. Each character is configured in a single INI file, which defines each emote as an octothorpe-delimited sequence of animations to be played. Each animation refers to a GIF file prefixed by (a) or (b); that is, the format and naming scheme must be exact at the file system level.

This is not the main challenge, however. The challenge is managing assets.

Every server asks its users to download an archive, spanning up to 7 GB in size, containing character sprites, music tracks, backgrounds, evidence images, and sound effects that may be needed during gameplay. Assets are identified only by their folder name; nothing else distinguishes one asset from another. These are the problems with using an internal name as the sole identifier:

  • Two servers may offer different content, but under the same internal name, causing a hard clash. Content could be isolated per-server, but this causes a serious redundancy problem.
  • Two servers may offer the same content, but under different internal names. This causes excessive redundancy.
  • Two servers may offer just about the same content, with a small difference. In this case, there is no hierarchy established as to which one is derived from the other one.

Upon requesting a character list for my proposed new standard base, nuVanilla (and receiving a monumental list!), I felt that the assets needed to be examined along too many dimensions for a conventional spreadsheet, so I opted for a full-blown database. My choice was split between MySQL/MariaDB and PostgreSQL, but I remembered that I wanted to learn Postgres, as its performance and versatility were claimed to be far greater than what MariaDB could offer.

One immediate issue is the sheer number of many-to-many relationships that manifest, causing an effective decoupling of many columns:

  • Multiple packs can include the same asset.
  • The same asset could be under different internal names.
  • Multiple assets can have the same internal name.
  • Multiple assets can represent the same character.
  • Assets of the same character can come from different games.
  • Assets can be in different formats, such as 256×192 or even 1280×720 (yes, some people resize their sprites to match their theme’s viewport size).

How do I represent uniqueness of assets, then? I can’t even hash the char.ini, because the char.ini contains the internal name of the asset. What’s more, there is no standard way to hash multiple files at a time; in this case, I would want to hash all of the emote images at the same time. (For now, however, I am hashing the char.ini until I find an adequate solution.)
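One possible way around the multi-file problem is to hash each file on its own and then hash the sorted list of (relative path, file digest) pairs. The sketch below is only an illustration of that idea, not something AO or any tool does today; the function name `content_digest` and the choice of SHA-256 are my own assumptions. Because only paths relative to the asset folder are fed in, the folder (internal) name does not affect the result.

```python
import hashlib


def content_digest(files: dict[str, bytes]) -> str:
    """Digest a set of files, ignoring the asset's folder (internal) name.

    `files` maps paths relative to the asset folder to file contents.
    Hypothetical helper: the name and the scheme are assumptions.
    """
    outer = hashlib.sha256()
    # Sort the paths so the result does not depend on iteration order.
    for rel_path in sorted(files):
        inner = hashlib.sha256(files[rel_path]).digest()
        encoded = rel_path.encode("utf-8")
        # Length-prefix the path so adjacent fields cannot run together.
        outer.update(len(encoded).to_bytes(4, "big"))
        outer.update(encoded)
        outer.update(inner)
    return outer.hexdigest()
```

Two copies of the same emote images stored under different folder names would then produce the same digest, as long as the file names inside the folder match.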

One solution would be to give every asset a UUID. This would, in theory, add an additional layer of “uniqueness” to each asset. However, this still does not resolve the original problem: two assets with the same content but different internal names would still be detected as “different” upon submission, since the hash of each char.ini would be different. And this would compound a new problem on top of the old one: modifiers of an asset would be burdened with updating the UUID of the asset they are editing, and forgetting to do so would surface only as an error when uploading it to some centralized database.

What modifications can be done to an asset?

  • A small correction to frame data – minimal change
  • Emote additions – significant change
  • Internal name change – minimal change
  • SFX name change – minimal change

Three out of four of these changes are minimal changes. Thus, it would not make sense to consider them completely different assets. We can try to establish a hierarchy of assets, to see which asset succeeded the other, but that is no substitute for a diff. The data stored remains redundant.

Therefore, I can conclude that Attorney Online assets cannot be uniquely and accurately identified for management, and attempting to set up a database to manage them would take me nowhere.

I should then refocus my efforts on designing the asset structure in Animated Chatroom.

Each asset would have a definition file (such as char.json), which would state the name of the character, its ancestors, and a reference to the sprite file. The format of the definition file would probably be JSON, while the sprite file would then be written in something like Spritelang.
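To make this concrete, here is a rough sketch of what such a definition file might look like and how it could be checked when loaded. Nothing here is a finalized format: the field names `name`, `ancestors`, and `sprite`, the placeholder ancestor ID, and the sprite path are all assumptions made for illustration.

```python
import json

# Hypothetical field names and types; the real format is not decided yet.
REQUIRED_FIELDS = {"name": str, "ancestors": list, "sprite": str}


def parse_definition(text: str) -> dict:
    """Parse a char.json-style definition and check the assumed fields."""
    data = json.loads(text)
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(
                f"definition field {field!r} must be a {expected.__name__}"
            )
    return data


# An illustrative definition; every value below is a placeholder.
example = """
{
  "name": "Phoenix Wright",
  "ancestors": ["parent-asset-id-placeholder"],
  "sprite": "sprites/phoenix.sprite"
}
"""
definition = parse_definition(example)
```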

The asset is then bundled using tar and signed using GPG. This verifies the identity of the packager of the asset. For increased trust, the packager’s key can be cross-signed by a responsible admin, whose key is in turn cross-signed by the Animated Chatroom Root Key. All of these keys can be uploaded onto a general-purpose key server, like pgp.mit.edu. A package need not be signed by the official AC root key itself; any key that is cross-signed by someone on the keychain will do.

The absence of a signature does not mean that the asset contains malicious content and therefore cannot be trusted; rather, the purpose of the signature is to assure that the contents of an asset have not been modified, and to seal the credits of an asset.

After the tar has been signed, it can be compressed in a desired format such as xz (which uses the same LZMA family of compression as 7-Zip, but as a plain byte stream rather than a multi-file archive). In this case, the unique identifier of the asset is its GPG signature, together with the key ID of the signer. This is stronger than a plain hash: not only is the data factored in, but also the identity of the individual who authored the asset.
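The bundling and compression steps can be sketched with Python’s standard tarfile and lzma modules. This is a minimal sketch under my own assumptions: the helper names `bundle` and `compress` are hypothetical, file order and timestamps are fixed so that identical content always yields identical tar bytes, and the signing step itself is left out because it needs a real key (in practice it would sit between the two calls, for example with `gpg --detach-sign` on the tar file).

```python
import io
import lzma
import tarfile


def bundle(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed tar with stable ordering and metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp: same content, same bytes
            tar.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def compress(tar_bytes: bytes) -> bytes:
    """Compress the (already signed) tar into an .xz byte stream."""
    return lzma.compress(tar_bytes, format=lzma.FORMAT_XZ)


# Placeholder asset contents, for illustration only.
tar_bytes = bundle({"char.json": b"{}", "sprites/a.gif": b"GIF89a"})
package = compress(tar_bytes)
```

Signing the tar before compressing it means the signature stays valid no matter which compression format is chosen later.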

Now that we have established a strong, unique identifier to our data, we need to solve the data redundancy problem.

Children of ancestors use an incremental tar file, which minimizes the content stored in the child. Even deleted content can be tracked if incremental tars are applied correctly. We could also employ a scheme similar to the Docker Image Specification (version 1), but it’s clear that its authors did not look into GNU tar’s ability to do incremental archiving out of the box. (“NOTE: For this reason, it is not possible to create an image root filesystem which contains a file or directory with a name beginning with .wh.. ” It requires little thinking to realize that this is a tremendously absurd limitation, just to add the ability of identifying deleted objects. If anything, deleted files should have been put in a separate file, for the sake of not introducing an artificial limitation to the file system.)
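As a toy model of what a child archive needs to record, the sketch below computes the difference between a parent and a child as two parts: files that were added or changed, and a separate list of deleted paths. This is not GNU tar’s incremental format, only an illustration of the idea; the names `child_delta` and `apply_delta` are hypothetical.

```python
def child_delta(parent: dict[str, bytes], child: dict[str, bytes]):
    """Return (changed, deleted) describing how child differs from parent."""
    # A file belongs in the child archive if it is new or its bytes differ.
    changed = {
        path: data for path, data in child.items() if parent.get(path) != data
    }
    # Deletions are kept in their own list rather than as marker files.
    deleted = sorted(set(parent) - set(child))
    return changed, deleted


def apply_delta(parent: dict[str, bytes], changed: dict[str, bytes], deleted):
    """Rebuild the child's full contents from the parent plus the delta."""
    result = {path: data for path, data in parent.items() if path not in deleted}
    result.update(changed)
    return result
```

Keeping deletions in a separate list avoids reserving any file name prefix inside the asset itself.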

Incremental versioning has a major implication for authoring assets: authored assets are immutable. No, sir, the Animated Chatroom authoring tool will not allow you to modify an asset that has already been successfully packaged and signed, unless you choose one of the following options:

  • Create an asset that is a child of the asset you want to modify. This is not a favorable option if you have not published the parent asset. However, the authoring tool will set the hierarchy up for you.
  • Create a new asset, derived from the data of the parent asset. There is no hierarchy established; it’s just a hard copy of the parent asset. This is favorable only when the parent asset has not been published.

Acquiring modified versions of an asset would be simple under this system:

  • The desired version of the asset is downloaded and decompressed. (We don’t need to untar the entire asset yet.)
  • The signature is verified, and a warning is displayed if the signature is invalid.
  • The definition file is untarred and checked for an ancestor. If an ancestor exists, a recursive download request is made on the ancestor.
  • The asset contents are untarred on the target folder.
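The recursive ancestor lookup in these steps can be sketched as a small depth-first walk that returns the order in which assets should be unpacked, oldest ancestor first. This is only a sketch: `install_order` and the `fetch_definition` callback are hypothetical names, the `ancestors` field is an assumed part of the definition file, and the download and signature check are left out.

```python
def install_order(asset_id, fetch_definition):
    """Return asset IDs in unpacking order, ancestors before descendants.

    `fetch_definition` is a hypothetical callback that returns the parsed
    definition file (a dict) for a given asset ID.
    """
    order = []
    visiting = set()

    def visit(current):
        if current in order:
            return  # already scheduled through another path
        if current in visiting:
            raise ValueError(f"ancestor cycle at {current!r}")
        visiting.add(current)
        for ancestor in fetch_definition(current).get("ancestors", []):
            visit(ancestor)
        visiting.discard(current)
        order.append(current)

    visit(asset_id)
    return order


# A stand-in repository: each asset lists its direct ancestors.
repo = {
    "base": {"ancestors": []},
    "fix": {"ancestors": ["base"]},
    "extra": {"ancestors": ["fix"]},
}
chain = install_order("extra", repo.__getitem__)
```

Unpacking in this order lets each child’s contents overwrite the files it changed in its parent.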

Asset servers for Animated Chatroom web clients can establish this hierarchy – without the expected redundancy! – by using symbolic links to represent files that are identical to the parent.

Finally, we can track what assets we have downloaded and what asset repositories we are currently using, by storing local data in an SQLite database file.
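A minimal sketch of that local store, using Python’s built-in sqlite3 module, could look like the following. The table and column names are assumptions of mine, as is the example repository URL; a real client would likely track more (versions, signatures, download dates).

```python
import sqlite3

# Hypothetical schema: one table for repositories, one for installed assets.
SCHEMA = """
CREATE TABLE IF NOT EXISTS repositories (
    url TEXT PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS assets (
    asset_id TEXT PRIMARY KEY,
    parent_id TEXT REFERENCES assets(asset_id),
    repository_url TEXT REFERENCES repositories(url),
    installed_path TEXT NOT NULL
);
"""


def open_store(path=":memory:"):
    """Open (or create) the local tracking database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_asset(conn, asset_id, parent_id, repository_url, installed_path):
    """Remember that an asset was downloaded from a given repository."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO repositories (url) VALUES (?)",
            (repository_url,),
        )
        conn.execute(
            "INSERT OR REPLACE INTO assets VALUES (?, ?, ?, ?)",
            (asset_id, parent_id, repository_url, installed_path),
        )


def is_installed(conn, asset_id):
    """Check whether an asset is already present locally."""
    row = conn.execute(
        "SELECT 1 FROM assets WHERE asset_id = ?", (asset_id,)
    ).fetchone()
    return row is not None
```

With this in place, the client can skip the download of any ancestor that `is_installed` already reports as present.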

Instead of a name-based character list, servers use character IDs to disambiguate between different versions of the same character. A server can then offer download sources for specific characters, such as if a character was made “in-house,” so to speak.

What are the improvements of this design over the old design devised by FanatSors?

  • Authorship is immutable. This is important mostly for original content: repository owners will refuse to publish assets that fail to identify the original creator of the content. In the case of ripped content, it is desirable to preserve information about who ripped it, but ultimately, it is all copyrighted by the publisher of the game (Capcom, Chunsoft, etc.).
  • Downloading is automatic. Under the old system, players were too lazy to download zip files to play on new servers, and server owners were too lazy to compose zip files for players to download every time new content is suggested. Now, server admins need only do a graphical lookup of the assets they want to add on the server, confirm the additions, and the new content is immediately requested for download by clients, all seamlessly and in the background.
  • Name clashing is no more, since we established that internal names are no good as a unique identifier.
  • Asset content is deduplicated (to the best of the ability of this system).
  • Asset management is decentralized. I don’t own the database – in fact, nobody does. You can host part of the repertoire of Animated Chatroom content, but you can never host all of it.
    • On a similar note, this makes Animated Chatroom effectively immune to cease-and-desist notifications. I can take down the offending content on my servers, but due to technical restrictions, I cannot be responsible for the content hosted by other servers. The cease-and-desist notifications would have to be sent to each offending server.

This concludes an overview of the proposed design of asset management in Animated Chatroom. I hope you found this design enlightening for any future adventures in software development.
