All posts by oldbyte10

Noise

I woke up reading the wrong part of the Internet. Not wrong as in factually wrong – wrong as in something that does not help to stabilize my mind or bring me to a higher level of understanding of the world and that which is beyond our grasp.

The polar divide in ideology which I witness today is, from what I can see, a mostly American phenomenon spurred by intense political competition. While other countries have seen localized and sometimes violent conflict between two sides (often racially driven), never before has a war of information been seen at this scale and ferocity.

As someone who is acutely aware of all of this, it causes me great anxiety to think that every side and opinion has its fair share of criticism – as if everything in the world were wrong. Everything can be criticized, discredited, falsified, and undercut. And yet that is not a very helpful perspective, either – it just means that in the eyes of someone else, what we do is potentially “wrong.”

It’s difficult to step back and reevaluate why we believe what we believe, and why others believe what they believe, without demonizing an “opposing camp” or making a mortal enemy of it. It’s not possible to agree with everyone – yet it is a necessity to respect those with whom we disagree.

This is the paradox of individualism and unity. The economic theory of comparative advantage requires us to work together to compose a society that runs for the benefit of all of its members; yet, individualism implores each of us to scrutinize the world under our own lenses, and to form our own opinions that may oppose the opinions of others.

Despite all of these concerns clouding my mind, somehow I can feel happy and accomplished. Somehow, I take pride in my accomplishments and rejoice in the successes of others. Somehow, I can just be myself and not worry.

A minor note on the Code Lyoko upscaling project

I’m a big fan of Code Lyoko. I watched the show as a kid back when it aired on Cartoon Network, and it may well be the reason I grew a liking for computers and why I am studying for a computer science degree today.

First and foremost: We should be thankful that Mediatoon has decided to re-release Code Lyoko on YouTube.


Using JS for Qt logic

Sorry, but I’m going to have to take sides on this one: Electron sucks. At its core, it converts a program into an embedded web browser with somewhat looser restrictions on interfacing with the host system, only to make it possible for web developers to develop for the desktop without learning a new language. The benefits of Electron are heavily biased toward developers rather than end users, who should be the focus of the entire user experience.

The problem is that HTML5 makes for an abstract machine that is far too abstract, and the more layers of abstraction that sit between the software and the hardware, the more overhead there is from translation and just-in-time compilation. The result is a web browser that uses 120 MB of shared memory and 200 MB of resident memory per tab.

For a web browser, though, abstraction is good. It makes development easy without sacrificing user experience for what is essentially a meta-program. But for any other desktop application with a specific defined functionality, this abstraction is excessive.

This is generally why I like Qt: because it is one of the only libraries left that continues to support native desktop and embedded application development without sacrificing performance. The one problem with it, however, is that performance requires the use of C++, and most people do not know how to write C++. Moreover, it requires double the work if the program is also being targeted for the web.

There does exist QML, which removes most of the C++ and exposes a nice declarative syntax that combines both layout and logic into a single file. However, it has two significant problems: first, it adds even more cruft to the output program; second, custom functionality still requires interfacing with C++ code, which can get a little difficult.

Qt’s main Achilles’ heel for a long time has been targeting the web. There are various experimental solutions available, but none of them are stable or fast enough to do the job yet.

I’ve been coming up with an idea. Qt exposes its V4 JavaScript engine (a JIT engine) for use in traditional C++ desktop programs. What I could do is the following:

  • Write JS code that both the browser and desktop clients share in common, and then make calls to some abstract interface.
  • Implement the interface uniquely for each respective platform.
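A minimal sketch of that split, in plain JS. All names here (createApp, showMessage, and so on) are hypothetical stand-ins, not any real API:

```javascript
// Shared logic: depends only on an abstract "platform" interface,
// never on the browser or Qt directly.
function createApp(platform) {
  return {
    greet(name) {
      const msg = `Hello, ${name}!`;
      platform.showMessage(msg); // delegated to the platform implementation
      return msg;
    },
  };
}

// Browser implementation of the interface (e.g. would update a DOM node).
const browserPlatform = {
  showMessage(msg) {
    console.log(msg);
  },
};

// On the Qt side, an equivalent object would be handed to the shared code
// from C++ (e.g. a QObject exposed into the JS engine), implementing the
// same showMessage contract.

const app = createApp(browserPlatform);
app.greet("world"); // prints "Hello, world!"
```

The shared code stays platform-agnostic; only the small interface object is written twice.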

For instance, the wiring for most UI code can be written in C++, which then exposes properties and calls events in JS-land. Heck, Qt already does most of that work for us with meta-objects.

How do I maintain the strong contract of an interface? You need a little strong typing, don’t you? Of course, of course – we can always slap in TypeScript, which, naturally, compiles to standards-compliant JavaScript.

The one problem is supporting promises in the JS code that gets run, which mostly relies on the capabilities of the V4 engine. I think they support promises, but it does not seem well documented. Based on this post about invoking async C++ functions asynchronously, I think that I need to write callback-based functions on the C++ side and then promisify the functions when connecting between the JS interface and the C++ side. That shouldn’t be too hard.
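The promisification step could look something like this. The error-first callback shape is an assumption about how I would write the C++-exposed functions, and loadConfig is a made-up stand-in for one of them:

```javascript
// Wrap a callback-style function (as would be exposed from C++) into one
// that returns a Promise. Assumes the wrapped function takes an error-first
// callback as its last argument.
function promisify(fn) {
  return (...args) =>
    new Promise((resolve, reject) => {
      fn(...args, (err, value) => {
        if (err) reject(err);
        else resolve(value);
      });
    });
}

// Stand-in for a callback-based function exposed from the C++ side.
function loadConfig(path, callback) {
  setTimeout(() => callback(null, { path, ok: true }), 0);
}

const loadConfigAsync = promisify(loadConfig);
loadConfigAsync("app.ini").then((cfg) => console.log(cfg.ok)); // prints true
```

The JS interface layer would apply this wrapper once when connecting to the C++ side, so the shared application code only ever sees promises.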

Note that important new features for QJSEngine, such as ES6 support, were only added in Qt 5.12. This might complicate distribution for Linux (since Qt continues to lag behind in Debian and Ubuntu), but we’ll get there when we get there – it is like worrying about tripping on a rock at the summit of a mountain while we are still at home base.

Indexing the past and present

With the shutdown of GeoCities Japan, we are reaching a critical point in the history of the Internet: historical information is vanishing, replaced by new information hidden away as small snippets in social media systems.

It is becoming increasingly apparent that a vast trove of information is simply missing from Google Search. Because Google aggressively favors well-ranked sites, user-made sites with obscure but useful information are poorly indexed, and their lack of maintenance eventually leads to their permanent loss.

For instance, I was only able to find MIDI versions of Pokemon Ruby and Sapphire music from a site hosted by Comcast. After the shutdown of Comcast personal sites, the information was lost to indexing forever and hidden away in the Internet Archive.

What I propose is the indexing and ranking of content in the Internet Archive and social media networks to make a powerful search engine capable of searching past, present, and real-time data.

A large fault of the Google Search product over the years has been its dumbing down of information during the aggregation process of the Knowledge Engine that inhibits the usefulness of complex queries. If a query is too complex (i.e. contains keywords that are too far apart from each other), Google Search will attempt to ignore some keywords to fit the data that it has indexed, which only fits into particular categories or keywords. If the whole complex query is forced, though, Google Search will be unable to come up with results because it does not index or rank webpages in a way that is optimized for complex queries – not because the information does not exist.

The corpus of information is also diversifying: there is more information in e-books, chat logs, and Facebook conversations than can be found simply by crawling the hypertext. But the Google search engine has not matched this diversification, opting simply to develop the Knowledge Graph to become a primary and secondary source of information.

I think this would be a great direction a search engine such as DuckDuckGo could take to compete more directly with Google Search in a dimension other than privacy. After all, Google Search is no longer Google’s main product.

Reverse engineering Wargroove

So far, I think I have spent a total of 8 hours working on a reverse engineering project to mod Wargroove. I thought it would be trivial, but it has been a little more convoluted than I expected due to my inexperience in reverse engineering. The end goal is to make an Advance Wars total conversion mod.

Wargroove is based on the Halley engine. It is a well-designed C++14 game engine – arguably one of the best-designed game engines I have seen to date (and that’s saying something). The CMake build process is straightforward, and most libraries can be statically linked without hassle. The engine also encourages scripting using Lua, which means that the moddable surface of the game might be much greater than I originally anticipated (I had expected only image resources).

I was able to hack up Halley’s asset pack inspector to try to extract Wargroove’s assets, and lo and behold, it listed out all of the files in each pack. But there was one problem: the files it extracted were all gibberish. Naturally, all of the asset packs are encrypted using AES. That means I have to find the IV and a decryption key, which is just a string.

The IV is easy to find: it’s in the header of the asset pack. However, the key is not so easy to find: you have to take a deep dive into the executable to find it.

I’ve done reverse engineering before as part of a university course, and of course a little bit in my spare time using Radare. I thought this would be relatively easy: find the code that corresponds to the decrypt function, then find cross-references to it, tracing upward until I can find a reference to some kind of string. But while I have come close, I have not been able to find the string through static analysis.

All right, so let’s try dynamic analysis. The problem with dynamic analysis, however, is that it’s difficult to hook into the executable right at game startup due to its dependency on Steam, so I can’t put a breakpoint on the decrypt function. Instead, I intentionally caused an error at load time to trigger an error message, which gives me a chance to hook a debugger and inspect the memory.

My first attempt to cause an error was by modifying what I thought was a decryption key and then watching the game try to decrypt without success. I’m basically looking for a string that is at least 16 characters long, and I found some good candidates that were close to some game initialization-related strings. However, none of the keys that I modified caused the game to crash.

My second, fail-safe attempt was to simply rename one of the .dat files, which was sure to make the game fail to run.

Yet still no dice. None of the references in the stack pointed to anything that looked like a string, except for the error message in the dialog box. It was almost as if the error message that was produced overwrote the decryption key that I needed, which doesn’t seem like it’s supposed to happen.

After a while, I considered inspecting the network to see if the decryption key was being received from a server. But the game makes zero network communication except in the multiplayer and user-generated content (UGC) modes, and besides, getting a decryption key from a server would imply always-online DRM.

While IDA has given many clues, it does not provide any useful cross-references for the supposed decryption keys that I found. I don’t believe there is anything devious going on here – the developers would not take so much time to obfuscate a single string – so I’m going to give those strings a hard pass and keep looking.

I think I will focus my efforts on finding a way to hook into the executable right at start time, so that I can set a breakpoint right at the decrypt function, which will give me the most accurate stack trace. I think this is possible if I hook into the Steam client and then watch for child process creation.

None of this is easy, though, when you are using Steam on Wine and launching Wargroove with an old version of Proton, in OpenGL mode, with the --no-intro flag, and then simultaneously launching x64dbg from a terminal with the same Wine prefix as Wargroove, with WINEESYNC set to 1. Yikes! I should probably do the debugging on my Windows machine.

At times, I feel like hitting my head against the wall, but I have confidence that I will find the key eventually so that I can move forward.


Unfortunately, after another six hours sunk into the project, I haven’t been able to make much headway. The Steam library is interfering with the debug hook on my Windows machine. Using the Image File Execution Options in the Windows registry, I can immediately hook a debugger on process startup; however, the interruption from hooking x96dbg causes the Steam API to fail initialization with a cryptic error code, and therefore halts the entire game from starting up.

Thinking that maybe the interference was solely an x96dbg issue, I tried Cheat Engine instead. I was originally hesitant about using Cheat Engine due to my unfamiliarity with it, but it does not seem to support just-in-time debugging, since elevation is required. Moreover, even when it automatically detects the process and I pause it manually on startup, the game crashes as soon as any debugger hooks in.

Heck, when the engine fails to load an asset pack, it even prints a stack trace in the message box! The problem is that the stack trace string seems to overwrite the decryption key that I need.

I think what I need to do is patch a breakpoint into the executable by way of an INT 3.
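As an illustration of what that patch amounts to: INT 3 is the single opcode byte 0xCC, written over the first byte of the target function. The bytes and offset below are made up; in practice the file offset of the decrypt function would come from the disassembler, and the displaced byte has to be saved so execution can be resumed correctly later.

```javascript
// Patch an INT 3 (0xCC) software breakpoint into a copy of an
// executable image at a given file offset.
function patchInt3(image, fileOffset) {
  const patched = Buffer.from(image);   // work on a copy of the image
  const original = patched[fileOffset]; // save the displaced byte for later
  patched[fileOffset] = 0xcc;           // INT 3 breakpoint opcode
  return { patched, original };
}

// Fake function-prologue bytes standing in for the decrypt function.
const image = Buffer.from([0x55, 0x48, 0x89, 0xe5]);
const { patched, original } = patchInt3(image, 0);
// patched[0] is now 0xCC; original holds the displaced 0x55
```

When the patched executable hits that byte, it raises a breakpoint exception, at which point a debugger can attach and the saved byte can be restored.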

Okay, well, that still didn’t work. I’ll have to try again later.


5/7: Darn it! Someone beat me to it. They created a closed-source tool in Java called ModPacker. Let me find out what the decryption key ultimately came out to be:

+Ohzep4z06NuKguNbFRz3w==

The tool uses only the first 16 characters of this Base64-encoded string.
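In other words, the key would be derived something like this. That the 16 characters are used as raw bytes (rather than Base64-decoded first) is my reading of the tool, not something I have confirmed; 16 bytes does match the AES-128 key size:

```javascript
// Take the first 16 characters of the published string as a raw
// 16-byte (AES-128-sized) key. This interpretation is an assumption.
const keyString = "+Ohzep4z06NuKguNbFRz3w==";
const key = Buffer.from(keyString.slice(0, 16), "latin1");
console.log(key.length); // prints 16
```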

I am not even sure what process was taken to find this string. It humiliates me that I was not able to find this string, but for someone else, the process seemed effortless.

I do not feel worthy of having this decryption key, and the only gripe I have against this ModPacker tool is that it is closed-source and reinvents the wheel by using its own implementation of the Halley packer and unpacker tools instead of the official ones.

Coping with loneliness

This break, I’ve been coping with loneliness, and things have actually gone better than I expected. Fortunately, the break does not need to drag on any further – I return to college this week.

However, yesterday I was feeling particularly lonely, so I did an exercise that I like doing when it seems like there is nobody to talk to: I list everyone that I know (who is of my age), and I categorize them by connection.

2018: a retrospective

A week ago, I wrote a retrospective in my private writings, and yesterday, after rereading it, I found it profound enough to publish on this blog. Since the retrospective includes personal details, I had to omit them; but because the result ended up looking like a Mad Lib, I decided to reword the retrospective to skirt around such details. The reworded narrative is a bit nebulous and abstract, and I even considered giving up on publishing it entirely, but perhaps there is some value in it in the end.


On the versioning of large files

I’m such an idealist. It’s become clear that I can’t settle for anything other than the most perfect, elegant, long-term solution for software.

That reality has become painfully apparent as I try to find a good way to track the assets of Attorney Online in a consistent, efficient manner. I just want to track their history for myself, not to put the content itself online or anything like that (since putting it online might place me further at risk of a copyright strike).

Well, that eliminates Git LFS as a choice. Git LFS eliminates the ability to make truly local repositories, as it requires the LFS server to be in a remote location (over SSH or HTTPS). It’s not like Git LFS is even very robust, anyway – the only reason people use it is because GitHub (now the second center of the Internet, apparently) developed it and marketed it so hard that they were basically able to buy (or should I say, sell?) their way to get it into the Git source tree.

Git-annex, on the other hand, seems promising, if I could figure out how to delete the remote that I accidentally botched. There’s not a whole lot of documentation on it, save the manpages and the forums, most posts of which are entirely unanswered. What’s more, GitLab dropped support of git-annex a year ago, citing lack of use. Oh well, it lets me do what I wanted to do: store the large files wherever I want.

I could also sidestep these issues by using Mercurial. But that would be almost as bad as using bare Git – the only difference would be that Mercurial tries to diff binary files, and I’d still probably have to download the entire repository all in one go.

I was also investigating some experimental designs, such as IPFS. IPFS is interesting because it’s very much a viable successor to BitTorrent, and it’s conservative enough to use a DHT instead of the Ethereum blockchain. The blockchain is seen as some kind of holy grail for computer science, but it’s still bound under the CAP theorem. It just so happens to sidestep the issues stipulated by the CAP theorem in convenient ways. Now, don’t get me wrong, my personal reservation for Ethereum is that I didn’t invest in it last year (before I went to Japan, I told myself, “Invest in Ethereum!!”, and guess what, I didn’t), and it seems that its advocates are people who did invest in it and consequently became filthy rich from it, so they come off as a little pretentious to me. But that’s enough ranting.

IPFS supports versioning, but there is no native interface for it. I think it would be a very interesting research subject to investigate a massively distributed versioning file system. Imagine a Git that supports large files, but there’s one remote – IPFS – and all the objects are scattered throughout the network…

Well, in the meantime, I’ll try out git-annex, well aware that it is not a perfect solution.

Paranoia

This morning, I received a “boil water” notice from the university. I immediately searched the news to investigate the exact reason – is the water contaminated, and what is it contaminated with?

However, all that I could find were two vague reports from city officials about how the treatment plants were overloaded with silt due to flooding, and that Lake Travis was only four feet away from spilling over the dam. Pressed to maintain water pressure adequate for fire hoses to remain usable, the city decided to “reduce” the treatment of the water to allow enough water to be supplied, such that it is no longer at the “high standards” that the city provides for potable water.

But water treatment systems are not a black box; they are a multi-stage process! Which stage of treatment was hastened, or are stages being bypassed entirely? Surely the filtration of particulate matter is being reduced, but the chlorine process should still be keeping the water sterile. However, none of these questions can be answered due to the vagueness of the report.

Affected treatment plants? Undisclosed. Particulate matter and bacteria reports? Nonexistent, assuming the Austin website actually works right now, which it does not.

Here is the main contradiction in their statement:

WHY IS THE BOIL WATER NOTICE IMPORTANT
Inadequately treated water may contain harmful bacteria, viruses, and parasites which can cause symptoms such as diarrhea, cramps, nausea, headaches, or other symptoms.

But earlier in their statement, they stated the following:

It’s important to note that there have been no positive tests for bacterial infiltration of the system at this time.

So what bacteria am I going to kill from boiling water?

All that I can conclude is that the city of Austin is spreading fear, uncertainty, and doubt about the water quality simply to reduce stress on the system, without presenting hard evidence that the water is indeed unsafe to drink. Boiling water will not eliminate particulate matter, and in the aforementioned press release, “city officials” (whoever those are) have explicitly stated that bacteria have not yet contaminated the treatment plants, so there are no bacteria to kill by boiling water.

One benefit to treatment plant operators from this warning, however, is that they now have free rein over which stages they wish to reduce or bypass, including the disinfection stage. However, due to the lack of transparency, there is no information to ascertain which stages are being bypassed – the water could really be of any quality right now, and it could even still be perfectly fine.

My questioning of this warning stems from a fundamental distrust in government decisions and communication to its citizens. People simply echo the same message, without seeming to place much thought into it: “Boil water. Boil water. Boil water.” And on the other hand, city officials might state that the treated water is completely safe to drink, despite findings of statistically significant lead concentration in some schools!

I’ll comply out of an abundance of caution (and because noncompliance has social implications), but mindless compliance and echoing of vague mass messages should not be the goal of the government. Individuals should be able to obtain enough information to make an informed decision and understand the rationale of the government in its own decisions.


It is now the next day since the announcement of the restrictions, and the technical details surrounding the problem remain vague. It seems that the restriction has indeed granted free license for treatment plant operators to modify treatment controls as they see fit, without necessarily needing to meet criteria for potable water. Moreover, it appears that the utility has known about this problem for quite some time now, and only now have they decided to take drastic action to prevent a water shortage.

I would not trust this water until the utility produces details of the actions being taken in these treatment plants to clean up this mess.