CircleCI for Mobile Testing

Earlier this week, we announced CircleCI for OS X. This new offering allows iOS developers using CircleCI to build and test their code for iPhone, iPad, Mac, and Apple Watch apps, significantly improving the iOS development cycle.

The demands of mobile app development can be more complex than those of web development, and games generate significantly more graphical content than traditional mobile applications. Tests for traditional applications can be tied to known content being present on the screen or, more specifically, to certain elements with preconfigured IDs being visible after a set of application interactions. With mobile games, though, writing such tests becomes much more difficult.

Testing graphical representations is hard

Most testing mechanisms designed to interact with the graphical representation rely on either searching the view tree or analyzing screenshots. This works well if there is a single way to represent the state of the application under test, and if that state can be rolled back after each test.

With mobile games, a valid state might actually be a set of states, as the mechanics of the game might include multiple views of the same scene, multiple ways to get to the same result, and varying interaction from the outside world (AI). Even if it is possible to get to the desired state just by interacting with the control elements, it might take a significant amount of time to get there on a mobile simulator or emulator.

Do you have a test for that tube on the ground at the top of the screen? Screenshot from Space Marshals.

An additional layer of abstraction will save us

Because the representation layer in a game is much more versatile and flexible, and therefore less testable, than the graphics of a traditional application, it can be hard to test all of the graphics directly with a traditional XCTest approach.

However, things get considerably simpler if an intermediate layer is introduced: the concept of a game state. Generating, modifying, and testing the game state separately from the rendering mechanism provides the necessary flexibility. You can test the actual logic of the game with standard testing utilities, and then keep a separate test suite just for the transformation of game states into actual images on the screen by the rendering classes.
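To make the idea of a game state more concrete, here is a minimal sketch in Swift. Everything in it is hypothetical and invented for illustration (the `Position`, `Move`, `GameState`, and `Renderer` names, and the rule that bumping into a wall costs one health point); it is not taken from any real game. The point is only that the logic lives in plain value types that can be checked without a screen.

```swift
// Hypothetical game state: plain values with no rendering dependencies.
struct Position: Hashable {
    var x: Int
    var y: Int
}

enum Move {
    case up, down, left, right
}

struct GameState: Equatable {
    var player: Position
    var walls: Set<Position>
    var health: Int

    /// Pure game logic: returns the next state without touching any rendering code.
    func applying(_ move: Move) -> GameState {
        var target = player
        switch move {
        case .up: target.y -= 1
        case .down: target.y += 1
        case .left: target.x -= 1
        case .right: target.x += 1
        }

        var next = self
        if walls.contains(target) {
            // Assumed rule: walking into a wall costs one health point
            // and leaves the player where they were.
            next.health = max(0, health - 1)
        } else {
            next.player = target
        }
        return next
    }
}

/// A renderer only consumes a state; it never changes the game logic.
protocol Renderer {
    func draw(_ state: GameState)
}

// Example states that a logic test could check without rendering anything.
let start = GameState(player: Position(x: 0, y: 0),
                      walls: [Position(x: 1, y: 0)],
                      health: 3)
let moved = start.applying(.down)
let bumped = start.applying(.right)
```

In an XCTest target, checks such as "the player did not move through the wall" would become ordinary `XCTAssertEqual` calls on `moved` and `bumped`, and they run the same way on a simulator, a device, or a CI container.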

If the rendering logic is only available on real devices

Testing the actual rendering functionality might not always be possible if the game is running on a simulator or emulator. For example, at the time of writing, code written for Apple’s MetalKit can only be run on real devices. This means that you will have to skip the rendering steps when running tests on CI without real devices.

The good news is that this is easy to achieve once the game state is abstracted away from the rendering logic. With the logic separated out, you can stub out the rendering classes when the build targets a simulator or emulator and still run the full unit test suite on CI. This blog post suggests a neat way to do so for MetalKit.
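One possible shape for such a stub, sketched under a few assumptions: the `FrameRenderer` protocol and the `RecordingRenderer`, `DeviceRenderer`, `GameLoop`, and `makeDefaultRenderer` names are made up for this example, and `DeviceRenderer` is an empty placeholder rather than real MetalKit code. The only platform feature used is Swift's `#if targetEnvironment(simulator)` compile-time condition.

```swift
/// Hypothetical seam between the game loop and whatever draws the frames.
protocol FrameRenderer {
    func render(frame: Int)
}

/// Stand-in for environments where the GPU-backed renderer cannot run.
/// It only records which frames it was asked to draw.
final class RecordingRenderer: FrameRenderer {
    private(set) var renderedFrames: [Int] = []

    func render(frame: Int) {
        renderedFrames.append(frame)
    }
}

/// Placeholder for the real renderer. In an actual project the
/// device-only drawing code would live here.
final class DeviceRenderer: FrameRenderer {
    func render(frame: Int) {
        // Device-only drawing code is intentionally omitted in this sketch.
    }
}

/// Chooses the renderer at compile time, depending on the build target.
func makeDefaultRenderer() -> FrameRenderer {
    #if targetEnvironment(simulator)
    return RecordingRenderer()
    #else
    return DeviceRenderer()
    #endif
}

/// The game loop depends only on the protocol, so tests can inject a stub.
struct GameLoop {
    let renderer: FrameRenderer

    func run(frames: Int) {
        for frame in 0..<frames {
            renderer.render(frame: frame)
        }
    }
}

// In a unit test the stub can also be injected directly.
let stub = RecordingRenderer()
GameLoop(renderer: stub).run(frames: 3)
```

Because `GameLoop` never names a concrete renderer, the same test suite can run on CI with the stub and on a device with the real implementation.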

End-to-end tests

If real devices are available, where the rendering logic does work, you might be able to run full end-to-end tests for it. After ensuring the correctness of the layer that interacts with the game state, you can proceed to rendering a few different game states and capturing screenshots of the results.

If the rendering layer is actively being worked on, the comparison of each screenshot against the reference image will have to be done by a human. But if most of the work is going into game logic, performance optimization, or a new storyline, checking that the output of the rendering functions did not change is a sufficient measure.
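The "did the output change" check can be automated in a fairly simple way. The sketch below assumes that screenshots have already been decoded into raw RGBA bytes; the `Screenshot` type, the `mismatchRatio` function, and the default tolerance of 2 are all hypothetical choices for illustration, not part of XCTest or any other framework.

```swift
/// Hypothetical decoded screenshot: width * height * 4 bytes of RGBA data.
struct Screenshot {
    let width: Int
    let height: Int
    let pixels: [UInt8]
}

/// Returns the fraction of channel values that differ from the reference
/// by more than `channelTolerance`, or nil if the images are not comparable.
func mismatchRatio(_ reference: Screenshot,
                   _ candidate: Screenshot,
                   channelTolerance: Int = 2) -> Double? {
    guard reference.width == candidate.width,
          reference.height == candidate.height,
          reference.pixels.count == candidate.pixels.count,
          !reference.pixels.isEmpty else {
        return nil
    }

    var mismatched = 0
    for (expected, actual) in zip(reference.pixels, candidate.pixels) {
        // Small differences are tolerated so that harmless rounding
        // noise does not fail the comparison.
        if abs(Int(expected) - Int(actual)) > channelTolerance {
            mismatched += 1
        }
    }
    return Double(mismatched) / Double(reference.pixels.count)
}

// Two-pixel examples: a red pixel followed by a blue pixel.
let reference = Screenshot(width: 2, height: 1,
                           pixels: [255, 0, 0, 255, 0, 0, 255, 255])
let nearlySame = Screenshot(width: 2, height: 1,
                            pixels: [254, 1, 0, 255, 0, 0, 255, 255])
let broken = Screenshot(width: 2, height: 1,
                        pixels: [0, 0, 0, 255, 0, 0, 255, 255])
```

A test could then fail the build whenever the ratio for a freshly captured screenshot exceeds some agreed threshold, and leave the borderline cases for a human to look at.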

Non-game parts of the game

Most games will have some non-game screens, such as settings screens, menus, or scene selection. Those can be tested using the standard XCTest-style approach, as there is a single correct way to represent a stable set of information.

If the game involves any kind of backend processing, that can be tested separately using the traditional testing techniques for that stack.

Perfect continuous integration for games will rarely be possible

As there are no complete automated tests for the rendering layer, you will have to test that yourself and keep testing it throughout the development phase.

One of my colleagues once visited a game development studio where an Xbox was installed in the office. Most of the time that Xbox was not being used, which makes sense. Whenever nobody was using it, the game in development at the time would be loaded onto the device automatically, invincible mode would be turned on for the main character, and the game would then perform a full walkthrough of itself. Developers hanging out near the Xbox would notice rendering issues just by passing the screen: parts of the game world not rendering, colors being off, or certain aspects of the physics seeming unrealistic are all easy to spot.

It will be hard to ever replace the human-driven QA process, but it can be moved as high up the stack as possible to save the developers’ time and minimize the risk of human error.