Chrome and Punishment: Browser Tests and You

Browser testing is a popular strategy for web application developers to verify their programs from the user’s perspective.

However, automating this process has always been challenging for teams who depend on Continuous Integration (CI). The environment isn’t always suited for testing a JavaScript-heavy web application due to demands on memory and network throughput.

With the release of Headless Chrome, there is hope.

Let’s explore why this is a problem and how Chrome helps to solve it. In this post, we’ll look at browser testing for web applications and how we got where we are today. In particular, we’ll look at common problems experienced by Rails developers and how they can take advantage of this new technology to improve their build performance and reliability.

Before we get into that, let’s make sure we have a good understanding of what browser testing is and why you need it.

User-Driven Development

Trying out your application by hand is often the first step to verifying that your program will work as expected. When building an ecommerce web application, as we all did in the early 00’s, this would mean adding a new item to the store (likely via a content-management system interface), and refreshing the page to see it listed on your inventory page.

When setting up your test suite, unit tests are often written first to test the smallest individual parts of our programs. We programmers love to stress writing small testable code. In contrast, system tests and acceptance tests are designed to ensure that when our users actually use the program, it works as expected. Arguably, this is more valuable to your customers than whether or not your functions are pure: they’re more likely to care that the checkout page actually works when they want to buy something from you.

Who doesn’t love firing up Firefox and clicking around to make sure you didn’t break anything before “going live”? I think back fondly on the days of building my first web site, a Star Wars encyclopedia of sorts. Adding a new piece of HTML and refreshing the browser to read tales of Wedge Antilles would blow my mind.

Many Hands Make Light Work

Manually clicking around your web page every time you change something may have been fascinating at first, but it certainly doesn’t scale. Before there was automated browser testing, QA scaled in terms of humans. This work was often performed by large QA teams in the enterprise, whose sole purpose was to use the program and try to break it in new and interesting ways. In fact, there are still enterprise companies that employ this method today.

Senior QA Engineer circa 2004

Around the time of the “Web 2.0” era, programs that required a browser (aka web applications) were booming and companies needed a more efficient way of testing their apps. Then came the “Scriptable Browser”, made popular by Selenium as a way to automate the operation of a web browser.

The “scriptable” part meant that you could describe, in a declarative programming language, exactly the same actions that you would usually perform manually with a keyboard and mouse.

This brought an entirely new generation of software testers. People were able to record a session, for example logging in and purchasing an item on the home page. They could then easily rerun these recordings automatically, ultimately saving developer time and allowing teams to work more efficiently.

Out of Sight, Out of Mind

There’s no doubt that many of these jobs were replaced as continuous integration systems gained in popularity. The advantage CI has over your army of “happy clickers” is that your “checkout” test can be executed automatically on a server, which is on average much cheaper than a full-time employee.

As demand for CI increased, the technology shifted to support running browser tests on a machine that likely wasn’t plugged into a display.

For sysadmins, the X virtual framebuffer display server, or Xvfb for short, was the answer. It implements the X11 display protocol used by most Linux variants, which allowed you to run any program that normally required a GUI, such as a web browser, on a machine where only a text interface was available.

With Great Power…

Unit tests are great for testing specific parts of your code, where most of the work is mocked or simulated to increase throughput. Integration tests, which hit the entire system, demand more of the hardware. We typically feel this pain with front-end heavy web applications.

As our applications continue to get more and more complex, they put increased stress on the system’s network and memory. Since the difficulty of browser testing is largely attributed to resource scarcity, it’s easy to place blame on JavaScript, but that doesn’t paint a complete picture.

As the demand for browser testing has increased, we’ve seen the technology keep pace. Suddenly the original browser automator, Selenium, was not alone. Chrome and Firefox both picked up compatible interfaces for automated operation, which are currently being standardized by the W3C as a specification for remotely controlling web browsers.

The list also includes tools like PhantomJS, which adopted the engines behind Safari, namely WebKit and JavaScriptCore. This allowed developers to simulate a “headless” browser without having to load all of Safari, keeping its footprint much smaller.

However, PhantomJS’s frequent crashes and history of memory leaks have, over time, impacted productivity and consequently developer happiness.

When running in a CI environment, memory issues and network timeouts can have a major impact on the determinism of your build. This is often referred to as flakiness, and it wastes developer time by leaving them to “retry” the entire build until it passes and they can finally merge their work and move on to the next task.

At the time of this writing there are over 216 issues mentioning “memory” on the PhantomJS official bug tracker, and over 200 mentions of “phantomjs” between our public forums and support center where people often report build failures.

That is not to say that PhantomJS is bad software. No, it turns out this is actually a really hard problem, and near-impossible to achieve for a community-maintained project.

A New Hope

Now you can see why we were so excited when Google announced that Chrome version 59 would be released with a --headless flag to run the browser from the command-line.
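To make the flag concrete, here is a minimal Ruby sketch that assembles a headless Chrome command line and runs it. The binary name `google-chrome`, the `CHROME_BINARY` environment variable, and the helper names `headless_chrome_command` and `dump_dom` are assumptions made for illustration; the actual binary name depends on your platform and how Chrome was installed.

```ruby
require "open3"

# Assumed default; the binary may be called something else on your system
# (for example "chromium-browser" or "google-chrome-stable").
CHROME_BINARY = ENV.fetch("CHROME_BINARY", "google-chrome")

# Builds the argument list for a headless run that prints the page's DOM.
# --disable-gpu was recommended alongside --headless in early releases.
def headless_chrome_command(url, binary: CHROME_BINARY)
  [binary, "--headless", "--disable-gpu", "--dump-dom", url]
end

# Runs Chrome without a display and returns the serialized DOM as a string.
# Raises if Chrome exits with a non-zero status.
def dump_dom(url, binary: CHROME_BINARY)
  stdout, stderr, status = Open3.capture3(*headless_chrome_command(url, binary: binary))
  raise "Chrome failed: #{stderr}" unless status.success?
  stdout
end

# Example usage (requires Chrome 59 or later to be installed):
#   puts dump_dom("https://example.com")
```

Passing the command as an array rather than a single string keeps the URL from being interpreted by a shell, which matters once the URLs come from test fixtures or user input.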

There are a few advantages Chrome has over Safari in the realm of browser testing.

First, Google, along with contributions from other distinguished web companies, forked the layout engine from WebKit. They have continued to make improvements, leading to adoption in frameworks such as Electron for building cross-platform desktop applications using JavaScript, HTML, and CSS.

This means if you’re building apps for the desktop using Electron, testing has just gotten much simpler.

Secondly, it turns out that Chrome is used a lot, somewhere in the ballpark of 50 to 60% of the entire browser market share, according to StatCounter. Safari is at 15%. Even with some deviation, that is still a large difference between the top two browsers in usage and a factor when testing your application.

It stands to reason that you should test on the browser that your users use the most! This supports the argument for system tests in the first place: testing your program through the client interface can help identify bugs before they’re reported.

Lastly, the Chrome team puts developers first in some meaningful ways. Even though both browsers ship with their own “tools” for debugging and profiling your web app, Safari doesn’t enable them by default. While Apple may be more focused on making macOS and iOS integration its main priority, the Chrome team is focused on shipping the fastest and lightest browser designed to run everywhere.

You can even install Chrome on Linux, where your tests are also running. Last I checked, you can’t install Safari on Linux without Wine.

Integrating Headless Chrome into your… Integration Tests

Hopefully I’ve done my job at convincing you to try Chrome in headless mode, but don’t just take my word for it. Because no matter how shiny a technology is, unless the community backs it, shifting over can seem dubious.

Since the release of Chrome 59, support for the headless version has made its way into many frameworks and ecosystems, including Puppeteer, a Node.js package developed by the Google Chrome team themselves. We also can’t forget the juggernaut Ruby on Rails, which replaced PhantomJS as its default for system tests, designed for end-to-end testing backed by Capybara, an acceptance testing framework.

With the coming Rails 5.2, now in RC, enabling headless Chrome is just 3 lines of code:

class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
  # Drive system tests through Selenium, using Chrome in headless mode
  driven_by :selenium, using: :headless_chrome
end

There is even a patch to add support in Teaspoon, another popular JavaScript test runner for Rails applications.

As adoption for headless Chrome grows, we’re curious if you’ve found a case where it was recently added to your favorite ecosystem. Let us know on our community forums and tell us how it works for you!