Appium Tests in CI

The Countingup app is how our customers access their business current account and accounting as part of our mission to help small business owners. Using the app is how customers sign up for an account and manage their transactions, invoices, bills, etc.

That means that it's critical for our engineering team to be able to ship changes whilst having the confidence that their improvements haven't caused bugs or regressions.

In the bad old days™, this would likely have required a team to execute manual regression tests before a release to ensure nothing had broken and no behaviour had changed. Discovering issues could lead to weeks of bug fixes and further regression tests, delaying releases.

In other blog posts, we outlined how Countingup has a strong culture around testing. Our confidence when deploying our code comes from comprehensive testing at all levels - unit tests, component tests, integration tests, and end-to-end tests.

For our mobile app, we implement the end-to-end tests using Jest and Appium. I'll dig more into those tests and how we run them in our CI pipelines below.

Appium 📱

Appium is an automation framework that we can use to exercise our applications in end-to-end tests. It's built on top of the WebDriver protocol and can be used to test across multiple platforms (although we only run the suite on iOS at the moment). We write tests that build coverage of vital workflows in the app, which gives us the confidence to ship changes frequently without impacting our customers.

A quick primer

For us to successfully run tests built using Appium, we require a few things:

a local version of the Countingup backend running in Docker that the mobile app can connect to
a running iOS simulator
a running Appium server
a Node.js implementation of WebDriver - WebdriverIO
a session profile that tells Appium how to interact with our app and the capabilities required
some tests to run that we write using Jest

Once we've prepared all of the above, we can write tests that verify the app's behaviour. The tests can set up any fixtures that a test case might need using the fake backend and check that everything is working as we expect.
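To make the "session profile" item a little more concrete, here is a minimal sketch of what one might look like for an iOS simulator. The capability keys are standard Appium ones for the XCUITest driver, but the function name, device name, platform version and bundle identifier are all assumptions made for illustration rather than our real configuration.

```typescript
// Hypothetical inputs for a simulator session; none of these values come from the real project.
interface IosSessionOptions {
  deviceName: string;
  platformVersion: string;
  bundleId: string;
}

// Builds the capabilities object that tells Appium which platform, driver and app to use.
function buildIosCapabilities(options: IosSessionOptions) {
  return {
    platformName: 'iOS',
    'appium:automationName': 'XCUITest',
    'appium:deviceName': options.deviceName,
    'appium:platformVersion': options.platformVersion,
    'appium:bundleId': options.bundleId,
    // Give slow end-to-end steps some headroom before Appium ends an idle session.
    'appium:newCommandTimeout': 240,
  };
}

const capabilities = buildIosCapabilities({
  deviceName: 'iPhone 14', // assumed simulator name
  platformVersion: '16.4', // assumed iOS version
  bundleId: 'com.example.app', // placeholder bundle identifier
});
```

An object like this would typically be passed to WebdriverIO when opening a session, for example `remote({ hostname: 'localhost', port: 4723, capabilities })`, where 4723 is Appium's default port.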

What are we testing

Running the end-to-end tests incurs the costs that you might expect as you implement tests that are further up the testing pyramid. The tests are slower to run, harder to write and require a more complicated setup than a unit or component test. However, they are vital in exercising critical parts of the system, so we want to use them and make sure that the test cases we write are valuable.

Generally, we aim to have our end-to-end tests check happy paths. For example:

Check that the signup flow works
Check that we can issue an invoice
Check that 3D secure functionality for cards is working

For test cases that check validation or negative handling, we prefer component testing with React Native Testing Library or unit testing where appropriate. As these tests are faster to run, it is better to do more testing of edge cases and non-happy paths here if possible.

Running in CI 🤖

When running on an engineer's laptop, it's easy to set up the environment needed to run the Appium tests - you spin up all the requisite parts using a few commands:

 make dev-up
 yarn appium
 yarn start-appium
 yarn e2e

That will:

spin up a dev environment using Docker
start the Appium server
start the app using Expo and tell it to talk to localhost
run the end-to-end tests

You might notice the use of make above instead of yarn for the first command - we have a common toolset called build-toolkit that we use across all of our repositories. It allows us to configure what Countingup services we need to run for each project. As we need to do this in Go and TypeScript projects, we decided to use make as the basis of the toolkit so it would work for everything we code.

In this case, make dev-up starts, in Docker, all of the services our mobile app needs so that we can run our end-to-end tests successfully! In CI, it's a different matter...

We use Semaphore for our CI pipelines. To run our Appium end-to-end tests, we can use their macOS virtual machines (VMs) to run the simulator, Expo, and the Appium server.

However, we need our backend services to be running in Docker. The Semaphore macOS VMs do not come with Docker installed, which means we have a problem.

How to solve a problem like Docker 🐳

Finding a potential solution was pretty easy - our backend services all have end-to-end tests that run in CI. They use the Linux VMs and those come with Docker installed! That led us to the idea of our Appium end-to-end tests running on two VMs - a macOS VM and a Linux VM.

We could run our fake services on a Linux VM and run all the other required tools on the macOS VM. However, there was still a problem - the two VMs couldn't communicate directly.

We needed to find a way to allow the two to communicate so that our end-to-end tests could orchestrate fixtures in the backend services and so the app running in the iOS simulator could talk to the backend too.

NGROK ↔️

To solve this problem, we decided to use NGROK to create a tunnel between the two servers. NGROK allows us to connect local ports on a VM to the internet. We can open a tunnel from the Linux VM that the macOS VM can then connect to.

We can create a proxy connection on the macOS VM to route all of the traffic from the app and end-to-end tests via the NGROK tunnel to the backend services running in Docker on the Linux VM!
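As a rough illustration of the routing step, the sketch below maps a request aimed at a local port onto a tunnel address, which is the core decision a forwarding proxy has to make. The function name toTunnelUrl and both URLs are assumptions made for this example and are not taken from the real setup.

```typescript
// Rewrites a URL aimed at a local service so that it points at the tunnel instead.
// The path and query string are preserved; scheme, host and port come from the tunnel.
function toTunnelUrl(localUrl: string, tunnelBase: string): string {
  const source = new URL(localUrl);
  const target = new URL(tunnelBase);
  target.pathname = source.pathname;
  target.search = source.search;
  return target.toString();
}

// Example with placeholder addresses:
const forwarded = toTunnelUrl(
  'http://localhost:8080/api/accounts?limit=5',
  'https://example.ngrok.io',
);
// forwarded === 'https://example.ngrok.io/api/accounts?limit=5'
```

A real proxy would also forward the method, headers and body of each request. This sketch only covers the address mapping.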

Our tests can then run as if they are running on the engineer's laptop, and we can run our end-to-end tests in Semaphore.

Running the tests 🏃‍♂️

We know that when we run the end-to-end tests on our laptops, they take a while to run. The same is true with Semaphore.

Ideally, we would want to run the tests as part of the CI pipeline for the mobile app every time a new change is merged in. Given how long the tests take to finish, we're not able to do this - we don't want to slow down our pipeline that much for every commit!

The nature of tests written using Appium means there is little opportunity to improve this speed - starting the app for each run, setting up test fixtures and clicking through the UI will always take a long time.

Instead, we opted for a scheduled run that executes the tests in a separate pipeline. That way, we avoid slowing down the mobile app CI pipeline and still get regular feedback that our key workflows work.

We even get a helpful Slack notification when they've run.

Handling the results 🤔

Once the tests have run, a set of artefacts generated by the pipeline is saved:

Logs for the services that were running
Test results produced using jest-junit
Videos of each test case
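For reference, jest-junit is wired in as a Jest reporter. The following is a minimal sketch of such a configuration, assuming a jest.config.ts file. The reports directory and the file name are hypothetical choices for this example, although outputDirectory and outputName are real jest-junit options.

```typescript
// Sketch of a Jest configuration that keeps the default console reporter
// and additionally writes a JUnit XML report for the CI system to ingest.
const config = {
  reporters: [
    'default',
    ['jest-junit', { outputDirectory: 'reports', outputName: 'junit.xml' }],
  ],
};

export default config;
```

Keeping the default reporter alongside jest-junit means the console output stays readable locally while CI still gets a machine-readable file.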

We use Semaphore's test parsing to ingest the test results and display them in an easy-to-read format. We can quickly see what passed or what failed.

We can then use this and the test run videos to determine what we need to fix when a test run fails.

I would(n't) walk 4,200 miles 🌍

Once we had the tests running, we ran into another problem. The VMs were 4,200 miles apart.

We discovered that the macOS VMs were in Wisconsin and the Linux VMs were in Germany. That meant a mere 8,400-mile round trip that took approximately 250ms. This latency broke some of the tests in ways we did not expect.

To fix the problems this caused, we added the ability to inject latency into the fake backend when running on an engineer's laptop, which let us reproduce the failures locally. We could then implement changes to compensate for the latency, increasing the reliability of the tests both locally and in Semaphore.
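One way to add artificial latency to a fake backend is to wrap each request handler in a delay. The sketch below shows the idea under stated assumptions: the names parseLatencyMs and withLatency, the Handler type and the FAKE_BACKEND_LATENCY_MS variable are all hypothetical and do not describe the real implementation.

```typescript
// A request handler in a hypothetical fake backend.
type Handler<Req, Res> = (req: Req) => Promise<Res>;

// Reads a latency setting (for example from an environment variable such as
// the hypothetical FAKE_BACKEND_LATENCY_MS). Missing or invalid values mean no delay.
function parseLatencyMs(raw: string | undefined): number {
  const value = Number(raw);
  return isFinite(value) && value > 0 ? value : 0;
}

// Wraps a handler so that every call waits for latencyMs before running,
// approximating the round trip between distant machines.
function withLatency<Req, Res>(
  handler: Handler<Req, Res>,
  latencyMs: number,
): Handler<Req, Res> {
  return async (req: Req) => {
    if (latencyMs > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, latencyMs));
    }
    return handler(req);
  };
}

// Example: delay a trivial handler by roughly the round trip we observed.
const getBalance: Handler<string, number> = async () => 100;
const slowGetBalance = withLatency(getBalance, parseLatencyMs('250'));
```

Running the suite locally with a delay like this makes timing-sensitive tests fail on a laptop in the same way they would across the tunnel, so they can be fixed without waiting for a CI run.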

Closing 👋

Getting our Appium end-to-end tests running in CI was an interesting challenge but worthwhile given the confidence it allows us to have about how our app is working.

We still have more opportunities to automate other user workflows, but we now have a robust set of end-to-end tests that run frequently. We can ship new features often with confidence, knowing we have not impacted our customers' ability to manage their businesses!