Spec-Driven Testing for Mobile Apps (Open Source Release Coming Soon)

Finalrun introduces spec-driven, vision-based mobile testing designed to reduce flaky UI tests. An open-source release is coming soon.

TL;DR

Mobile UI tests are brittle. We built a spec-driven QA Agent that:

runs plain-English test specs,

“sees” the app visually (like a human), and

uses code-aware AI skills to auto-generate and maintain robust specs as your code grows.

We’re open-sourcing the agent soon.

Notify me when it's open source →

Here is a quick overview of how it works today:

https://youtu.be/SsVHRDWk_ss

1. The Nightmare of Mobile Testing

If you’ve ever worked on a mobile app, you know the pain of automated UI testing. You write a perfectly good test on Monday. On Tuesday, it fails. You check the app, and everything works fine.

This is the dreaded "flaky test."

It’s a test that fails not because your app is broken, but because the test itself is confused. Flaky tests destroy trust in your QA process, slow down releases, and turn automation into a massive headache.

2. Why Are Tests So Brittle?

The root of the problem lies in how traditional mobile tests are built.

Traditionally, we have written tests that act like rigid robots blindly following a map. We tell the code, "Wait exactly 3 seconds, then click the element with the exact hidden ID of 'button_xyz_123'."

But mobile apps are dynamic. What if the network is slow and the page takes 4 seconds to load? What if a developer slightly changes the layout, or renames that hidden ID? The app still works perfectly for a human user, but the "robot" test fails immediately.
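The timing half of this problem can be shown with a small, self-contained sketch. It is not Finalrun code: `FakeScreen` and `wait_until` are hypothetical names invented for the illustration, and the "screen" is simulated so the example runs without a device. It contrasts checking once at a fixed moment with polling until the screen is actually ready.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False


class FakeScreen:
    """Hypothetical stand-in for a screen that is ready on the Nth check."""

    def __init__(self, ready_on_check):
        self.ready_on_check = ready_on_check
        self.checks = 0

    def is_ready(self):
        self.checks += 1
        return self.checks >= self.ready_on_check


# Brittle style: look once at a fixed moment and give up.
single_check = FakeScreen(ready_on_check=4).is_ready()

# More tolerant style: keep polling until the screen is ready or time runs out.
polled = wait_until(FakeScreen(ready_on_check=4).is_ready, timeout=1.0, interval=0.01)

print("single check passed:", single_check)
print("polling passed:", polled)
```

Polling addresses only the timing issue. A renamed ID or a changed layout still breaks a locator-based test, which is the part the visual approach in the next section targets.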

Because of this, test maintenance becomes a full-time job. Teams end up spending more time fixing broken tests than actually building new features.

3. The Solution: Spec-Driven Testing with Finalrun

What if we stopped writing rigid code and started writing tests the way humans actually talk?

Enter Spec-Driven Testing powered by the Finalrun QA Agent.

Instead of writing complex automation scripts, you write a "Specification" (or Spec) in plain, simple text. Your test file literally looks like this:

Tap on "Settings"
Tap on "Language"
Search for "Spanish"
Verify that "Español" appears on the screen.

The Finalrun QA Agent reads these plain-English instructions and interacts with the app visually, much as a human would. It does not depend on hidden element IDs: if a button moves slightly to the left, the Agent still sees it and taps it. Shifting from rigid code to human-readable specs makes the tests easier to read and far less sensitive to layout or ID changes.
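To make the shape of a spec concrete, here is a rough sketch that turns the four lines above into an ordered list of steps. This is only an illustration: the `Step` class, `parse_spec` function, and the fixed regular expression are assumptions made for the example. The Finalrun agent interprets free-form English, so it presumably does not rely on a rigid grammar like this one.

```python
import re
from dataclasses import dataclass

# Hypothetical grammar covering only the three phrasings used in the example spec.
STEP_PATTERN = re.compile(r'^(Tap on|Search for|Verify that)\s+"([^"]+)"')


@dataclass
class Step:
    action: str  # what to do, e.g. "Tap on"
    target: str  # the visible label to act on, not a hidden element ID


def parse_spec(text):
    """Convert plain-text spec lines into an ordered list of Step objects."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = STEP_PATTERN.match(line)
        if match is None:
            raise ValueError(f"Unrecognized step: {line!r}")
        steps.append(Step(action=match.group(1), target=match.group(2)))
    return steps


SPEC = '''Tap on "Settings"
Tap on "Language"
Search for "Spanish"
Verify that "Español" appears on the screen.
'''

steps = parse_spec(SPEC)
for step in steps:
    print(step.action, "->", step.target)
```

The point to notice is that every target is text a user can see on screen, so nothing in the spec refers to internal identifiers that a refactor could rename.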

4. The Human Bottleneck

This sounds like the perfect solution, right? But there is a catch.

Even if writing tests is as easy as typing plain English, humans are still prone to error.

If you ask a human to write out the specs for every possible user journey, they will miss things. We forget edge cases. We forget to write the crucial setup and "cleanup" steps (like ensuring the app is logged out before a login test begins). Relying purely on humans to imagine and write every single test spec means your app is still vulnerable to bugs slipping through the cracks.

5. AI "Skills" to the Rescue

To close this gap, we don't just need an AI that can read specs; we need an AI that can write them.

We achieve this by giving the Finalrun QA Agent specific "Skills." Think of a Skill as a superpower that allows the AI to understand your app from the inside out.

Instead of a QA engineer spending hours manually clicking through the app to write down test steps, the AI uses a basic skill to read your app's actual source code, figure out how the screens connect, and write the test specs for you.

6. How the generate-test-spec Skill Works

Let’s look at a practical example. We equip the agent with a skill called generate-test-spec.

You simply tell the AI: "Create the critical flows for adding the Spanish language to the app."

Using its skill, the AI doesn't just guess. It dives into your codebase, analyzes the user interface components, and maps out the paths a user would take. It then automatically generates markdown-based Test Specs.
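One way to picture "mapping out the paths" is as a search over a navigation graph. The sketch below is an assumption-laden illustration, not a description of how the generate-test-spec skill is implemented: the `NAV_GRAPH` contents, the screen names, and the `shortest_tap_path` helper are all invented for the example. It uses a breadth-first search to find which labels to tap to reach a target screen, then renders them as spec lines.

```python
from collections import deque

# Hypothetical navigation graph: screen -> {label to tap: destination screen}.
NAV_GRAPH = {
    "Home": {"Settings": "Settings", "Profile": "Profile"},
    "Settings": {"Language": "Language", "Notifications": "Notifications"},
    "Language": {},
    "Profile": {},
    "Notifications": {},
}


def shortest_tap_path(graph, start, goal):
    """Breadth-first search returning the labels to tap to get from start to goal."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        screen, taps = queue.popleft()
        if screen == goal:
            return taps
        for label, destination in graph.get(screen, {}).items():
            if destination not in visited:
                visited.add(destination)
                queue.append((destination, taps + [label]))
    return None  # goal is unreachable from start


taps = shortest_tap_path(NAV_GRAPH, "Home", "Language")
spec_lines = [f'Tap on "{label}"' for label in taps]
print("\n".join(spec_lines))
```

In a real codebase the graph would have to be derived from the app's routing or navigation code, which is the hard part that the skill is described as handling.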

It handles the things humans forget. The AI will automatically write steps to:

Setup: Make sure the app is in the right state before starting.
Execute: The step-by-step taps and text inputs.
Verify: The final checks to ensure the action was successful.
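The three phases above also suggest a simple sanity check on any spec, whether written by a human or generated. The sketch below assumes, purely for illustration, that a spec marks each phase with a markdown heading named Setup, Execute, or Verify. The article does not show the actual file layout, so the heading names, the `DRAFT` text, and the `missing_sections` helper are all hypothetical.

```python
# Assumed section names; the real Finalrun spec layout may differ.
REQUIRED_SECTIONS = ("Setup", "Execute", "Verify")


def missing_sections(markdown_text):
    """Return the required section names that have no matching markdown heading."""
    found = set()
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            found.add(stripped.lstrip("#").strip())
    return [name for name in REQUIRED_SECTIONS if name not in found]


# A hand-written draft that forgets the Setup phase.
DRAFT = """# Add Spanish language

## Execute
Tap on "Settings"
Tap on "Language"
Search for "Spanish"

## Verify
Verify that "Español" appears on the screen.
"""

problems = missing_sections(DRAFT)
print("Missing sections:", problems)
```

A check like this catches only a missing phase, not a wrong one, but it targets exactly the kind of omission described in the previous section.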

7. Running the Magic

Once the AI has used its skill to generate the plain-text specification file (for example, add_spanish_settings.md), executing it is simple.

You don't need to compile complex automation suites. You just open your terminal and type:

./mobile-cli run ./test/add_spanish_settings.md


The Finalrun agent then takes over your mobile simulator. It reads the AI-generated English instructions and executes the test on the screen, navigating menus and verifying text just like a real person.
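If you keep several specs in one folder, a short wrapper script could run them in sequence. The sketch below uses only the `./mobile-cli run <spec>` form shown above. Everything else is an assumption: the helper names are hypothetical, and treating exit code 0 as a pass is a guess about the CLI's behavior that should be confirmed once the tool is released.

```python
import subprocess
from pathlib import Path

CLI = "./mobile-cli"  # binary name taken from the command shown in this post


def build_command(spec_path):
    """Build the argument list for running a single spec file."""
    return [CLI, "run", str(spec_path)]


def run_all_specs(spec_dir, runner=subprocess.run):
    """Run every .md spec in spec_dir and map file name -> passed (True/False).

    Assumes a zero exit code means the spec passed, which is not confirmed.
    """
    results = {}
    for spec in sorted(Path(spec_dir).glob("*.md")):
        completed = runner(build_command(spec))
        results[spec.name] = completed.returncode == 0
    return results


print(build_command("./test/add_spanish_settings.md"))
```

Calling `run_all_specs("./test")` would then invoke the CLI once per spec. The `runner` parameter exists so the function can be exercised with a stub when no simulator is attached.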

No flaky locators. No endless maintenance. Just clean, readable, AI-driven QA.

🚀 Be the First to Try It: We Are Going Open Source!

We believe that accessible, spec-driven AI is the future of mobile app testing, and we want to share it with the community.

We will be open-sourcing the Finalrun QA Agent very soon.

Want to be the first to know when the repo drops and join the revolution against flaky tests?
Notify me when it's open source →