Finalrun

Why We Open Sourced Finalrun

Finalrun — Tue, 07 Apr 2026 23:47:15 GMT

https://github.com/final-run/finalrun-agent

We built Finalrun because every mobile test suite we'd worked on had the same lifecycle: someone writes it, it works for a sprint, and then a UI tweak breaks half the selectors. You spend more time babysitting the tests than writing features. We got tired of it and went looking for something fundamentally different — and what we found felt too useful to keep closed.

The Idea That Actually Worked

What if tests could just look at the screen the way a person does?

We started building a vision-based agent that uses screen understanding instead of brittle locators like XPath or accessibility IDs. You describe what you want in plain English — "tap the login button," "scroll to the pricing section," "verify the confirmation message" — and the agent figures out where things are visually.

That part worked surprisingly well. The agent could interpret intent, find elements on screen, and execute actions reliably across both Android and iOS. No element trees. No platform-specific selector syntax. Just natural language and a screenshot.

But the real problem wasn't execution. It was everything that happens before the test runs.

Where Test Flows Actually Break Down

The typical approach is to define test flows outside your codebase. Maybe a QA engineer writes them manually. Maybe they get generated from a PRD. Either way, they live in a separate world from your source code.

This works until it doesn't. The app changes. A screen gets renamed. A flow gets restructured. The tests don't know about any of it. Before long, you're maintaining two sources of truth that are drifting further apart every week.

We tried a second approach: generating tests directly from the codebase using MCP, pulling in component structure, navigation flows, and screen definitions to build test cases that actually reflect the current state of the app. This improved sync significantly — tests knew what the app looked like right now — but the tradeoffs were real. Token usage was high, and generation was slow.

The Shift: Tests Should Live With the Code

The breakthrough wasn't a technical trick. It was a framing change.

Test generation shouldn't be a one-off step that produces artifacts you then manage separately. Tests need to live alongside the codebase so they have continuous access to context and stay in sync as the app evolves. When a developer changes a screen, the tests should know about it — not because someone remembered to update a spec, but because the test definitions are rooted in the same repo.

So we kept the vision-based execution (no selectors, no fragile locators) and moved test generation closer to the repository. The result is a system where tests are generated from codebase context, defined as YAML-based flows, and executed visually.

What This Looks Like in Practice

The workflow breaks down into three pieces:

Generate from context. Tests are produced from actual codebase structure — routes, components, screen definitions — not guesswork or stale documentation.

Define as YAML. Test flows are expressed in a simple, readable YAML format. Easy to version, easy to review in a PR, easy to extend.

Execute with vision. The agent runs tests by looking at the screen, not by querying an element tree. This works across Android and iOS without platform-specific test code.

The most interesting scenario is what we call the post-development handoff. An AI builds a feature inside an IDE. Immediately after, the agent generates a test from the updated codebase and executes it visually — verifying that the feature the AI just wrote actually works on a real device. No human in the loop for the testing step.

Why Open Source

We could have kept this closed and built a product around it. But the problem — flaky, out-of-sync mobile tests — is too widespread, and the approach is too early to develop in a vacuum. Vision-based testing needs to be stress-tested across real codebases, real devices, and real workflows. The fastest way to get there is to let others break it, fix it, and push it further.

The core pieces are all here:

Repo: github.com/final-run/finalrun-agent
Demo: Watch the post-development handoff in action

The demo walks through the full cycle — AI builds a feature, Finalrun generates a vision-based test, and the test runs on device. It's the workflow we wanted when we started down this path: tests that keep up with the code, execute like a human would, and don't fall apart the moment someone moves a button three pixels to the left.

If you've been fighting flaky mobile test suites, we'd love to hear what you've tried. Open an issue on the repo or reach out — this is very much a work in progress.

🏆 FinalRun Is the New #1 SOTA Mobile Agent, Eclipsing Benchmarks at 97.4%

Finalrun — Wed, 18 Mar 2026 04:00:56 GMT

We are thrilled to announce FinalRun, a revolutionary Android UI agent that has officially secured the #1 spot on the leaderboard with an unprecedented 97.4% Success Rate. Not only does FinalRun achieve the highest accuracy ever recorded, but it is fundamentally engineered to be faster and significantly cheaper than any other agent currently on the benchmark list.

By rethinking how AI agents perceive UI hierarchies, handle real-time states, and process errors, FinalRun sets a new gold standard for autonomous mobile execution.

The Race for the Ultimate UI Agent

Building an AI agent that can reliably navigate the dynamic, asynchronous environment of a mobile operating system is notoriously difficult. Traditional agents stumble on UI animations, hallucinate successful actions, and incur massive inference costs due to bloated observation spaces.

With FinalRun, we completely reimagined the perception-action loop. We focused not just on reasoning capability, but on execution reliability and cost efficiency. The result? A staggering 97.4% task completion rate that operates faster and cheaper than the previous state-of-the-art.

Here are the four core breakthroughs that power FinalRun:

1. ⚡ Optimized Hierarchy

The biggest bottleneck for mobile agents is the massive context window required to read Android UI trees (XML). FinalRun utilizes a highly Optimized UI Hierarchy, aggressively pruning irrelevant nodes.
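To make the pruning idea concrete, here is a minimal sketch rather than FinalRun's actual implementation. It assumes a UI dump shaped like Android's `uiautomator` XML, and the rule for what counts as "relevant" (a node with text, a content description, a resource ID, or an interactive flag) is our own illustrative choice.

```python
import xml.etree.ElementTree as ET

# Attributes treated as "informative" in this sketch (an assumption, not FinalRun's rule set).
KEEP_IF_SET = ("text", "content-desc", "resource-id")
KEEP_IF_TRUE = ("clickable", "scrollable", "checkable")


def is_informative(node: ET.Element) -> bool:
    """A node is informative if it carries a label or can be interacted with."""
    if any(node.get(name) for name in KEEP_IF_SET):
        return True
    return any(node.get(name) == "true" for name in KEEP_IF_TRUE)


def prune(node: ET.Element) -> bool:
    """Prune children in place; return True if `node` itself should be kept."""
    for child in list(node):
        if not prune(child):
            node.remove(child)
    # Keep a node if it is informative or still wraps an informative descendant.
    return is_informative(node) or len(node) > 0


SAMPLE = """
<hierarchy>
  <node class="android.widget.FrameLayout">
    <node class="android.widget.LinearLayout">
      <node class="android.widget.Button" text="Login" clickable="true"/>
      <node class="android.view.View"/>
    </node>
    <node class="android.widget.FrameLayout">
      <node class="android.view.View"/>
    </node>
  </node>
</hierarchy>
"""

root = ET.fromstring(SAMPLE)
before = sum(1 for _ in root.iter("node"))
prune(root)
after = sum(1 for _ in root.iter("node"))
print(before, "->", after)  # 6 -> 3
```

The sketch keeps wrapper containers that still hold something useful. A more aggressive pruner could also collapse those wrappers, at the cost of losing layout structure.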

Cheaper & Smarter: By pairing this lightweight hierarchy with Gemini Flash and Flash Lite, we drastically reduced token consumption. The cost per task is a fraction of what competing agents require.
Blazing Fast: Powered by Gemini Flash and Gemini Flash Lite.
Image Stabilization Check: Speed means nothing if the agent clicks while the screen is still loading. FinalRun introduces a deterministic stabilization check—requiring at least 3 consecutive stable image frames—before concluding the UI has settled. This entirely eliminates misclicks caused by mid-animation transitions.
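The stabilization check described above can be sketched as follows. This is an illustration under simplifying assumptions: frames are compared by exact byte equality via a hash, whereas a real device feed would likely need a tolerance for cursor blinks or clock updates. The function name and the sample frames are hypothetical.

```python
import hashlib
from typing import Iterable, Optional


def frame_digest(frame: bytes) -> str:
    """Fingerprint a frame so consecutive frames can be compared cheaply."""
    return hashlib.sha256(frame).hexdigest()


def first_stable_index(frames: Iterable[bytes], required: int = 3) -> Optional[int]:
    """Return the index of the frame that completes `required` identical
    consecutive frames, or None if the stream never settles."""
    last = None
    run = 0
    for index, frame in enumerate(frames):
        digest = frame_digest(frame)
        if digest == last:
            run += 1
        else:
            last = digest
            run = 1
        if run >= required:
            return index
    return None


# Three animation frames followed by a settled screen.
animation = [b"f0", b"f1", b"f2", b"done", b"done", b"done", b"done"]
print(first_stable_index(animation))  # 5
```

With `required=3`, the agent would act only once the third identical frame arrives (index 5 here), not at the first frame that happens to look finished.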

2. 📝 Native Text Editing Mastery

Manipulating text fields is a common failure point for MLLMs, which often try to backspace infinitely or struggle with cursor placement. FinalRun features a dedicated Text Editing Agent that interacts with the OS the same way a human power-user does. When specific text needs to be deleted, the agent executes a Long Press to trigger the Android system’s native context menu, utilizing default options to "Select All" and "Delete". This native approach guarantees 100% precision in text manipulation, avoiding the erratic behavior seen in previous SOTA agents.

3. 🔄 Closed-Loop Failure Feedback

A major flaw in traditional UI agents is the "blind assumption of success"—the agent issues a click command and blindly assumes the task moved forward. FinalRun introduces a strict Failure Feedback Loop. If a grounding action (e.g., attempting to tap a coordinate or element) fails at the system level, this failure is immediately fed back into the agent's context. The agent knows exactly why the action failed, preventing it from hallucinating task completion and allowing it to instantly recalculate a new path to success.
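A rough sketch of such a feedback loop is shown below. All names (`GroundingError`, `AgentContext`, `execute_with_feedback`) are hypothetical and chosen for illustration; the point is only that a failed action is written into the context the agent reasons over instead of being silently dropped.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class GroundingError(Exception):
    """Raised by the (hypothetical) device layer when an action cannot be performed."""


@dataclass
class AgentContext:
    """Running log that the agent would read before planning its next step."""
    history: List[str] = field(default_factory=list)

    def record(self, message: str) -> None:
        self.history.append(message)


def execute_with_feedback(action_name: str, perform: Callable[[], None], context: AgentContext) -> bool:
    """Run an action and record its real outcome, including the failure reason."""
    try:
        perform()
    except GroundingError as error:
        context.record(f"FAILED {action_name}: {error}")
        return False
    context.record(f"OK {action_name}")
    return True


def tap_missing_button() -> None:
    raise GroundingError("no element matched 'Submit'")


context = AgentContext()
execute_with_feedback("tap Submit", tap_missing_button, context)
execute_with_feedback("tap Login", lambda: None, context)
print(context.history)
# ["FAILED tap Submit: no element matched 'Submit'", 'OK tap Login']
```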

4. 📸 Just-In-Time (JIT) Grounding

Mobile UIs are highly volatile; pop-ups, notifications, or network-delayed renders can change the screen state in milliseconds. Traditional agents evaluate a screenshot, spend seconds "thinking," and then execute a click on a screen that may have already changed. FinalRun solves this race condition by capturing a fresh screenshot immediately before performing any grounding action. This ensures that all coordinates and actions are executed against the absolute latest UI state. This Just-In-Time perception captures errors faster, prevents catastrophic misclicks, and ensures a truly real-time interaction loop.
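The race condition and its fix can be illustrated with a small simulation. This is a sketch, not the agent's real code: `FakeDevice`, `tap_just_in_time`, and the coordinates are invented for the example, and the "screen" is reduced to a dictionary of element positions.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]
Screen = Dict[str, Point]


def tap_just_in_time(
    target: str,
    capture: Callable[[], Screen],
    locate: Callable[[Screen, str], Point],
    tap: Callable[[Point], None],
) -> Point:
    """Capture a fresh screen, ground the target on it, then tap immediately."""
    screen = capture()  # taken after any slow reasoning has finished
    point = locate(screen, target)
    tap(point)
    return point


class FakeDevice:
    """Simulated device whose layout shifts after the first capture."""

    def __init__(self) -> None:
        # A banner appears after the first capture and pushes the button down.
        self.layouts: List[Screen] = [{"Continue": (100, 400)}, {"Continue": (100, 520)}]
        self.captures = 0
        self.taps: List[Point] = []

    def capture(self) -> Screen:
        layout = self.layouts[min(self.captures, len(self.layouts) - 1)]
        self.captures += 1
        return layout

    def tap(self, point: Point) -> None:
        self.taps.append(point)


device = FakeDevice()
planning_screen = device.capture()          # screenshot used while "thinking"
stale_point = planning_screen["Continue"]   # where a naive agent would tap
fresh_point = tap_just_in_time("Continue", device.capture, lambda s, t: s[t], device.tap)
print(stale_point, fresh_point)  # (100, 400) (100, 520)
```

The stale coordinate from the planning screenshot would have missed the button; grounding against the fresh capture hits it.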

The Future of Mobile Automation is Here

Achieving 97.4% on complex mobile benchmarks proves that reliable, production-ready mobile AI is no longer a theoretical concept. By combining the raw speed of Gemini Flash, an optimized UI hierarchy, and robust, human-like execution safeguards, FinalRun is unequivocally the fastest, cheapest, and most capable Android agent in the world.

Welcome to the new SOTA.

We’re open-sourcing the agent soon

Notify me when it's open source →

Spec-Driven Testing for Mobile apps (Preparing for Open Source Soon)

Finalrun — Mon, 16 Mar 2026 06:07:53 GMT

Finalrun introduces spec-driven, vision-based mobile testing to dramatically reduce flaky UI tests. Open source release coming soon.

TL;DR

Mobile UI tests are brittle. We built a spec-driven QA Agent that:

runs plain-English test specs,

“sees” the app visually (like a human), and

uses code-aware AI skills to auto-generate and maintain robust specs as your code grows.

Quick overview of how this works right now.

https://youtu.be/SsVHRDWk_ss

1. The Nightmare of Mobile Testing

If you’ve ever worked on a mobile app, you know the pain of automated UI testing. You write a perfectly good test on Monday. On Tuesday, it fails. You check the app—everything works fine.

This is the dreaded "flaky test."

It’s a test that fails not because your app is broken, but because the test itself is confused. Flaky tests destroy trust in your QA process, slow down releases, and turn automation into a massive headache.

2. Why Are Tests So Brittle?

The root of the problem lies in how traditional mobile tests are built.

Historically, we write tests that act like rigid robots blindly following a map. We tell the code, "Wait exactly 3 seconds, then click the element with the exact hidden ID of 'button_xyz_123'."

But mobile apps are dynamic. What if the network is slow and the page takes 4 seconds to load? What if a developer slightly changes the layout, or renames that hidden ID? The app still works perfectly for a human user, but the "robot" test crashes immediately.

Because of this, test maintenance becomes a full-time job. Teams end up spending more time fixing broken tests than actually building new features.
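The fixed three-second wait is the easiest part of this to illustrate. A common alternative, sketched here with a hypothetical `wait_until` helper, is to poll for a condition up to a timeout, so a fast page proceeds immediately and a slow page gets more time. The simulated `page_loaded` check stands in for whatever readiness signal a real test would use.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0, interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated page that becomes ready after 0.2 seconds.
ready_at = time.monotonic() + 0.2


def page_loaded() -> bool:
    return time.monotonic() >= ready_at


print(wait_until(page_loaded, timeout=2.0))    # True, returns shortly after 0.2 s
print(wait_until(lambda: False, timeout=0.1))  # False, gives up after the timeout
```

Polling removes the timing guess, but it still depends on knowing what to poll for, which is where hard-coded element IDs bring the brittleness back.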

3. The Solution: Spec-Driven Testing with Finalrun

What if we stopped writing rigid code and started writing tests the way humans actually talk?

Enter Spec-Driven Testing powered by the Finalrun QA Agent.

Instead of writing complex automation scripts, you write a "Specification" (or Spec) in plain, simple text. Your test file literally looks like this:

Tap on "Settings"
Tap on "Language"
Search for "Spanish"
Verify that "Español" appears on the screen.

The Finalrun QA Agent reads these plain-English instructions and interacts with the app visually, exactly like a human would. It doesn't care about hidden code IDs. If a button moves slightly to the left, the Agent still sees it and taps it. By shifting from rigid code to human-readable specs, the tests become incredibly stable and easy to read.

4. The Human Bottleneck

This sounds like the perfect solution, right? But there is a catch.

Even if writing tests is as easy as typing plain English, humans are still prone to error.

If you ask a human to write out the specs for every possible user journey, they will miss things. We forget edge cases. We forget to write the crucial "cleanup" steps (like ensuring the app is logged out before a login test begins). Relying purely on humans to imagine and write every single test spec means your app is still vulnerable to bugs slipping through the cracks.

5. AI "Skills" to the Rescue

To completely solve this, we don't just need an AI that can read specs; we need an AI that can write them.

We achieve this by giving the Finalrun QA Agent specific "Skills." Think of a Skill as a superpower that allows the AI to understand your app from the inside out.

Instead of a QA engineer spending hours manually clicking through the app to write down test steps, the AI uses a basic skill to read your app's actual source code, figure out how the screens connect, and write the test specs for you.

6. How the generate-test-spec Skill Works

Let’s look at a practical example. We equip the agent with a skill called generate-test-spec.

You simply tell the AI: "Create the critical flows for adding the Spanish language to the app."

Using its skill, the AI doesn't just guess. It dives into your codebase, analyzes the user interface components, and maps out the exact paths a user would take. It then automatically generates perfectly formatted, markdown-based Test Specs.

It handles the things humans forget. The AI will automatically write steps to:

Setup: Make sure the app is in the right state before starting.
Execute: The step-by-step taps and text inputs.
Verify: The final checks to ensure the action was successful.
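As an illustration of the Setup / Execute / Verify structure, the sketch below renders a spec into markdown. The `Spec` class and the exact heading layout are assumptions made for this example; the real files produced by generate-test-spec may be organized differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Spec:
    """Hypothetical in-memory form of a plain-English test spec."""
    title: str
    setup: List[str] = field(default_factory=list)
    execute: List[str] = field(default_factory=list)
    verify: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the three phases as numbered lists under their own headings."""
        lines = [f"# {self.title}"]
        sections = (("Setup", self.setup), ("Execute", self.execute), ("Verify", self.verify))
        for heading, steps in sections:
            lines.append("")
            lines.append(f"## {heading}")
            lines.extend(f"{number}. {step}" for number, step in enumerate(steps, start=1))
        return "\n".join(lines) + "\n"


spec = Spec(
    title="Add Spanish language",
    setup=["Make sure the app is in the right state before starting."],
    execute=['Tap on "Settings"', 'Tap on "Language"', 'Search for "Spanish"'],
    verify=['Verify that "Español" appears on the screen.'],
)
print(spec.to_markdown())
```

Keeping the three phases explicit makes it easy to notice when a generated spec is missing its setup or verification steps.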

7. Running the Magic

Once the AI has used its skill to generate the plain-text specification file (for example, add_spanish_settings.md), executing it is brilliantly simple.

You don't need to compile complex automation suites. You just open your terminal and type:

./mobile-cli run ./test/add_spanish_settings.md

Instantly, the Finalrun agent takes over your mobile simulator. It reads the AI-generated English instructions and physically executes the test on the screen, navigating menus and verifying text just like a real person.

No flaky locators. No endless maintenance. Just clean, readable, AI-driven QA.

🚀 Be the First to Try It: We Are Going Open Source!

We believe that accessible, spec-driven AI is the future of mobile app testing, and we want to share it with the community.

We will be open-sourcing the Finalrun QA Agent very soon.

Want to be the first to know when the repo drops? Sign up to be notified when it's open source, and join the revolution against flaky tests!

Why Mobile End-to-End Testing Fails: It's Not the Team's Fault. It's the Tools

Finalrun — Mon, 18 Aug 2025 13:38:14 GMT

You've been there. A new version of your mobile app is ready to ship. The features are built, but a nagging uncertainty remains: "Did we break something?"

End-to-end (E2E) testing should provide this confidence, but it rarely does. More often, it's a source of frustration. Tests are flaky, slow, and live in a separate world owned by a handful of specialists. The result? Teams either ship with anxiety or delay releases, caught in a cycle of manual regression testing.

The problem isn't the team's desire for quality. It's that traditional E2E testing is fundamentally broken. Here are the real reasons it fails, and how we're working to change that.

1. The Time & Deadline Trap

This is the number one killer of quality. Stakeholders want to "ship fast," and robust E2E testing is perceived as the slowest part of the process.

The Perception: E2E tests are brittle scripts that take a long time to write and even longer to run. Faced with a tight deadline, the team makes a difficult choice: cut down the test suite or risk delaying the launch.
The Reality: This isn't laziness; it's a system failure. The immediate pressure of the release schedule will always win against a testing process that delivers its verdict hours—or even days—later. The feedback loop is simply too long to be practical.

2. The Friction of Mobile Testing

Even with time and resources, the sheer complexity of mobile E2E testing creates a massive barrier.

The Setup Cost: Meaningful mobile testing requires managing device farms, simulators/emulators, different OS versions, and complex app states. This is specialized knowledge, and the effort required is so high that most developers are locked out of the process entirely.
Brittle Scripts, Not Real Flows: Traditional E2E tests are often imperative scripts that break with the smallest UI change. They test implementation details, not the actual user journey. This means they require constant, painful maintenance.

3. The "Quality Gatekeeper" Silo

Many organizations isolate E2E testing within a dedicated QA team. While well-intentioned, this creates a dangerous disconnect.

Delegated Responsibility: When testing is "someone else's job," developers and product managers become disconnected from the user experience. Quality becomes a gate they must pass through, not a shared objective. The QA team becomes a bottleneck, manually testing flows and reporting bugs back to developers, slowing everything down.
Lack of Visibility: The outcome of tests often lives in a separate dashboard, away from where the rest of the team works. PMs can't easily see if a critical user flow is working, and developers don't get immediate, actionable feedback in their workflow.

How Finalrun Realigns the Team on Quality

The problem isn't the people; it's the process and the tools. Finalrun was built to make end-to-end mobile testing a fast, collaborative, and integrated part of the development lifecycle, empowering the entire team.

We solve the core issues head-on:

1. Making End-to-End Mobile Testing Effortless: Finalrun removes the technical complexity. Instead of writing brittle code, your team can describe any user flow in a simple, intuitive way. Our platform handles the rest—running these flows on real devices and providing clear results. By integrating seamlessly into your CI/CD pipeline, testing becomes an automated, reliable part of every single build.

2. Creating a Faster Feedback Loop for Everyone: Finalrun doesn't just run tests; it delivers insights, fast. When a test fails, it's not just a red checkmark in a forgotten dashboard. Your team gets detailed reports, videos, and logs directly in their existing workflow. This allows developers to see exactly what went wrong and fix it immediately, transforming the feedback loop from days to minutes.

3. A Shared Platform for Developers, QA, PMs, and Managers: Quality is a team sport. Finalrun is the platform where everyone can collaborate.

For Developers: Get immediate, reliable feedback on your code's impact on critical user flows without ever leaving your workflow.
For QAs: Move from manual execution to quality strategy. Define, manage, and monitor all critical user journeys in one place, freeing you up to focus on exploratory testing and edge cases.
For PMs & Engineering Managers: Gain unprecedented visibility. Confidently know the status of every core feature and user flow before a release. Make data-driven decisions and ensure the product you planned is the product you ship.

Let's stop accepting that mobile E2E testing has to be slow and painful. It's time for a tool that makes shipping with confidence the fastest and easiest way to work.

If you want to know how we are achieving 99% accuracy in UI automation with Finalrun, read the following articles:

How We Set Out to Solve the XPath Problem in Mobile UI Test Automation

The future of UI Element Targetting: Finalrun Identifiers beats Xpath

Why LLMs Like ChatGPT, Gemini, and Claude Understand FinalRun Identifiers Better Than XPath

📅 Book a Demo
See how FinalRun fits into your existing workflow with a live Demo

Mobile End-to-End Testing Tools in 2025

Finalrun — Mon, 18 Aug 2025 13:04:11 GMT

In the complex world of mobile development, a passing unit test provides a sigh of relief, but it doesn't guarantee a flawless user experience. Modern applications are intricate ecosystems of UI elements, backend services, third-party integrations, and platform-specific behaviors. End-to-end (E2E) testing is the only methodology that validates the entire user journey, from login to logout, ensuring every component works in concert.

However, "doing E2E testing" is not as simple as picking a popular tool. The choice of framework is a critical decision that impacts your team's workflow, your application's architecture, and your long-term automation strategy. This guide provides a detailed technical breakdown of the leading mobile E2E testing frameworks, exploring their core architecture, strengths, trade-offs, and ideal use cases.

The Foundational Choice: Gray-Box vs. Black-Box Testing

Mobile testing frameworks fundamentally fall into two architectural categories: gray-box and black-box. Understanding this distinction is the first step to choosing the right tool.

Gray-Box Frameworks (Espresso, XCUITest, Detox): These tools operate from within the application. The test code is bundled with the app in a special build, allowing the test and the app to share the same memory and threads. This intimate connection gives the framework programmatic access to the app's internal state: it can directly check whether a background process is running or whether a UI element has finished rendering, rather than guessing from the visual output. This "in-process" nature is their greatest strength. Because they can synchronize with the app's UI thread, tests are fast and reliable, and most of the timing-related flakiness that plagues mobile automation disappears. The trade-off is limited scope: they are confined to their own app and cannot easily interact with other apps or system-level dialogs.
Black-Box Frameworks (Appium, Maestro): These tools operate from outside the application, interacting with it just as a user would: by tapping, swiping, and reading what's on the screen. They have no access to the application's internal code or state. Their strength lies in their ability to test true, real-world user workflows that may involve system-level interactions (like push notifications and permissions) or even traversing across multiple applications. The challenge with this approach is a higher potential for flakiness; the tests rely on timing and visual cues, which can be inconsistent across different devices and network conditions.

The AI-Powered Approach: FinalRun

FinalRun is a modern test automation platform that is revolutionizing the E2E space by leveraging AI to understand plain English. It is designed to solve the two biggest challenges in test automation: the high technical skill required to write tests and the constant maintenance of brittle test scripts.

How it Works: FinalRun allows you to create complex test cases without writing a single line of code. Testers can simply write commands in plain English, such as "Login with phone number 9088989878", and FinalRun's AI interprets these commands and executes the corresponding actions on the device. Alternatively, users can interact directly with a mirrored version of their app on their screen—tapping buttons and entering text as a user would—and FinalRun will automatically translate these interactions into durable test steps.
Strengths:
- Unmatched Simplicity: The ability to use plain English democratizes test creation. Manual testers, business analysts, and even product managers can create and run E2E tests, freeing up developer resources.
- Solving the XPath Problem: Traditional automation relies on fragile selectors like XPaths or element IDs. When a developer changes an element's ID, the test breaks. FinalRun's AI operates on intent. It understands that "Click on Continue" means finding the element that logically represents "Continue," whether it's a button with that text or an icon. This makes tests far more resilient to UI changes.
- Extreme Speed of Creation: Writing a sentence or simply using the app is significantly faster than writing, debugging, and maintaining complex code. This drastically accelerates the testing cycle.
Trade-offs:
- As a platform that relies on AI and a cloud interface, it may not be suitable for teams that require their test code to be stored and versioned in a specific, self-managed repository in the same way as traditional code-based frameworks.
Best For: Teams of all sizes that want to drastically speed up their testing process, reduce the burden of test maintenance, and empower their entire quality team—not just engineers—to contribute to automation. It is particularly powerful for organizations looking to move away from the fragility of traditional, code-heavy test automation.

The Cross-Platform Champions

For teams supporting both Android and iOS, a cross-platform framework is essential for efficiency.

Appium: The Veteran with Unmatched Flexibility

Appium is the open-source standard for mobile automation. It operates on a client-server architecture, wrapping native automation frameworks like Espresso and XCUITest into a single, consistent API based on the W3C WebDriver protocol.

How it Works: Your test script (written in virtually any language like Java, Python, or JavaScript) acts as a client, sending JSON commands to the Appium server. The server translates these commands and executes them on the connected device or emulator using the underlying native framework.
Strengths:
- True Cross-Platform: It supports native, hybrid, and mobile web apps across Android and iOS with a single codebase.
- Language Agnostic: Your team can write tests in the language they are most comfortable with.
- Massive Ecosystem: It boasts a huge community and a vast library of plugins for almost any integration imaginable.
Trade-offs:
- Complexity and Overhead: The client-server architecture introduces latency, making tests slower than their gray-box counterparts. Setup can also be complex.
- Fragility: As a black-box tool, it can be prone to timing issues, requiring careful implementation of "waits" to ensure stability.
Best For: Teams that need to test across both iOS and Android, especially for hybrid apps or complex flows that interact with the system UI. Its flexibility makes it a solid choice for large-scale, enterprise-level test automation.
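To show what the client actually sends, the sketch below builds two request bodies in the shape defined by the W3C WebDriver protocol: one to create a session and one to find an element. Nothing is sent over the network, and the app path, session ID, and locator value are placeholders. In practice an Appium client library constructs these requests for you.

```python
import json


def new_session_request(platform: str, automation: str, app_path: str) -> dict:
    """Describe the W3C 'New Session' call with Appium-prefixed capabilities."""
    return {
        "method": "POST",
        "path": "/session",
        "body": {
            "capabilities": {
                "alwaysMatch": {
                    "platformName": platform,
                    "appium:automationName": automation,
                    "appium:app": app_path,
                }
            }
        },
    }


def find_element_request(session_id: str, strategy: str, value: str) -> dict:
    """Describe the W3C 'Find Element' call for an existing session."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/element",
        "body": {"using": strategy, "value": value},
    }


# Placeholder values for illustration only.
session = new_session_request("Android", "UiAutomator2", "/path/to/app.apk")
lookup = find_element_request("SESSION_ID", "accessibility id", "Login")

print(json.dumps(session["body"], indent=2))
print(lookup["path"], json.dumps(lookup["body"]))
```

Every interaction in a test becomes an HTTP round trip like these, which is where the latency mentioned under Trade-offs comes from.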

Maestro: The Challenger with a Low-Code Revolution

Born in 2022, Maestro is a modern, lightweight framework that tackles mobile automation from a different angle.

How it Works: Maestro uses a simple, declarative YAML syntax to define tests. Instead of writing procedural code, you list the user actions and assertions that make up a flow.

  # A simple Maestro test
  appId: com.my.app
  ---
  - launchApp
  - tapOn: "Login"
  # inputText types into the focused field, so tap each field first
  - tapOn:
      id: "email_field"
  - inputText: "user@example.com"
  - tapOn:
      id: "password_field"
  - inputText: "password123"
  - tapOn: "Submit"
  - assertVisible: "Welcome, User!"

Strengths:
- Simplicity and Speed: The YAML syntax is incredibly easy to learn, lowering the barrier to entry for the entire team, including manual QAs and product managers.
- Resilience: Maestro has built-in intelligence to handle common flakiness issues, automatically waiting for elements and dealing with unexpected pop-ups.
- Maestro Studio: A feature that allows you to click through your app and have the YAML test script generated automatically, drastically speeding up test creation.
Trade-offs:
- Less Flexibility: The declarative nature means you have less granular control for highly complex or unconventional test logic compared to a full programming language.
- Younger Ecosystem: As a newer tool, its community and integration options are still growing compared to a giant like Appium.
Best For: Teams that want to get up and running with E2E testing quickly without a steep learning curve. It's ideal for startups and teams that prioritize speed of development and ease of maintenance.

The Native Powerhouses: Speed and Stability

When your focus is on a single platform and performance is paramount, native frameworks are unrivaled.

Espresso (Android) & XCUITest (iOS)

These are the official, first-party UI testing frameworks from Google and Apple, respectively.

How it Works: As gray-box tools, they run in the same process as the application. Their key feature is the ability to automatically synchronize with the UI thread. This means they will intrinsically wait for the UI to become idle before performing the next action, which all but eliminates timing-related test failures.
Strengths:
- Speed and Reliability: In-process execution makes them the fastest and most stable option for UI testing.
- Deep Integration: They have access to all of the application's UI elements and internal state, allowing for powerful and precise tests. They are the standard for App Store and Play Store compliance testing.
Trade-offs:
- Platform Lock-in: Tests written for Espresso cannot be run on iOS, and vice-versa.
- Steep Learning Curve: They require native development knowledge (Kotlin/Java for Espresso, Swift for XCUITest) and familiarity with Android Studio or Xcode.
Best For: Development teams that are responsible for their own testing. If you have dedicated Android and iOS developers, empowering them to use the native tools will yield the most robust in-app test suites.

The Specialist Tools

Some frameworks are purpose-built for specific development ecosystems.

Detox: The Go-To for React Native

Detox is a gray-box framework designed from the ground up for React Native applications.

How it Works: Detox's "secret sauce" is its synchronization mechanism. It monitors React Native's asynchronous operations (like network requests and animations) and waits for them to complete before proceeding, leading to highly stable tests.
Strengths:
- Built for React Native: It understands the React Native architecture, making it faster and more reliable than black-box alternatives for this environment.
Trade-offs:
- Niche Focus: It is not intended for use outside of the React Native ecosystem.
- Configuration: Can be complex to set up, often requiring native build hooks.
Best For: Teams that are fully committed to the React Native stack.

Making the Right Decision: A Summary

Framework	Testing Paradigm	Key Strength	Main Trade-off	Best For
FinalRun	Black-Box (AI)	Plain English test creation, resilient to UI changes	Platform-based, less suited for self-managed code	Teams wanting to accelerate testing and empower non-engineers.
Appium	Black-Box	Ultimate flexibility (cross-platform, language-agnostic)	Slower execution, potential for flakiness	Large teams needing to test hybrid/native apps on both platforms.
Maestro	Black-Box	Simplicity and speed of test creation	Less flexible for complex logic	Teams wanting a low-code, easy-to-maintain solution.
Espresso	Gray-Box	Highest speed and reliability on Android	Android-only, requires Kotlin/Java skills	Dedicated Android development teams.
XCUITest	Gray-Box	Highest speed and reliability on iOS	iOS-only, requires Swift skills	Dedicated iOS development teams.
Detox	Gray-Box	Fast and stable tests for React Native	Niche focus, complex setup	Teams building exclusively with React Native.

Choosing your E2E testing tool is a long-term commitment. By understanding the fundamental differences in their architecture and aligning their strengths with your team's skills and your app's technology stack, you can build a robust automation strategy that ensures a seamless experience for every user, every time.

If you want to know how we are achieving 99% accuracy in UI automation with Finalrun, read the following articles:

How We Set Out to Solve the XPath Problem in Mobile UI Test Automation

The future of UI Element Targetting: Finalrun Identifiers beats Xpath

Why LLMs Like ChatGPT, Gemini, and Claude Understand FinalRun Identifiers Better Than XPath

📅 Book a Demo
See how FinalRun fits into your existing workflow with a live Demo

End-to-End Testing for Mobile: Beginners Guide

Finalrun — Tue, 05 Aug 2025 07:27:31 GMT

End-to-end (E2E) testing is a software testing methodology that involves testing an application's workflow from beginning to end. It simulates real user scenarios to validate the system and its components for integration and data integrity. This guide will delve into the specifics of E2E testing for mobile applications, exploring a user flow from login to checkout, multi-app testing, and data validation using APIs and databases. We'll also touch upon the role of CI/CD in accelerating the testing and development lifecycle.

A User Flow in the App: From Login to Checkout

Imagine a retail mobile application. A typical user journey, or user flow, could be:

Login: The user launches the app and logs in with their credentials. The E2E test would verify that the app successfully authenticates the user and grants access to their account.
Search and Select: The user searches for a product, applies filters, and selects an item. The test would ensure that the search and filter functionalities work correctly and that the product details are displayed accurately.
Add to Cart: The user adds the selected item to their shopping cart. Here, the test validates that the cart is updated with the correct product and quantity.
Checkout: The user proceeds to checkout, enters their shipping and payment information, and confirms the order. The test would verify that the order is placed successfully and that the user receives a confirmation.

Throughout this flow, the E2E test simulates user interactions like tapping buttons, entering text, and swiping through screens to ensure a seamless experience.

Multiple App Testing in the Same Flow

In many modern service ecosystems, a single user transaction can span across multiple applications. For instance, consider a food delivery service which might involve a consumer app, a Point of Sale (POS) app at the restaurant, and a delivery app.

An E2E test for such a scenario would look like this:

A user places an order on the consumer app.
The E2E test would then need to verify that the order is received and displayed correctly on the restaurant's POS app.
Once the restaurant accepts the order, the test would check if a notification is sent to the delivery app, assigning a driver for the delivery.
The test can further track the order status on the consumer app as it changes from "Order Placed" to "In Transit" and finally "Delivered".

This type of testing is crucial to ensure that the entire ecosystem of apps works in harmony to deliver a smooth customer experience.

App with API or Database to Validate Data

To ensure data integrity across systems, E2E tests can be augmented with API and database validations. After a user completes a checkout on the consumer app, an automated test can make an API call to the backend server to verify that the order details (e.g., product ID, quantity, price, shipping address) are correctly stored in the database.

This approach combines the "black-box" testing of the UI with "white-box" testing of the backend, providing a more comprehensive validation of the system. It helps in identifying issues that might not be visible on the user interface, such as incorrect data storage or processing errors.
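
As a rough sketch of that idea, the check after a UI checkout might look like the following. Everything here is an assumption made for illustration: `fetch_order` stands in for a real API call or database query and is stubbed with an in-memory dictionary, and the field names simply mirror the examples listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    product_id: str
    quantity: int
    price: float
    shipping_address: str


# Stand-in for the backend. A real test would call the order API or query
# the database here; this dictionary and its contents are made up.
FAKE_BACKEND = {
    "order-1001": Order("sku-42", 2, 19.99, "221B Baker Street"),
}


def fetch_order(order_id: str) -> Order:
    """Hypothetical backend lookup, stubbed so the sketch runs offline."""
    return FAKE_BACKEND[order_id]


def find_mismatches(expected: Order, stored: Order) -> list[str]:
    """Return the names of fields whose stored value differs from what the UI showed."""
    return [
        name
        for name in ("product_id", "quantity", "price", "shipping_address")
        if getattr(expected, name) != getattr(stored, name)
    ]


# Values the UI test captured from the confirmation screen (illustrative).
shown_in_ui = Order("sku-42", 2, 19.99, "221B Baker Street")
mismatches = find_mismatches(shown_in_ui, fetch_order("order-1001"))
```

Returning the list of mismatched field names, rather than a single pass/fail flag, makes a failure report say which part of the stored order disagrees with what the user saw.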

CI/CD for Faster Iteration

Continuous Integration and Continuous Deployment (CI/CD) is a practice that automates the software development and release process. Integrating E2E tests into the CI/CD pipeline offers significant advantages:

Early Bug Detection: E2E tests can be configured to run automatically every time new code is committed. This helps in catching bugs early in the development cycle, reducing the cost and effort required for fixing them.
Faster Feedback Loop: Developers get quick feedback on their changes, allowing them to iterate faster and improve the quality of the code.
Increased Confidence in Releases: With a robust suite of automated E2E tests, the development team can have higher confidence in the stability and quality of the application, leading to smoother and more frequent releases.

By automating the entire process from code commit to testing and deployment, CI/CD empowers teams to deliver high-quality mobile applications at a faster pace.

In conclusion, a well-defined E2E testing strategy is indispensable for developing high-quality mobile applications. By simulating real user scenarios, testing across multiple apps, validating data through APIs and databases, and integrating tests into a CI/CD pipeline, development teams can ensure a seamless user experience, maintain data integrity, and accelerate their release cycles.

The Random Popup Problem: Why Your Mobile Tests Are Flaky

Finalrun — Tue, 29 Jul 2025 05:05:21 GMT

You've been there. You kick off your mobile test suite, confident that your carefully crafted steps will glide through the app. Then, a test fails. You dig into the logs, scroll through screenshots, and finally, there it is: a completely random popup – a location permission request, an app update notification, or maybe even a "rate us" prompt – that appeared out of nowhere and completely derailed your test.

Frustrating, right? It's like bringing a perfectly tuned race car to the track, only for a tumbleweed to roll across and stop the race. These "random popups" are the silent saboteurs of mobile test automation, causing flaky tests, wasted debugging hours, and eroding trust in your automation efforts.

The Problem: When Your Script Can't See the Unexpected

Traditional mobile test automation tools rely heavily on pre-defined scripts. You tell them, "Tap this button," "Enter text here," "Verify that element." This works great when the app behaves exactly as expected.

But mobile apps are dynamic. Developers want to ask for permissions, inform users about new features, or nudge them for reviews. These intentions often translate into popups that:

Block critical UI elements: Your test is trying to tap "Login," but an "Allow Location Access?" dialog is directly on top of it.
Change the screen state: The script expects a certain screen, but a popup appears, changing the element IDs or preventing interaction.
Are unpredictable: They might appear on the first run, but not the tenth, making tests frustratingly flaky.

Imagine trying to follow a recipe, but every few minutes someone throws a random ingredient at you and you have to pause, pick it up, and figure out if it belongs. That's what traditional automation tools feel like in the face of random popups.

Your traditional test automation tool trying to focus on the login button while a location popup demands attention.

The result? Tests fail not because of a bug in your app, but because your automation couldn't adapt to an unexpected, yet often legitimate, interruption. This leads to:

Increased debugging time: You spend hours trying to understand why a test failed, only to find it was a dismissible popup.
Flaky test suites: Tests pass sometimes and fail other times, making your CI/CD pipeline unreliable.
Reduced confidence: Teams start losing faith in the automation, leading to more manual testing.

The FinalRun Solution: AI-Powered Context Awareness

At FinalRun, we believe mobile tests should just work. They should be resilient, intelligent, and understand the context of the application, not just blindly follow a script. This is where our core AI capability comes into play.

Unlike traditional tools that simply follow a rigid script, FinalRun's AI is context-aware. When you define a test step – let's say, "Log in with phone number 9088989878" – the AI doesn't just look for the immediate login fields. It has a deeper understanding of the app's state and its ultimate goal.

Watch our quick demo of FinalRun's AI intelligently handling a location permission popup to ensure the test continues smoothly:

https://youtu.be/TETpYTd3A6E

Here's how it works at its core:

Intent-Driven Automation: You tell FinalRun what you want to achieve (e.g., "login," "add to cart," "checkout").
Continuous Observation: As the test runs, FinalRun's AI continuously observes the mobile device's screen.
Intelligent Blocker Detection: If the AI detects that the screen is not in the state expected for the current test step, and an element is blocking the intended interaction (like a popup), it doesn't just give up. It intelligently identifies these blockers.
Proactive Dismissal: The AI then determines the best way to dismiss this blocker. This could involve:
- Tapping "Don't Allow" or "Deny" for permission requests.
- Tapping "Skip," "Later," or "X" for update notifications or "Rate Us" prompts.
- Even navigating through onboarding screens if they unexpectedly reappear.

The key is this: FinalRun's AI will persist in dismissing any identified blockers until your original, intended test step can be successfully executed. It's like having a smart co-pilot that clears the path for your main mission, no matter what detours pop up.
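
The dismiss-then-retry behaviour described above can be sketched as a small loop. This is not FinalRun's implementation: `FakeScreen`, `run_step`, and the fixed list of dismiss labels are hypothetical stand-ins, and a real agent would decide what to tap from the screen itself rather than from a hard-coded list.

```python
from dataclasses import dataclass, field

# Labels the sketch treats as safe ways to close a blocker (taken from the
# examples above; a real agent would infer this from what is on screen).
DISMISS_LABELS = ("Don't Allow", "Deny", "Skip", "Later", "X")


@dataclass
class FakeScreen:
    """Toy stand-in for a device: popups stacked on top of the app's buttons."""
    app_buttons: set[str]
    popups: list[list[str]] = field(default_factory=list)  # top popup is last
    log: list[str] = field(default_factory=list)

    def visible_buttons(self) -> list[str]:
        return list(self.popups[-1]) if self.popups else sorted(self.app_buttons)

    def tap(self, label: str) -> None:
        self.log.append(label)
        if self.popups:
            self.popups.pop()  # tapping any popup button closes that popup


def run_step(screen: FakeScreen, target: str, max_dismissals: int = 5) -> bool:
    """Tap `target`, dismissing blockers first; give up after `max_dismissals`."""
    for _ in range(max_dismissals + 1):
        visible = screen.visible_buttons()
        if target in visible:
            screen.tap(target)
            return True
        dismiss = next((b for b in visible if b in DISMISS_LABELS), None)
        if dismiss is None:
            return False  # blocked by something the sketch cannot close
        screen.tap(dismiss)
    return False


screen = FakeScreen(
    app_buttons={"Login"},
    popups=[["Rate Us", "Later"], ["Allow", "Don't Allow"]],
)
step_ok = run_step(screen, "Login")
```

The `max_dismissals` cap is a design choice of this sketch, not something the article specifies: it keeps a test from looping forever if a popup keeps reappearing.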

What Does This Mean For You?

This intelligent handling of dynamic UI elements translates into significant benefits for your testing process:

Unprecedented Test Stability: Say goodbye to flaky tests caused by popups. Your tests become more reliable and consistent, giving you accurate feedback on your app's true quality.
Massive Time Savings: No more wasted hours debugging trivial popup issues. Your QA and development teams can focus on finding real bugs and building great features.
Faster Release Cycles: With more stable tests and less debugging, your CI/CD pipeline runs smoother, accelerating your time to market.
Comprehensive Test Coverage: You can automate more scenarios with confidence, knowing that unexpected UI elements won't block your progress.
Empowered Teams: Developers and QA engineers spend less time fighting with automation and more time innovating.

Get Started with Truly Resilient Mobile Automation

Stop letting random popups dictate the reliability of your mobile tests. FinalRun's AI is designed to understand, adapt, and intelligently clear the path for your automation, ensuring your tests just work.

Ready to experience the future of intelligent mobile testing?

Learn more about FinalRun's AI-powered automation and try it for free!

If you want to know how we are achieving 99% accuracy in UI automation with Finalrun, read the following articles:

How We Set Out to Solve the XPath Problem in Mobile UI Test Automation

The future of UI Element Targeting: Finalrun Identifiers beats Xpath

Why LLMs Like ChatGPT, Gemini, and Claude Understand FinalRun Identifiers Better Than XPath


Why LLMs Like ChatGPT, Gemini, and Claude Understand FinalRun Identifiers Better Than XPath

Finalrun — Fri, 18 Jul 2025 12:53:41 GMT

The software world is buzzing with the rise of Large Language Models (LLMs) like ChatGPT, Gemini, and Claude. These tools promise to generate code, automate complex workflows, and even write full-fledged test scripts, all from simple natural language instructions.

One of the most exciting frontiers for this technology is mobile test automation: telling an AI, “Tap the login button,” and letting it handle the rest.

But there's a problem, a very old one. It's called XPath.

While LLMs are capable of generating Appium test code, they often stumble when it comes to building reliable, accurate XPath locators. It’s not a flaw in the models; it’s a mismatch between how humans communicate and how XPath works.

That’s where FinalRun comes in. With a structured, human-readable format for identifying UI elements, FinalRun bridges the gap between intent and automation, unlocking a level of AI accuracy and resilience that XPath was never built to support.

The Core Problem: Why LLMs Struggle With XPath

To understand why FinalRun changes the game, let’s first look at what’s broken with XPath.

1. Brittleness and Strict Hierarchies

Even when LLMs are given screenshots, XPath still relies on rigid UI structure, not what’s visually rendered. A small layout change, like wrapping a button in a new container, can break the XPath even though nothing looks different on screen. LLMs can understand what the user wants, but XPath forces them to guess how the UI is built behind the scenes, which often leads to fragile and unreliable locators.

2. No Spatial or Relational Awareness

Consider this instruction:

“Click the trash icon to the right of ‘My Document.’”

XPath has no built-in way to understand "right of". The best it can offer is something like following-sibling::, which only checks if one node follows another in code, not on the screen. In responsive or dynamic UIs, that’s a dangerous assumption.
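
A small, made-up layout shows the gap between document order and screen position. The XML, its `x1`/`x2` attributes, and the element names are invented for this sketch. Because the XPath subset in Python's standard `xml.etree.ElementTree` has no `following-sibling::` axis, the sibling lookup is emulated with a list slice.

```python
import xml.etree.ElementTree as ET

# Made-up layout dump: x1/x2 are the left and right screen edges of each node.
LAYOUT = """
<row>
  <ImageView name="trash"  x1="300" x2="340"/>
  <TextView  text="My Document" x1="60" x2="280"/>
  <ImageView name="avatar" x1="10"  x2="50"/>
</row>
"""

row = ET.fromstring(LAYOUT)
children = list(row)
label = next(c for c in children if c.get("text") == "My Document")

# What following-sibling:: would see: the next ImageView in document order.
after_label = children[children.index(label) + 1:]
by_document_order = next(c for c in after_label if c.tag == "ImageView")

# What "to the right of" means on screen: starts at or past the label's right edge.
by_screen_position = next(
    c for c in children
    if c.tag == "ImageView" and int(c.get("x1")) >= int(label.get("x2"))
)
```

In this layout the sibling lookup lands on the avatar, which is drawn to the left of the label, while the coordinate check finds the trash icon the instruction meant.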

3. Cryptic and Unintuitive Syntax

XPath is a dense query language that even seasoned engineers struggle to write correctly. Now imagine asking an LLM to produce this:

//android.widget.LinearLayout[.//android.widget.TextView[@text='Submit'] and .//android.widget.Button]

One syntax error, one wrong assumption about the UI, and the test fails. XPath is fragile, hard to maintain, and deeply unfriendly to generative AI.

FinalRun Identifiers: Built for Humans and AI

FinalRun replaces XPath with declarative, structured identifiers written in JSON. Instead of guessing a path through the DOM, you describe the element by what it is and how it relates to other elements.

Here’s a simple example:

{
  "text": "Submit",
  "insideOf": { "id": "footer" }
}

It’s readable. Logical. And perfectly aligned with how LLMs think.

But FinalRun goes even further—with support for relational and spatial logic.

Supported properties:

text, id, type, accText (accessibility label)

Supported relationships:

insideOf, containsDescendants

Supported spatial logic:

rightOf, leftOf, topOf, bottomOf

This vocabulary allows an LLM to translate natural instructions directly into automation-ready identifiers—with no guesswork, and no brittle hierarchy dependencies.

Check out the documentation on element identifiers


Real-World Scenarios: FinalRun vs XPath in LLM Generation

🔍 Scenario 1: Spatial Relationship

Prompt: “Find the settings icon that is to the right of the ‘Profile’ label.”

❌ XPath:

//android.widget.TextView[@text='Profile']/following-sibling::android.widget.ImageView

Only works if both elements are siblings in the XML structure.
Breaks easily if layout changes.
Doesn’t guarantee visual “right of.”

✅ FinalRun:

{
  "type": "icon",
  "rightOf": { "text": "Profile" }
}

Mirrors user intent exactly.
Leverages actual screen coordinates, not code structure.
Simple and reliable for both humans and LLMs.

📦 Scenario 2: Containment Logic

Prompt: “Target the product card that contains both the text ‘Organic Bananas’ and a button with the text ‘Add to Cart.’”

❌ XPath:

//android.view.ViewGroup[.//android.widget.TextView[@text='Organic Bananas'] and .//android.widget.Button[@text='Add to Cart']]

Dense and unreadable.
Highly fragile—small DOM changes can break it.
Hard for LLMs to generate consistently.

✅ FinalRun:

{
  "containsDescendants": [
    { "text": "Organic Bananas" },
    { "text": "Add to Cart" }
  ]
}

Clean, expressive, and self-documenting.
Perfect 1:1 mapping with the user’s natural instruction.
LLMs excel at generating structured data like this.
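
To make the containment idea concrete, here is a minimal sketch of how a `containsDescendants`-style match could be evaluated over a UI tree. The `Node` class and the rule of returning only the innermost matching container are assumptions of this sketch; the article does not describe how FinalRun chooses between nested matches.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical UI tree node; real hierarchies carry many more attributes."""
    type: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def descendants(node):
    """Yield every node below `node`, at any depth."""
    for child in node.children:
        yield child
        yield from descendants(child)


def contains_all(node, wanted_texts):
    """True if `node` has a descendant carrying each of the wanted texts."""
    texts_below = {d.text for d in descendants(node)}
    return all(t in texts_below for t in wanted_texts)


def find_innermost_containers(root, wanted_texts):
    """Matching nodes that have no matching node beneath them."""
    return [
        n for n in [root, *descendants(root)]
        if contains_all(n, wanted_texts)
        and not any(contains_all(d, wanted_texts) for d in descendants(n))
    ]


bananas = Node("card", children=[
    Node("text", "Organic Bananas"),
    Node("row", children=[Node("button", "Add to Cart")]),
])
apples = Node("card", children=[Node("text", "Red Apples"), Node("button", "Add to Cart")])
screen = Node("list", children=[apples, bananas])

result = find_innermost_containers(screen, ["Organic Bananas", "Add to Cart"])
```

The button sits one level deeper in the bananas card than in the apples card, and the match still succeeds, because the check looks at descendants at any depth rather than at a fixed path.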

Why FinalRun Works So Well with LLMs

Semantic and Not Structural

LLMs work by understanding meaning, not memorizing code patterns. FinalRun identifiers describe what something is and how it relates to other things—not where it lives in a brittle hierarchy.

Uses JSON, LLMs' Native Language

LLMs are trained extensively on JSON, API specs, config files, and structured logs. JSON is predictable, easy to generate, and easy to validate.
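
The "easy to validate" point can be illustrated with a short checker that rejects generated identifiers using keys outside the vocabulary listed in this article. The value shapes (strings for properties, a nested identifier for relationships, a list for `containsDescendants`) are inferred from the examples above and are an assumption, not a published schema.

```python
import json

PROPERTIES = {"text", "id", "type", "accText"}
NESTED = {"insideOf", "rightOf", "leftOf", "topOf", "bottomOf"}  # value: one identifier
LISTS = {"containsDescendants"}  # value: list of identifiers


def validation_errors(identifier, path="$"):
    """Return a list of readable problems; an empty list means it looks valid."""
    if not isinstance(identifier, dict):
        return [f"{path}: expected an object"]
    errors = []
    for key, value in identifier.items():
        where = f"{path}.{key}"
        if key in PROPERTIES:
            if not isinstance(value, str):
                errors.append(f"{where}: expected a string")
        elif key in NESTED:
            errors.extend(validation_errors(value, where))
        elif key in LISTS:
            if not isinstance(value, list):
                errors.append(f"{where}: expected a list")
            else:
                for i, item in enumerate(value):
                    errors.extend(validation_errors(item, f"{where}[{i}]"))
        else:
            errors.append(f"{where}: unknown key")
    return errors


generated = json.loads('{"type": "icon", "rightOf": {"text": "Profile"}}')
typo = json.loads('{"type": "icon", "rightof": {"text": "Profile"}}')
```

A check like this can run before any device is touched, so a malformed identifier from a model is caught as a validation error rather than as a confusing test failure.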

Aligns With Human Thought

Users think in terms of relationships and meaning:

“The red button in the header,” not “/html/body/div[1]/button[3]”.

FinalRun gives LLMs a vocabulary to express this intent directly.

Enables Smart, Self-Healing Automation

Because FinalRun identifiers are expressive and spatially aware, LLMs (or test engines) can use alternate paths when one fails—making automation more resilient and adaptive.
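
One way to picture "alternate paths" is an ordered list of identifiers, tried from most to least specific. The sketch below assumes this fallback strategy for illustration; `resolve_with_fallbacks`, `toy_resolve`, and the toy screen data are hypothetical and do not describe FinalRun's engine.

```python
def resolve_with_fallbacks(identifiers, resolve):
    """Try each identifier in order; return (element, position) or (None, None)."""
    for position, identifier in enumerate(identifiers):
        element = resolve(identifier)
        if element is not None:
            return element, position
    return None, None


# Toy screen after a redesign: the "footer" container id no longer exists.
SCREEN = [
    {"text": "Submit", "type": "button", "container": "action_bar"},
    {"text": "Cancel", "type": "button", "container": "action_bar"},
]


def toy_resolve(identifier):
    """Hypothetical resolver: match on text and, if given, the container id."""
    for element in SCREEN:
        if element["text"] != identifier.get("text"):
            continue
        container = identifier.get("insideOf", {}).get("id")
        if container is None or container == element["container"]:
            return element
    return None


candidates = [
    {"text": "Submit", "insideOf": {"id": "footer"}},  # preferred, most specific
    {"text": "Submit"},                                # looser fallback
]
element, used = resolve_with_fallbacks(candidates, toy_resolve)
```

Returning the position of the identifier that matched lets a test report flag that a fallback was used, which is a hint that the preferred identifier should be updated.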

Conclusion: A New Foundation for AI-Driven Testing

The reason LLMs perform better with FinalRun isn’t about better AI—it’s about better design.

FinalRun identifiers are structured, expressive, and resilient. They speak the same language LLMs were trained on. XPath, in contrast, is a relic of a time before AI—built for rigid DOM traversal, not natural understanding.

If you’re building the future of test automation—where tests are written in English and powered by AI—then FinalRun identifiers are the bridge between intent and execution.

Read our story behind why we took this new approach to element identification:

How We Set Out to Solve the XPath Problem in Mobile UI Test Automation

The future of UI Element Targeting: Finalrun Identifiers beats Xpath


The Future of UI Element Targeting: Why FinalRun Beats XPath

Finalrun — Fri, 18 Jul 2025 09:32:16 GMT

Overview

Robust, maintainable UI automation begins with precise element identification. This documentation introduces FinalRun’s JSON-based identifier framework and explores its advantages compared to traditional XPath-based selectors commonly used with Appium. Emphasizing relational identifiers, it guides automation engineers to build resilient, clear, and scalable tests for dynamic interfaces.

1. Introduction to Element Identification

In automated testing, the method used to find elements determines the reliability and maintainability of test suites. Two dominant approaches are:

XPath Selectors (Appium): String-based queries that traverse the UI hierarchy.
FinalRun Identifiers: Declarative, JSON-based objects combining attributes and relationships.

2. XPath with Appium: Structure-Based Selection

XPath is a query language for targeting nodes in an XML document, mapped to UI element structures in Appium.

Strengths

Universal; works across many platforms.
Can express complex UI hierarchies.

Limitations

Fragile: Breaks with minor UI changes.
Difficult Maintenance: Often requires updates after UI refactoring.
Cryptic Syntax: Not intuitive; error-prone string manipulation.
Performance: Deep or complex queries can slow tests.

3. FinalRun Identifiers: Attribute and Relationship-Based Targeting

FinalRun introduces a modern, JSON-driven approach, focusing on both the properties and relationships of UI elements.

Properties

id: Unique resource name.
text: Displayed text.
type: Element type (e.g., image).
accText: Accessibility label.

Relationships (Relational Identifiers) — The Game Changer

insideOf: Specifies a parent container.
containsDescendants: Element must have specified children.
rightOf, leftOf, topOf, bottomOf: Spatial positioning, which XPath cannot express.
index: 1-based (selects the nth match within context).

Example Identifier:

{
  "text": "Submit",
  "insideOf": { "id": "footer" }
}

This identifies the element with the text “Submit” inside the footer container. It keeps working through UI changes as long as the Submit text stays inside that container.
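
As an illustration only, the following sketch resolves an identifier with `insideOf` and a 1-based `index` against a flat, made-up element list. The data layout, the `resolve` function, and the choice to default to the first match when no index is given are all assumptions; only `id` containers are handled, to keep the example short.

```python
# Made-up screen: each element names its parent container by id.
ELEMENTS = [
    {"id": "header", "parent": None},
    {"id": "footer", "parent": None},
    {"id": "btn_a", "text": "Submit", "parent": "header"},
    {"id": "btn_b", "text": "Submit", "parent": "footer"},
    {"id": "btn_c", "text": "Submit", "parent": "footer"},
]
BY_ID = {e["id"]: e for e in ELEMENTS}


def is_inside(element, container_id):
    """Walk up the parent chain looking for the container."""
    parent = element["parent"]
    while parent is not None:
        if parent == container_id:
            return True
        parent = BY_ID[parent]["parent"]
    return False


def resolve(identifier):
    """Filter by properties and insideOf, then pick the nth match (1-based)."""
    matches = [
        e for e in ELEMENTS
        if all(
            e.get(key) == identifier[key]
            for key in ("id", "text", "type", "accText")
            if key in identifier
        )
        and ("insideOf" not in identifier or is_inside(e, identifier["insideOf"]["id"]))
    ]
    index = identifier.get("index", 1)  # assumption: default to the first match
    return matches[index - 1] if 1 <= index <= len(matches) else None


first = resolve({"text": "Submit", "insideOf": {"id": "footer"}})
second = resolve({"text": "Submit", "insideOf": {"id": "footer"}, "index": 2})
```

The header's Submit button never enters the match list, so the index only has to distinguish the two buttons that really are in the footer.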

Check out the detailed documentation: https://finalrun.gitbook.io/finalrun/advanced/element-identifier

4. What Makes Relational Identifiers Special?

Contextual Targeting: Define elements by their relation to others, not just absolute position in the hierarchy.
Resiliency: Less impacted by UI layout changes.
Clarity: Self-documenting JSON; easily readable/reusable across teams.
Disambiguation: Combine relationships to uniquely identify elements in complex views.


5. Practical Scenarios

Scenario 1: Spatial Relationship (Icon Right of “Settings” Text)

UI layout often positions elements side-by-side. For example, an icon may appear visually to the right of a “Settings” label. This spatial relationship is critical but not captured by hierarchy alone.

XPath (closest approximation; a true spatial match is not possible):

//android.widget.TextView[@text='Settings']/following-sibling::android.widget.ImageView

Limitation with XPath: This only selects the image that comes after “Settings” in the view hierarchy, not necessarily the one to its right on screen.

FinalRun:

{
  "type": "icon",
  "rightOf": { "text": "Settings" }
}

FinalRun uses actual screen coordinates to resolve spatial relationships, ensuring the icon is genuinely to the right of the “Settings” label — regardless of their order in the XML layout. XPath cannot guarantee this and often fails when layout shifts or rendering engines change order independently of the DOM.
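
A coordinate-based `rightOf` check can be sketched with plain bounding boxes. The `Box` class, the pixel values, and the rule of picking the nearest element on the same row are assumptions made for this example; the article does not spell out how FinalRun ranks several candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical on-screen element with pixel bounds."""
    name: str
    kind: str
    left: int
    top: int
    right: int
    bottom: int


def overlaps_vertically(a: Box, b: Box) -> bool:
    """True if the two boxes share some vertical range, i.e. sit on the same row."""
    return a.top < b.bottom and b.top < a.bottom


def right_of(anchor: Box, candidates: list[Box], kind: str):
    """Nearest element of `kind` that starts past the anchor's right edge on its row."""
    on_row = [
        c for c in candidates
        if c.kind == kind and c.left >= anchor.right and overlaps_vertically(c, anchor)
    ]
    return min(on_row, key=lambda c: c.left - anchor.right, default=None)


label = Box("Settings", "text", left=20, top=100, right=160, bottom=140)
boxes = [
    Box("back_arrow", "icon", left=0, top=0, right=40, bottom=40),
    Box("chevron", "icon", left=300, top=105, right=330, bottom=135),
    Box("gear", "icon", left=180, top=105, right=210, bottom=135),
    Box("help", "icon", left=180, top=200, right=210, bottom=230),
]
found = right_of(label, boxes, "icon")
```

The order of `boxes` plays no part in the result: the help icon is skipped because it sits on a different row, and the gear wins over the chevron because it is closer to the label.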

Scenario 2: Parent Element Containing Specific Children

Let’s say you want to identify a section that contains both a “Title” text and a “Continue” button.

XPath:

//android.view.ViewGroup[.//android.widget.TextView[@text='Title'] and .//android.widget.Button[@text='Continue']]

FinalRun:

{
  "containsDescendants": [
    { "text": "Title" },
    { "text": "Continue" }
  ]
}

FinalRun expresses the same logic in a much clearer, structured form — greatly improving readability and making the intent explicit. XPath handles this by nesting logical conditions in a dense string expression that is harder to maintain.

6. Best Practices with FinalRun Identifiers

Prioritize unique attributes (id, text, accText, type).
Add relational context (insideOf, leftOf, rightOf, topOf, bottomOf) to eliminate ambiguity.
Use indexing judiciously, only when necessary.

7. Conclusion

FinalRun’s relational, spatial and property-driven identifiers significantly advance UI automation capabilities beyond XPath’s positional approach. By blending clear structure, human readability, and robust context-awareness, FinalRun maximizes both test reliability and developer productivity for today’s fast-moving interfaces.

Embrace relational identifiers to unlock a modern, resilient, and scalable automation strategy.

Read our story behind why we took this new approach to element identification:

How We Set Out to Solve the XPath Problem in Mobile UI Test Automation

Why LLMs Like ChatGPT, Gemini, and Claude Understand FinalRun Identifiers Better Than XPath


Quest to solve the XPath Problem in Mobile Test Automation

Finalrun — Fri, 18 Jul 2025 09:26:36 GMT

I’m a mobile developer with over 10 years of experience. I’ve led engineering teams at Appunfold, which was acquired by UserIQ, and later at Jiny, which was acquired by Whatfix. In both journeys, I saw firsthand how much pressure teams face to ship high-quality mobile apps across multiple platforms — Native Android, iOS, React Native, Xamarin, Ionic — you name it.

My co-founder Ashish brings a unique perspective to this problem. Back in 2017, he was working at SmartBeings, a startup building voice-first smart devices, much like Amazon Alexa. Building futuristic consumer tech at that scale exposed him early to the complexities of UI interaction and the brittleness of automation workflows across hardware and software boundaries.

Together, we saw the same pattern repeat: Testing slowed down development.

We invested in Appium, set up test frameworks, and wrote automation scripts to validate features across all these platforms. But even with all that effort, every small UI change meant hours of rewriting and debugging brittle XPath selectors. Every release introduced breakages. Every redesign threw test stability out the window.

Our automation became a burden rather than an asset. We knew this wasn’t scalable. And worse, skipping those fixes meant shipping bugs to production. That’s when we asked ourselves: Is this just our problem, or does the industry feel the same?

We decided to reach out and speak to 50+ people in the mobile testing and QA space. What we learned shaped everything that followed.

This blog is not about criticizing XPath as a technology. It’s about understanding why XPath, in the way it’s used today in mobile testing, causes instability, pain, and technical debt. More importantly, it’s about how these deep-rooted problems shaped the origin of Finalrun.

1. Brittle and Unstable XPaths: Death by UI Change

This was the most common pain point we heard. Every UI update — big or small — has the potential to break a test. That’s because XPath selectors are tightly coupled to the app’s structure, not its intent.

“DOM structure changes, XPath changes” — Jyothi Yadav, QA Lead at Thomson Reuters

“Flaky Tests. XPath and locator issues. Test maintenance.” — Piyush Sharma, Lead SDET at Gojek

“Locator changes due to UI restructuring frequently break existing tests.” — Anurag Sinha, QA Manager at Hotstar

When tests fail after every sprint, confidence in automation drops. Teams stop trusting their pipelines. Manual QA fills the gap. And automation becomes something people stop investing in.

2. Dynamic UIs Break XPath

Modern mobile apps are highly dynamic — elements appear or shift based on scroll position, data state, and user interaction. XPath expressions, which rely on static structure, don’t survive in this environment.

“Duplicate XPaths: Multiple elements with similar identifiers cause random clicks.” — Sharathchandra Singireddy, SDET at MPL

“Scroll handling and element identification on long, dynamic screens are major pain points.” — Avanti, QA Manager at LeapFinance

“Struggle with locator stability, especially for icon-only buttons or SVG elements.” — Punit, QA Lead at Flobiz

Testers described scenarios where XPath pointed to the wrong icon after layout changes, or missed interactable elements because they hadn’t yet been rendered or scrolled into view.

3. No Good Locators? XPath Becomes the Default Crutch

We discovered that teams don’t choose XPath out of preference; it’s often the fallback when developers don’t expose testable attributes like IDs or content descriptions.

“No ID in web-based UI element — will be difficult to test.” — Ramesh Hosamani, QA Principle Lead, Akamai Technologies

In other words, XPath isn’t the root problem; it’s a symptom of a deeper collaboration gap between dev and QA. Without clean, unique locators, testers are forced to reverse-engineer brittle paths through the UI.

These aren’t just minor inconveniences; they’re structural problems. When locators break, trust in automation breaks. When automation becomes hard to debug, it gets deprioritized. And when test maintenance takes longer than writing the feature, teams stop scaling.

This is why we started Finalrun, not to patch over these issues, but to build a fundamentally better foundation for UI automation. One that makes tests:

Predictable across app updates
Easy to write and debug
Consistent across iOS and Android

We believe that automation should mirror how users actually see and interact with the UI — not how the view hierarchy happens to be structured at runtime.

If you want to know how we are achieving 99% accuracy in UI automation with Finalrun, read the following articles:

The future of UI Element Targeting: Finalrun Identifiers beats Xpath

Why LLMs Like ChatGPT, Gemini, and Claude Understand FinalRun Identifiers Better Than XPath

