🏆 FinalRun Becomes the New #1 SOTA Mobile Agent, Topping the Benchmark at 97.4%

Published
•4 min read

We are thrilled to announce FinalRun, a revolutionary Android UI agent that has officially secured the #1 spot on the leaderboard with an unprecedented 97.4% success rate. FinalRun not only achieves the highest accuracy ever recorded, it is also engineered from the ground up to be faster and significantly cheaper than any other agent currently on the benchmark list.

By rethinking how AI agents perceive UI hierarchies, handle real-time states, and process errors, FinalRun sets a new gold standard for autonomous mobile execution.


The Race for the Ultimate UI Agent

Building an AI agent that can reliably navigate the dynamic, asynchronous environment of a mobile operating system is notoriously difficult. Traditional agents stumble on UI animations, hallucinate successful actions, and incur massive inference costs due to bloated observation spaces.

With FinalRun, we completely reimagined the perception-action loop. We focused not just on reasoning capability, but on execution reliability and cost efficiency. The result is a staggering 97.4% task completion rate from an agent that runs faster and cheaper than the previous state of the art.

Here are the four core breakthroughs that power FinalRun:

1. ⚡ Optimized Hierarchy

The biggest bottleneck for mobile agents is the massive context window required to read Android UI trees (XML). FinalRun utilizes a highly Optimized UI Hierarchy, aggressively pruning irrelevant nodes.
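To make the pruning idea concrete, here is a minimal sketch in Python. It is not FinalRun's actual pruning logic, which has not been published; the rule used here (keep a node only if it has a label or is interactive, and hoist the children of empty containers) is an assumption chosen for illustration. The attribute names follow the XML that Android's `uiautomator dump` produces, and the helper names `is_informative` and `prune` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed heuristic: a node matters if it is labelled or can be interacted with.
INTERACTIVE_ATTRS = ("clickable", "long-clickable", "scrollable", "checkable")
KEPT_ATTRS = ("text", "content-desc", "resource-id", "bounds")


def is_informative(node):
    """Return True if the node carries a label or accepts interaction."""
    if node.get("text") or node.get("content-desc"):
        return True
    return any(node.get(attr) == "true" for attr in INTERACTIVE_ATTRS)


def prune(node):
    """Return compact dicts for the informative nodes at or below `node`.

    Uninformative containers are dropped and their surviving
    children are hoisted up one level.
    """
    kept_children = []
    for child in node:
        kept_children.extend(prune(child))
    if node.tag == "node" and is_informative(node):
        entry = {k: node.get(k) for k in KEPT_ATTRS if node.get(k)}
        entry["class"] = node.get("class", "").rsplit(".", 1)[-1]
        if kept_children:
            entry["children"] = kept_children
        return [entry]
    return kept_children


SAMPLE = """
<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" text="" content-desc="" clickable="false" bounds="[0,0][1080,1920]">
    <node class="android.widget.LinearLayout" text="" content-desc="" clickable="false" bounds="[0,0][1080,400]">
      <node class="android.widget.TextView" text="Settings" content-desc="" clickable="false" bounds="[40,80][400,160]"/>
      <node class="android.widget.ImageView" text="" content-desc="" clickable="false" bounds="[0,0][1080,400]"/>
    </node>
    <node class="android.widget.Button" text="" content-desc="Search" clickable="true" bounds="[900,80][1040,160]"/>
  </node>
</hierarchy>
"""

pruned = prune(ET.fromstring(SAMPLE.strip()))
for entry in pruned:
    print(entry)
```

In this sample, five nodes collapse to two: the "Settings" label and the "Search" button. The two layout containers and the unlabelled image are dropped, which is the kind of reduction that shrinks the prompt sent to the model.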

  • Cheaper & Smarter: By pairing this lightweight hierarchy with Gemini Flash and Flash Lite, we drastically reduced token consumption. The cost per task is a fraction of what competing agents require.

  • Blazing Fast: Powered by Gemini Flash and Gemini Flash Lite.

  • Image Stabilization Check: Speed means nothing if the agent clicks while the screen is still loading. FinalRun introduces a deterministic stabilization check—requiring at least 3 consecutive stable image frames—before concluding the UI has settled. This entirely eliminates misclicks caused by mid-animation transitions.
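The stabilization check described in the last bullet can be sketched as follows. This is an illustrative version under stated assumptions, not FinalRun's implementation: frames are compared by exact byte equality (a real check might tolerate small pixel differences), and the function name `wait_until_stable` and its parameters `capture`, `max_frames`, and `interval_s` are hypothetical. Only the threshold of 3 consecutive stable frames comes from the description above.

```python
import time
from typing import Callable, Optional


def wait_until_stable(
    capture: Callable[[], bytes],
    required_stable: int = 3,
    max_frames: int = 30,
    interval_s: float = 0.1,
) -> Optional[bytes]:
    """Return the settled frame once `required_stable` consecutive
    captures are identical, or None if the screen never settles."""
    previous = None
    streak = 0
    for _ in range(max_frames):
        frame = capture()
        # Extend the streak on a match, otherwise restart it at this frame.
        streak = streak + 1 if frame == previous else 1
        previous = frame
        if streak >= required_stable:
            return frame
        time.sleep(interval_s)
    return None


# Simulated capture: two animation frames, then the settled home screen.
frames = iter([b"loading-1", b"loading-2", b"home", b"home", b"home", b"home"])
settled = wait_until_stable(lambda: next(frames), interval_s=0.0)
print(settled)
```

The `max_frames` cap is a design choice added here so that a screen with a perpetual animation cannot block the agent forever; the caller decides what to do when `None` comes back.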

2. 📝 Native Text Editing Mastery

Manipulating text fields is a common failure point for multimodal LLMs (MLLMs), which often try to backspace indefinitely or struggle with cursor placement. FinalRun features a dedicated Text Editing Agent that interacts with the OS the same way a human power user does. When specific text needs to be deleted, the agent executes a Long Press to trigger the Android system’s native context menu, then uses its default options to "Select All" and "Delete". This native approach guarantees 100% precision in text manipulation, avoiding the erratic behavior seen in previous SOTA agents.
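As a rough illustration, the deletion sequence can be written down as a short, fixed action plan. The `Action` type, the action kinds, and `clear_text_field_plan` are hypothetical names invented for this sketch; how FinalRun actually represents and dispatches actions is not described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Action:
    kind: str            # hypothetical action name understood by an executor
    arg: object = None   # coordinates or a menu label, depending on `kind`


def clear_text_field_plan(field_center: Tuple[int, int]) -> List[Action]:
    """Build the three-step plan: long press, Select All, Delete."""
    return [
        Action("long_press", field_center),      # opens the native context menu
        Action("tap_menu_item", "Select All"),   # selects the whole field
        Action("tap_menu_item", "Delete"),       # removes the selection
    ]


plan = clear_text_field_plan((540, 820))
for step in plan:
    print(step.kind, step.arg)
```

The point of the fixed plan is that its length does not depend on how much text is in the field, unlike a backspace loop that must guess how many key presses are needed.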

3. 🔄 Closed-Loop Failure Feedback

A major flaw in traditional UI agents is the "blind assumption of success": the agent issues a click command and simply assumes the task moved forward. FinalRun introduces a strict Failure Feedback Loop. If a grounding action (e.g., attempting to tap a coordinate or element) fails at the system level, this failure is immediately fed back into the agent's context. The agent knows exactly why the action failed, which prevents it from hallucinating task completion and allows it to instantly recalculate a new path to success.
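A minimal sketch of such a loop is shown below. Everything here is an assumption made for illustration: `ActionResult`, `AgentContext`, and `execute_with_feedback` are hypothetical names, and the agent's context is modelled as a plain list of strings rather than whatever structure FinalRun uses internally.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ActionResult:
    ok: bool
    error: str = ""


@dataclass
class AgentContext:
    # Stand-in for the text the model sees on its next step.
    history: List[str] = field(default_factory=list)


def execute_with_feedback(
    action: str,
    perform: Callable[[str], ActionResult],
    context: AgentContext,
) -> ActionResult:
    """Run one action and record its real outcome, success or failure."""
    result = perform(action)
    if result.ok:
        context.history.append(f"OK: {action}")
    else:
        # The failure reason goes into the context instead of being dropped.
        context.history.append(f"FAILED: {action} ({result.error})")
    return result


def fake_perform(action: str) -> ActionResult:
    """Simulated device layer: one element is missing from the screen."""
    if action == "tap id=submit":
        return ActionResult(ok=False, error="element not found")
    return ActionResult(ok=True)


context = AgentContext()
execute_with_feedback("tap id=login", fake_perform, context)
execute_with_feedback("tap id=submit", fake_perform, context)
print("\n".join(context.history))
```

Because the failed tap appears in `context.history` with its reason, the next planning step can pick a different route rather than continuing as if the tap had landed.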

4. 📸 Just-In-Time (JIT) Grounding

Mobile UIs are highly volatile; pop-ups, notifications, or network-delayed renders can change the screen state in milliseconds. Traditional agents evaluate a screenshot, spend seconds "thinking," and then execute a click on a screen that may have already changed. FinalRun solves this race condition by capturing a fresh screenshot immediately before performing any grounding action. This ensures that all coordinates and actions are executed against the absolute latest UI state. This Just-In-Time perception catches errors sooner, prevents catastrophic misclicks, and ensures a truly real-time interaction loop.
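The ordering is what matters, and a small sketch makes it visible. This is a simplified model rather than FinalRun's code: a "screenshot" is represented as a dict mapping element labels to coordinates, and `act_just_in_time`, `capture`, `locate`, and `tap` are hypothetical names standing in for the real capture, grounding, and input layers.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[int, int]


def act_just_in_time(
    target: str,
    capture: Callable[[], dict],
    locate: Callable[[dict, str], Optional[Point]],
    tap: Callable[[Point], None],
) -> bool:
    """Ground `target` on a screenshot taken right before acting."""
    fresh = capture()              # not the frame the planner reasoned over
    point = locate(fresh, target)
    if point is None:
        return False               # target vanished; report instead of guessing
    tap(point)
    return True


# Simulated screens: a dialog appears while the model is "thinking".
screens = iter([
    {"OK": (100, 200)},                       # what the planner saw
    {"OK": (100, 600), "Allow": (100, 200)},  # layout after the dialog opened
])
planned_view = next(screens)

taps = []
done = act_just_in_time(
    "OK",
    capture=lambda: next(screens),
    locate=lambda screen, label: screen.get(label),
    tap=taps.append,
)
print(done, taps)
```

Had the agent reused the coordinates from `planned_view`, it would have tapped (100, 200), which now belongs to the "Allow" button. Grounding on the fresh frame sends the tap to (100, 600) instead.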


The Future of Mobile Automation is Here

Achieving 97.4% on complex mobile benchmarks proves that reliable, production-ready mobile AI is no longer a theoretical concept. By combining the raw speed of Gemini Flash, an optimized UI hierarchy, and robust, human-like execution safeguards, FinalRun is unequivocally the fastest, cheapest, and most capable Android agent in the world.

Welcome to the new SOTA.

We’re open-sourcing the agent soon.