Write a PREreview

UI-OCEANUS: Scaling GUI Agents with Synthetic Environmental Dynamics

Posted
Server: Preprints.org
DOI: 10.20944/preprints202603.0980.v1

Scaling generalist GUI agents is hindered by the data scalability bottleneck of expensive human demonstrations and the "distillation ceiling" of synthetic teacher supervision. To transcend these limitations, we propose UI-Oceanus, a framework that shifts the learning focus from mimicking high-level trajectories to mastering interaction physics via ground-truth environmental feedback. Through a systematic investigation of self-supervised objectives, we identify that forward dynamics, defined as the generative prediction of future interface states, acts as the primary driver for scalability and significantly outweighs inverse inference. UI-Oceanus leverages this insight by converting low-cost autonomous exploration, which is verified directly by system execution, into high-density generative supervision to construct a robust internal world model. Experimental evaluations across a series of models demonstrate the decisive superiority of our approach: models utilizing Continual Pre-Training (CPT) on synthetic dynamics outperform non-CPT baselines with an average success rate improvement of 7% on offline benchmarks, which amplifies to a 16.8% gain in real-world online navigation. Furthermore, we observe that navigation performance scales with synthetic data volume. These results confirm that grounding agents in forward predictive modeling offers a superior pathway to scalable GUI automation with robust cross-domain adaptability and compositional generalization.

You can write a PREreview of UI-OCEANUS: Scaling GUI Agents with Synthetic Environmental Dynamics. A PREreview is a review of a preprint and can vary from a few sentences to a lengthy report, similar to a journal-organized peer-review report.
