How noqa agent works

Perception → Decision → Action

The noqa agent operates entirely from screenshots — it doesn’t inspect view hierarchies, source code, or accessibility trees. On every step:

Perceive — capture a screenshot and analyze what’s visible: buttons, text, images, current state.

Decide — determine the next action to take toward the test goal.

Act — execute the action and observe the result.

This loop repeats until the goal is reached or the agent determines the test has failed.
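The loop above can be sketched in a few lines of Python. This is an illustration only, not noqa's implementation: `run_test`, the `capture`/`decide`/`act` callables, the `"done"`/`"failed"` return values, and the `max_actions` cap are all hypothetical names chosen for the example.

```python
from typing import Callable, Tuple


def run_test(
    goal: str,
    capture: Callable[[], str],
    decide: Callable[[str, str], str],
    act: Callable[[str], None],
    max_actions: int = 100,  # assumed budget, not a documented noqa setting
) -> Tuple[str, int]:
    """Repeat perceive -> decide -> act; return (outcome, actions_taken)."""
    for actions_taken in range(max_actions):
        screenshot = capture()               # Perceive: look at the screen
        decision = decide(goal, screenshot)  # Decide: next action or a verdict
        if decision in ("done", "failed"):
            return decision, actions_taken
        act(decision)                        # Act: perform the chosen action
    # Assumption: running out of budget is treated as a failure.
    return "failed", max_actions


# Stubbed usage: three fake "screens" stand in for real screenshots.
screens = iter(["welcome", "signup", "home"])
performed = []
outcome = run_test(
    goal="Reach the home screen",
    capture=lambda: next(screens),
    decide=lambda goal, shot: "done" if shot == "home" else "tap Continue",
    act=performed.append,
)
```

With these stubs the loop performs two taps and stops on the third screenshot, once the goal is visible.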

Available actions

Action	Description
Tap	Tap any visible element — buttons, links, list items; supports double tap and long tap
Swipe	Swipe in any direction to navigate or dismiss dialogs
Drag	Drag an element from one position to another
Pinch	Two-finger pinch to zoom (maps, images, camera) or resize a selected object (sticker, text, shape): zoom in to enlarge, zoom out to shrink
Input text	Clear a field and type the specified text
Open link / deeplink	Navigate directly to a URL or in-app deeplink
Terminate app	Force-stop the app
Background app	Send the app to the background
Activate app	Bring the app back to the foreground

All actions work across the entire device — not just inside the app. The agent can interact with system dialogs, purchases, notifications, the home screen, and third-party apps.
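One way to picture the action set is as a small data model. The sketch below is hypothetical: `ActionType`, `Action`, `describe`, and the parameter names are invented for illustration and are not part of any noqa API.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActionType(Enum):
    # One member per row of the actions table above.
    TAP = "tap"
    SWIPE = "swipe"
    DRAG = "drag"
    PINCH = "pinch"
    INPUT_TEXT = "input_text"
    OPEN_LINK = "open_link"
    TERMINATE_APP = "terminate_app"
    BACKGROUND_APP = "background_app"
    ACTIVATE_APP = "activate_app"


@dataclass
class Action:
    type: ActionType
    params: dict = field(default_factory=dict)  # hypothetical free-form parameters


def describe(action: Action) -> str:
    """Render an action as a short log line."""
    if not action.params:
        return action.type.value
    details = ", ".join(f"{k}={v!r}" for k, v in action.params.items())
    return f"{action.type.value}({details})"


# Example sequence: type into a field, background the app, bring it back.
steps = [
    Action(ActionType.INPUT_TEXT, {"text": "hello"}),
    Action(ActionType.BACKGROUND_APP),
    Action(ActionType.ACTIVATE_APP),
]
log = [describe(step) for step in steps]
```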

Understanding instructions

The agent understands natural language — you don’t need to describe every tap and swipe. Write at the goal level and the agent figures out the individual steps. For example, if your app has a long onboarding with 20–30 screens but each one has just one or two buttons, you can simply write “Complete onboarding” — the agent will go through all the screens on its own without any further instructions.

Learn how to write effective test cases in Writing Good Test Cases.

Speed

On the first run, the agent takes 5–10 seconds per action because it needs to analyze the screen and reason about the next step. On subsequent runs, this drops to 1–2 seconds per action for screens similar to ones the agent has already seen.

In practice, the agent executes most test flows faster than a manual tester would.
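As a rough back-of-the-envelope estimate, the per-action figures above translate into total run time as follows. The function name and the 30-action flow are made up for the example, and the estimate ignores app load times and anything else outside the agent's own steps.

```python
from typing import Tuple

# Per-action ranges in seconds, taken from the figures quoted above.
FIRST_RUN = (5, 10)
REPEAT_RUN = (1, 2)


def estimate_duration(actions: int, seconds_per_action: Tuple[int, int]) -> Tuple[int, int]:
    """Return the (low, high) estimate in seconds for a flow of `actions` steps."""
    low, high = seconds_per_action
    return actions * low, actions * high


# A hypothetical 30-action flow.
first = estimate_duration(30, FIRST_RUN)    # (150, 300) seconds
repeat = estimate_duration(30, REPEAT_RUN)  # (30, 60) seconds
```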

Limitations

Timing — the agent takes a screenshot, analyzes it, then acts. If something happens fast — a toast, a transition, a brief loading state — the agent may miss it or not react in time.
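The timing limitation can be illustrated with a simplified model in which screenshots are taken at a fixed interval. That fixed interval is an assumption made for the sketch (real step times vary), and `is_captured` is a hypothetical helper, not part of noqa.

```python
def is_captured(
    event_start: float,
    event_duration: float,
    step_interval: float,
    first_capture: float = 0.0,
) -> bool:
    """Return True if any screenshot lands while the event is on screen."""
    event_end = event_start + event_duration
    t = first_capture
    while t < event_end:
        if t >= event_start:
            return True  # a screenshot falls inside [event_start, event_end)
        t += step_interval
    return False


# A 2-second toast with screenshots every 5 seconds (at t = 0, 5, 10, ...):
missed = is_captured(event_start=6.0, event_duration=2.0, step_interval=5.0)  # False
seen = is_captured(event_start=4.0, event_duration=2.0, step_interval=5.0)    # True
```

In this model, any event that lasts at least as long as the interval between screenshots is always captured, while a shorter one is seen only if a screenshot happens to fall inside it.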

Test length — we recommend keeping tests under 100 actions. Longer tests increase the chance of the agent losing context and making incorrect decisions.

Device type — available testing capabilities depend on whether you’re using a local device, simulator, or cloud device. See Local vs Cloud for details.

Platform — cloud tests can be started from the web dashboard. Local tests require a Mac with the noqa desktop app installed.
