Perception → Decision → Action

The noqa agent operates entirely from screenshots — it doesn’t inspect view hierarchies, source code, or accessibility trees. On every step:
  1. Perceive — capture a screenshot and analyze what’s visible: buttons, text, images, current state.
  2. Decide — determine the next action to take toward the test goal.
  3. Act — execute the action and observe the result.
This loop repeats until the goal is reached or the agent determines the test has failed.
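The loop above can be sketched roughly as follows. This is an illustrative outline only, not noqa's implementation: `run_test`, `Decision`, the three callables, and the `max_actions` default (borrowed from the recommended test length) are all hypothetical names and assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    """Outcome of the decide step (hypothetical structure)."""
    status: str                   # "continue", "passed", or "failed"
    action: Optional[str] = None  # next action to execute, if any


def run_test(
    goal: str,
    perceive: Callable[[], str],
    decide: Callable[[str, str], Decision],
    act: Callable[[str], None],
    max_actions: int = 100,
) -> str:
    """Repeat perceive -> decide -> act until the goal is reached or the test fails."""
    for _ in range(max_actions):
        screen = perceive()              # 1. Perceive: capture and analyze a screenshot
        decision = decide(goal, screen)  # 2. Decide: pick the next action toward the goal
        if decision.status != "continue":
            return decision.status       # goal reached or test judged failed
        act(decision.action)             # 3. Act: the result is observed on the next perceive
    return "failed"                      # action budget exhausted (assumed behaviour)


# Toy stand-ins: a three-screen flow where each tap advances one screen.
screens = ["welcome", "permissions", "home"]
position = 0


def perceive() -> str:
    return screens[position]


def decide(goal: str, screen: str) -> Decision:
    if screen == "home":
        return Decision("passed")
    return Decision("continue", "tap Continue")


def act(action: str) -> None:
    global position
    position += 1


print(run_test("Complete onboarding", perceive, decide, act))  # passed
```

Passing the three steps in as callables keeps the sketch independent of how screenshots are actually captured or analyzed.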

Available actions

Action | Description
Tap | Tap any visible element (buttons, links, list items); supports double tap and long tap
Swipe | Swipe in any direction to navigate or dismiss dialogs
Drag | Drag an element from one position to another
Input text | Clear a field and type the specified text
Open link / deeplink | Navigate directly to a URL or in-app deeplink
Terminate app | Force-stop the app
Background app | Send the app to the background
Activate app | Bring the app back to the foreground
All actions work across the entire device — not just inside the app. The agent can interact with system dialogs, purchases, notifications, the home screen, and third-party apps.
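One way to picture this action vocabulary is as a small data model. The sketch below is purely illustrative: `ActionType`, `Action`, the parameter keys, and the `myapp://settings` deeplink are hypothetical and do not reflect noqa's actual API or internal representation.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActionType(Enum):
    """The eight action kinds listed above (names are illustrative)."""
    TAP = "tap"
    SWIPE = "swipe"
    DRAG = "drag"
    INPUT_TEXT = "input_text"
    OPEN_LINK = "open_link"
    TERMINATE_APP = "terminate_app"
    BACKGROUND_APP = "background_app"
    ACTIVATE_APP = "activate_app"


@dataclass
class Action:
    type: ActionType
    params: dict = field(default_factory=dict)  # hypothetical per-action parameters


# A short sequence the agent might choose for a settings-related goal.
steps = [
    Action(ActionType.OPEN_LINK, {"url": "myapp://settings"}),
    Action(ActionType.TAP, {"target": "Notifications", "variant": "long"}),
    Action(ActionType.INPUT_TEXT, {"target": "Search", "text": "sound"}),
    Action(ActionType.BACKGROUND_APP),
    Action(ActionType.ACTIVATE_APP),
]

print([step.type.value for step in steps])
```

In practice you describe the goal in natural language and the agent selects actions like these itself; the model is only meant to show how small the action set is.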

Understanding instructions

The agent understands natural language — you don’t need to describe every tap and swipe. Write at the goal level and the agent figures out the individual steps. For example, if your app has a long onboarding with 20–30 screens but each one has just one or two buttons, you can simply write “Complete onboarding” — the agent will go through all the screens on its own without any further instructions.
Learn how to write effective test cases in Writing Good Test Cases.

Speed

On the first run, the agent takes 5–10 seconds per action because it needs to analyze the screen and reason about the next step. On subsequent runs, this drops to 1–2 seconds per action when the agent has already seen similar screens. In practice, the agent executes most test flows faster than a manual tester would.
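As a rough worked example, the per-action figures translate into total run times like this. The helper `estimate_duration` is hypothetical, and the estimate assumes every action falls inside the stated range and ignores app launch and loading time.

```python
# Per-action timing ranges in seconds, taken from the figures above.
FIRST_RUN = (5, 10)
REPEAT_RUN = (1, 2)


def estimate_duration(actions: int, per_action_range: tuple) -> tuple:
    """Return the (lowest, highest) estimated run time in seconds."""
    low, high = per_action_range
    return actions * low, actions * high


# A 30-action flow, such as a long onboarding:
print(estimate_duration(30, FIRST_RUN))   # (150, 300)
print(estimate_duration(30, REPEAT_RUN))  # (30, 60)
```

So a 30-action flow would take roughly 2.5 to 5 minutes on its first run and 30 to 60 seconds once the screens are familiar.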

Limitations

  • Timing — the agent takes a screenshot, analyzes it, then acts. If something happens fast — a toast, a transition, a brief loading state — the agent may miss it or not react in time.
  • Test length — we recommend keeping tests under 100 actions. Longer tests increase the chance of the agent losing context and making incorrect decisions.
  • Device type — available testing capabilities depend on whether you’re using a local device, simulator, or cloud device. See Local vs Cloud for details.
  • Platform — cloud tests can be started from the web dashboard. Local tests require a Mac with the noqa desktop app installed.