Task

A Task is a single automation job executed by an AI agent in a browser environment. Tasks contain your instructions for what the AI should accomplish.

What is a Task?

Examples:

  • “Log into Gmail and count unread emails”
  • “Search for top 10 Hacker News posts and return titles and URLs”
  • “Extract product prices from an e-commerce site”

Key Properties

Required

  • task: Your instruction to the AI agent (1-50,000 characters)
  • llm: AI model (default: "gemini-flash-latest")

Session & Environment

  • sessionId: Session in which the task runs (optional; if omitted, a session with a US proxy is created automatically)
  • startUrl: URL to navigate to before starting the task
  • maxSteps: Maximum steps before stopping (default: 30, max: 10,000)

Agent Behavior

  • flashMode: Enable faster execution mode (default: false)
  • thinking: Enable extended thinking for complex tasks (default: false)
  • vision: Enable vision capabilities (default: true, can be true, false, or "auto")
  • highlightElements: Highlight interactive elements (default: false)
  • systemPromptExtension: Custom system prompt addition (max 2,000 chars)
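
The limits listed so far can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: the `TaskRequestDraft` type and `validateTaskDraft` function are hypothetical names, the lower bound of 1 for maxSteps is an assumption, and `String.length` counts UTF-16 code units, which may differ from how the service counts characters.

```typescript
// Hypothetical draft type: field names and limits come from the property
// list above; the type itself is an illustration, not an SDK export.
interface TaskRequestDraft {
  task: string;
  maxSteps?: number;
  systemPromptExtension?: string;
  vision?: boolean | "auto";
}

// Returns a list of human-readable problems. An empty list means the draft
// respects the documented limits.
function validateTaskDraft(draft: TaskRequestDraft): string[] {
  const problems: string[] = [];

  // task: 1-50,000 characters
  if (draft.task.length < 1 || draft.task.length > 50000) {
    problems.push("task must be 1-50,000 characters");
  }

  // maxSteps: documented maximum is 10,000 (minimum of 1 is assumed here)
  if (draft.maxSteps !== undefined) {
    const steps = draft.maxSteps;
    if (!Number.isInteger(steps) || steps < 1 || steps > 10000) {
      problems.push("maxSteps must be an integer between 1 and 10,000");
    }
  }

  // systemPromptExtension: at most 2,000 characters
  if (
    draft.systemPromptExtension !== undefined &&
    draft.systemPromptExtension.length > 2000
  ) {
    problems.push("systemPromptExtension must be at most 2,000 characters");
  }

  return problems;
}
```

Returning a list rather than throwing on the first problem lets a caller report every violation at once.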

Secrets & Security

  • secrets: Key-value pairs for sensitive data (injected securely)
  • allowedDomains: Restrict navigation to specific domains
  • opVaultId: 1Password vault ID for credential injection

Skills

  • skillIds: List of skill IDs to enable, or ["*"] for all available skills

Judge (Task Evaluation)

  • judge: Enable AI judge to evaluate task success (default: false)
  • judgeGroundTruth: Expected outcome for judge comparison (max 10,000 chars)
  • judgeLlm: LLM model for judge evaluation

Output

  • structuredOutput: JSON schema string for structured response format
  • metadata: Custom key-value pairs (up to 10 pairs)
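
Because structuredOutput is documented as a JSON schema string, a schema object has to be serialized before it is passed. The sketch below illustrates that step under stated assumptions: `postsSchema`, `checkMetadata`, and `parsePosts` are hypothetical names, metadata values are assumed to be strings, and the task output is assumed to come back as a JSON string matching the schema.

```typescript
// JSON Schema describing the shape we would like the agent to return.
const postsSchema = {
  type: "object",
  properties: {
    posts: {
      type: "array",
      items: {
        type: "object",
        properties: { title: { type: "string" }, url: { type: "string" } },
        required: ["title", "url"],
      },
    },
  },
  required: ["posts"],
};

// structuredOutput expects a string, so serialize the schema object.
const structuredOutput: string = JSON.stringify(postsSchema);

// Hypothetical helper: enforce the documented limit of 10 metadata pairs.
function checkMetadata(metadata: Record<string, string>): Record<string, string> {
  const count = Object.keys(metadata).length;
  if (count > 10) {
    throw new Error(`metadata allows up to 10 pairs, got ${count}`);
  }
  return metadata;
}

interface Post {
  title: string;
  url: string;
}

// Hypothetical helper: parse the output, ASSUMING it is a JSON string that
// follows postsSchema. No schema validation is performed here.
function parsePosts(output: string): Post[] {
  const parsed = JSON.parse(output) as { posts: Post[] };
  return parsed.posts;
}
```

In a request, `structuredOutput` and the checked metadata would be passed alongside `task` and `llm`.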

Response Fields

  • status: created, started, finished, or stopped
  • output: Final result from the agent
  • inputFiles / outputFiles: Files supplied to the task / files produced by the agent
  • isSuccess: Agent’s self-reported success status
  • judgement: Judge evaluation result (if judge enabled)
  • judgeVerdict: Boolean verdict from judge
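
One way to combine these fields into a single outcome is sketched below. It is an illustration rather than SDK behavior: `TaskResultLike`, `isTerminal`, and `summarize` are hypothetical names, the fields are assumed to be possibly missing or null, and treating both finished and stopped as terminal states is an assumption.

```typescript
type TaskStatus = "created" | "started" | "finished" | "stopped";

// Hypothetical subset of the response fields listed above.
interface TaskResultLike {
  status: TaskStatus;
  output?: string | null;
  isSuccess?: boolean | null;
  judgeVerdict?: boolean | null;
}

// Assumption: a task no longer changes once it is finished or stopped.
function isTerminal(status: TaskStatus): boolean {
  return status === "finished" || status === "stopped";
}

type Outcome = "pending" | "success" | "failure" | "unknown";

function summarize(result: TaskResultLike): Outcome {
  if (!isTerminal(result.status)) return "pending";

  // Design choice: prefer the judge's verdict when one exists, because
  // isSuccess is the agent's own self-report.
  if (typeof result.judgeVerdict === "boolean") {
    return result.judgeVerdict ? "success" : "failure";
  }
  if (typeof result.isSuccess === "boolean") {
    return result.isSuccess ? "success" : "failure";
  }
  return "unknown";
}
```

With this ordering, a task that reports isSuccess as true but receives a false judgeVerdict is summarized as a failure.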

Execution Models

Important: Proxy configuration is a session-level setting, not a task-level setting.

When you create a task without specifying a sessionId, an auto-session is created with a US proxy by default. To use a different proxy location, you must:

  1. Create a session first with your desired proxyCountryCode
  2. Then create tasks within that session using sessionId

See Stealth & Proxies for examples.

Auto-Session (Simple)

import { BrowserUseClient } from "browser-use-sdk";

const client = new BrowserUseClient({ apiKey: "bu_..." });

const task = await client.tasks.createTask({
  task: "Search for top 10 Hacker News posts",
  llm: "browser-use-llm"
});

const result = await task.complete();
console.log(result.output);

Best for: Simple tasks, no login required, proofs of concept

Auto-sessions use default settings: a US proxy and standard browser dimensions. These settings cannot be customized per task. For custom proxy locations or browser settings, create a session first.

Custom Session (Advanced)

Use custom sessions when you need to configure proxy location, browser dimensions, or run multiple related tasks.

// Uses the `client` created in the Auto-Session example

// Create session with custom proxy location
const session = await client.sessions.createSession({
  profileId: "profile_123",
  proxyCountryCode: "gb" // UK proxy instead of default US
});

// Upload credentials file
const credFile = await client.files.upload("credentials.json");

// Run login task
const loginTask = await client.tasks.createTask({
  sessionId: session.id,
  llm: "browser-use-llm",
  task: "Log into admin dashboard using uploaded credentials",
  inputFiles: [credFile.id]
});

await loginTask.complete();

// Run data task (login state preserved)
const dataTask = await client.tasks.createTask({
  sessionId: session.id,
  llm: "browser-use-llm",
  task: "Export user data as CSV"
});

const result = await dataTask.complete();

Best for: Multi-step workflows, login required, related tasks, custom proxy locations, custom browser dimensions

Task Control

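// `task` is a task returned by client.tasks.createTask()

// Stream progress
for await (const update of task.stream()) {
  console.log(`Status: ${update.status}`);
}

// Control tasks (each call is independent; shown together for reference)
await task.pause();
await task.resume();
await task.stop();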

Files

// Input files
const task = await client.tasks.createTask({
  task: "Analyze the uploaded image",
  llm: "browser-use-llm",
  inputFiles: ["screenshot.png"]
});

// Output files
const result = await task.complete();
for (const file of result.outputFiles) {
  const data = await client.files.download(file.id);
}

Best Practices

Task Instructions:

  • Be specific: “Extract product names and prices from the first page” vs “get product info”
  • Set boundaries: Specify pages to visit, items to process
  • Include context: Mention login requirements, data format

Performance:

  • Use auto-session for simple tasks
  • Reuse sessions for related tasks
