
Agent Step

Execute AI CLI tools with full interactive capabilities including file editing, tool use, and multi-turn conversations.

Overview

The agent step runs Claude Code, OpenAI Codex, or Google Gemini CLI tools. Unlike the AI step, agent steps are interactive and can use tools (file editing, searches, commands).

Use when:

  • Need file editing capabilities
  • Want multi-turn interactive conversation
  • Task requires tool use
  • Implementation takes longer than 5 minutes (the AI step's timeout)

Don't use when:

  • Just need quick AI response → Use ai step
  • Need structured data only → Use ai step with schema
  • Want faster execution → Use ai step (agent steps time out at 30 minutes, AI steps at 5)
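
The two checklists above reduce to a small decision rule. A minimal sketch, assuming a hypothetical TaskProfile type and chooseStepKind helper (neither is part of agentcmd):

```typescript
// Hypothetical helper for illustration; not an agentcmd export.
type StepKind = "ai" | "agent";

interface TaskProfile {
  needsFileEdits: boolean;
  needsMultiTurn: boolean;
  needsTools: boolean;
  estimatedMinutes: number;
}

// AI steps time out at 5 minutes, per the list above.
const AI_STEP_TIMEOUT_MINUTES = 5;

function chooseStepKind(task: TaskProfile): StepKind {
  const needsAgent =
    task.needsFileEdits ||
    task.needsMultiTurn ||
    task.needsTools ||
    task.estimatedMinutes > AI_STEP_TIMEOUT_MINUTES;
  return needsAgent ? "agent" : "ai";
}
```

Any single "Use when" condition is enough to select the agent step; the ai step is chosen only when none apply.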

Configuration

interface AgentStepConfig {
  agent: "claude" | "codex" | "gemini";
  prompt: string;
  workingDir?: string;
  context?: Record<string, unknown>;
  permissionMode?: "default" | "plan" | "acceptEdits" | "bypassPermissions";
  mcpConfig?: string[];
  json?: boolean;
  resume?: string;
}

Timeout: 30 minutes (1,800,000ms)

Parameters

agent (required)

  • "claude" - Claude Code CLI (most capable, recommended)
  • "codex" - OpenAI Codex CLI (good for code generation)
  • "gemini" - Google Gemini CLI (experimental, ~70% stable)

prompt (required)

  • Instruction for the AI agent
  • Can be multi-line, detailed
  • Include context, requirements, constraints

workingDir (optional)

  • Directory where agent executes
  • Defaults to event.data.projectPath
  • Agent can read/write files in this directory

context (optional)

  • Additional context object passed to agent
  • Useful for structured data (IDs, configs, etc.)

permissionMode (optional, default: "default")

  • "default" - Interactive, prompt for file edits
  • "plan" - Read-only, no file modifications
  • "acceptEdits" - Auto-approve all file edits
  • "bypassPermissions" - Skip all permission checks (dangerous!)

json (optional, default: false)

  • Extract JSON from agent response
  • Agent must return valid JSON in response
  • Useful for structured output
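
To make "extract JSON from agent response" concrete, here is one plausible way such extraction could work. This is a sketch only: extractJson is a hypothetical function, and agentcmd's actual extraction logic is not documented in this section and may differ.

```typescript
// Illustrative only; not agentcmd's implementation.
function extractJson<T = unknown>(response: string): T {
  // Prefer a fenced block (three backticks, optional "json" tag) if one exists.
  const fenced = response.match(/`{3}(?:json)?\s*([\s\S]*?)`{3}/);
  const candidate = fenced?.[1] ?? response;

  // Otherwise take the span from the first opening bracket to the last closing one.
  const start = candidate.search(/[{[]/);
  const end = Math.max(candidate.lastIndexOf("}"), candidate.lastIndexOf("]"));
  if (start === -1 || end <= start) {
    throw new Error("No JSON found in agent response");
  }

  // JSON.parse throws a SyntaxError if the span is not valid JSON.
  return JSON.parse(candidate.slice(start, end + 1)) as T;
}
```

Whatever the real mechanism, asking the agent to reply with only the JSON object reduces the chance of a parse failure.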

resume (optional)

  • Session ID from previous agent step
  • Continues the conversation
  • Context carries forward (files read, decisions made)

mcpConfig (optional, Claude only)

  • Array of MCP server configuration file paths
  • Enables Model Context Protocol servers for enhanced capabilities
  • Paths relative to workingDir (e.g., [".mcp.json.github", ".mcp.json.playwright"])
  • Only supported by Claude agent (ignored for Codex and Gemini)
  • See MCP example below

Basic Usage

Simple Agent Call

await step.agent("implement-feature", {
  agent: "claude",
  prompt: "Implement user authentication with JWT tokens",
  workingDir: projectPath,
});

With Permission Mode

// Read-only analysis
await step.agent("analyze-code", {
  agent: "claude",
  prompt: "Analyze this codebase for security vulnerabilities",
  permissionMode: "plan", // No file edits
});

With Auto-Approve Edits

// Auto-approve all edits (use carefully!)
await step.agent("refactor", {
  agent: "claude",
  prompt: "Refactor auth module to use async/await",
  permissionMode: "acceptEdits",
});

Advanced Usage

Extract JSON Response

interface AnalysisResult {
  complexity: number;
  issues: Array<{ severity: string; description: string }>;
  recommendations: string[];
}

const result = await step.agent<AnalysisResult>("analyze", {
  agent: "claude",
  prompt: `Analyze this codebase and return JSON with this structure:
{
  "complexity": <1-10 score>,
  "issues": [{ "severity": "high|medium|low", "description": "..." }],
  "recommendations": ["...", "..."]
}`,
  json: true,
  permissionMode: "plan",
});

// result.data is typed as AnalysisResult
console.log(`Complexity: ${result.data.complexity}`);
console.log(`Found ${result.data.issues.length} issues`);

Resume Previous Session

const ctx: { planSession?: string } = {};

// Phase 1: Plan
await step.phase("plan", async () => {
  const result = await step.agent("architect", {
    agent: "claude",
    prompt: "Design the authentication system",
    permissionMode: "plan",
  });
  ctx.planSession = result.sessionId; // sessionId is a top-level field of AgentStepResult
});

// Phase 2: Implement (continues planning conversation)
await step.phase("implement", async () => {
  await step.agent("code", {
    agent: "codex",
    prompt: "Implement the auth design from the planning phase",
    resume: ctx.planSession,
  });
});

With Slash Commands

import { buildSlashCommand } from "agentcmd-workflows";

await step.agent("generate-spec", {
  agent: "claude",
  json: true,
  prompt: buildSlashCommand("/cmd:generate-feature-spec", {
    context: "User authentication with OAuth2"
  }),
});

With Additional Context

await step.agent("implement-endpoint", {
  agent: "claude",
  prompt: "Implement the GET /users/:id endpoint",
  context: {
    userId: "user-123",
    apiVersion: "v2",
    authRequired: true,
  },
});

With MCP Servers

Enable Model Context Protocol (MCP) servers to extend agent capabilities with custom tools and resources:

// MCP server configuration example
// .mcp.json.github
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."
      }
    }
  }
}

// Use MCP servers in workflow
await step.agent("analyze-prs", {
  agent: "claude",
  prompt: "Analyze recent pull requests and create a summary report",
  mcpConfig: [
    ".mcp.json.github",
    ".mcp.json.playwright"
  ],
  workingDir: projectPath,
});

What MCP provides:

  • Custom tools - GitHub API access, database queries, browser automation
  • Resources - File system access, API endpoints, data sources
  • Enhanced prompts - Domain-specific prompt templates

Common MCP servers:

  • @modelcontextprotocol/server-github - GitHub API integration
  • @modelcontextprotocol/server-filesystem - Enhanced file operations
  • @modelcontextprotocol/server-postgres - PostgreSQL database access
  • @playwright/mcp - Browser automation and testing
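
Before passing a file path in mcpConfig, it can help to sanity-check the parsed file. The sketch below infers the file shape from the .mcp.json.github example in this section; McpConfigFile and isMcpConfigFile are hypothetical names, and real configuration files may allow fields not modeled here.

```typescript
// Shape inferred from the example config in this section; other fields may exist.
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface McpConfigFile {
  mcpServers: Record<string, McpServerEntry>;
}

// Type guard: true when `value` has an mcpServers object whose entries each name a command.
function isMcpConfigFile(value: unknown): value is McpConfigFile {
  if (typeof value !== "object" || value === null) return false;
  const servers = (value as { mcpServers?: unknown }).mcpServers;
  if (typeof servers !== "object" || servers === null || Array.isArray(servers)) {
    return false;
  }
  return Object.values(servers).every(
    (entry) =>
      typeof entry === "object" &&
      entry !== null &&
      typeof (entry as { command?: unknown }).command === "string",
  );
}
```

The guard checks only the command field, since that is the one property every server entry in the example needs in order to be launched.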

Return Value

Agent steps return AgentStepResult:

interface AgentStepResult<T = unknown> {
  data: T;                    // Extracted data (JSON mode) or response text
  sessionId: string;          // Session ID for resumption
  exitCode: number;           // 0 = success, non-zero = error
  tool: "claude" | "codex" | "gemini";
}

Example:

const result = await step.agent("code", {
  agent: "claude",
  prompt: "Implement feature X",
});

console.log(result.sessionId); // "abc123..."
console.log(result.exitCode);  // 0
console.log(result.tool);      // "claude"

Error Handling

Try/Catch

try {
  await step.agent("risky-operation", {
    agent: "claude",
    prompt: "Refactor critical system",
  });
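} catch (error) {
  // A caught value is `unknown`, so narrow it before reading the message
  const reason = error instanceof Error ? error.message : String(error);
  await step.annotation("error", {
    message: `Agent failed: ${reason}. Rolling back...`,
  });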
  // Rollback logic
}

Timeout Override

await step.agent("long-task", {
  agent: "claude",
  prompt: "Implement entire feature from spec",
}, {
  timeout: 3600000, // 60 minutes (default is 30min)
});
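
Timeouts are given in milliseconds, so a small conversion helper keeps values such as 3600000 readable. A sketch; minutes is a hypothetical helper, not an agentcmd export:

```typescript
// Hypothetical helper: converts a positive number of minutes to milliseconds.
function minutes(n: number): number {
  if (!Number.isFinite(n) || n <= 0) {
    throw new RangeError(`Expected a positive number of minutes, got ${n}`);
  }
  return n * 60 * 1000;
}

const DEFAULT_AGENT_TIMEOUT_MS = minutes(30); // 1,800,000 ms, the documented default
const EXTENDED_TIMEOUT_MS = minutes(60); // 3,600,000 ms, as in the override above
```

With this in place the override could read `timeout: minutes(60)` instead of a bare number.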

Check Exit Code

const result = await step.agent("build-feature", {
  agent: "claude",
  prompt: "Implement feature X",
});

if (result.exitCode !== 0) {
  throw new Error("Agent execution failed");
}
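
If several steps need the same check, it can be factored into one function that also reports which agent and session failed. A sketch; requireSuccess is a hypothetical helper, and the interface is copied from the Return Value section so the example stands alone:

```typescript
// Mirrors the AgentStepResult shape documented in the Return Value section.
interface AgentStepResult<T = unknown> {
  data: T;
  sessionId: string;
  exitCode: number;
  tool: "claude" | "codex" | "gemini";
}

// Hypothetical helper: returns the result unchanged on success, throws otherwise.
function requireSuccess<T>(
  stepId: string,
  result: AgentStepResult<T>,
): AgentStepResult<T> {
  if (result.exitCode !== 0) {
    throw new Error(
      `Agent step "${stepId}" (${result.tool}) exited with code ${result.exitCode}; ` +
        `session ${result.sessionId}`,
    );
  }
  return result;
}
```

Including the session ID in the message makes it easier to resume the failed conversation for debugging.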

Agent Comparison

Feature     Claude          Codex           Gemini
Stability   ✓✓✓             ✓✓✓             ✓✓ (70%)
Planning    ✓✓✓ Excellent   ✓✓ Good         ✓✓ Good
Coding      ✓✓✓ Excellent   ✓✓✓ Excellent   ✓✓ Good
Reasoning   ✓✓✓ Best        ✓✓ Good         ✓✓ Good
Speed       ✓✓ Moderate     ✓✓✓ Fast        ✓✓ Moderate
Cost$$$$$

Recommendation: Start with Claude for planning and complex tasks, use Codex for pure code generation.

Common Patterns

Claude Plans, Codex Implements

const ctx: { planSession?: string } = {};

// Claude analyzes and plans
const plan = await step.agent("plan", {
  agent: "claude",
  prompt: "Design the feature architecture",
  permissionMode: "plan",
});
ctx.planSession = plan.sessionId; // sessionId is a top-level field of AgentStepResult

// Codex implements
await step.agent("implement", {
  agent: "codex",
  prompt: "Implement the design from planning",
  resume: ctx.planSession,
});

Iterative Refinement

let session: string | undefined;

for (let i = 1; i <= 3; i++) {
  const result = await step.agent(`attempt-${i}`, {
    agent: "claude",
    prompt: session
      ? "Fix issues from previous attempt"
      : "Implement feature X",
    resume: session,
  });
  session = result.sessionId; // sessionId is a top-level field of AgentStepResult

  // Check if tests pass
  const test = await step.cli("test", { command: "pnpm test" });
  if (test.exitCode === 0) break;
}
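
The loop above ends silently if all three attempts fail. The sketch below makes that outcome explicit by returning whether the tests ever passed. The agent call and the test run are injected as functions so the loop can be exercised without agentcmd; refineUntilGreen and its parameter names are hypothetical.

```typescript
interface AttemptResult {
  sessionId: string;
}

interface RefinementOutcome {
  attempts: number;
  passed: boolean;
  sessionId: string | undefined;
}

// Hypothetical helper: retries until testsPass() resolves true or attempts run out.
async function refineUntilGreen(
  runAgent: (attempt: number, resume: string | undefined) => Promise<AttemptResult>,
  testsPass: () => Promise<boolean>,
  maxAttempts = 3,
): Promise<RefinementOutcome> {
  let sessionId: string | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Pass the previous session so each attempt continues the conversation.
    const result = await runAgent(attempt, sessionId);
    sessionId = result.sessionId;
    if (await testsPass()) {
      return { attempts: attempt, passed: true, sessionId };
    }
  }
  return { attempts: maxAttempts, passed: false, sessionId };
}
```

In a workflow, runAgent would wrap step.agent and testsPass would wrap the step.cli test run; the caller can then fail the workflow or annotate it when passed is false.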

Multi-Agent Review

// Codex implements
const impl = await step.agent("implement", {
  agent: "codex",
  prompt: "Implement user authentication",
});

// Claude reviews
const review = await step.agent("review", {
  agent: "claude",
  prompt: "Review the implementation for security issues",
  resume: impl.sessionId,
  permissionMode: "plan",
});

// Codex fixes issues
await step.agent("fix", {
  agent: "codex",
  prompt: "Fix the issues found in review",
  resume: review.sessionId,
});

Best Practices

Write Clear Prompts

Good - Specific, actionable:

prompt: `Implement JWT authentication for the /api/auth endpoints.
Requirements:
- Use jsonwebtoken library
- Tokens expire in 24 hours
- Store secret in environment variable
- Add tests for token generation and validation`

Bad - Vague:

prompt: "Add authentication"
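
A prompt in the style of the good example can be assembled from a task sentence and a list of requirements, which keeps prompts consistent across steps. A sketch; buildPrompt is a hypothetical helper, not part of agentcmd:

```typescript
// Hypothetical helper: formats a task and its requirements as a multi-line prompt.
function buildPrompt(task: string, requirements: string[]): string {
  if (requirements.length === 0) return task;
  const bullets = requirements.map((r) => `- ${r}`).join("\n");
  return `${task}\nRequirements:\n${bullets}`;
}

const prompt = buildPrompt(
  "Implement JWT authentication for the /api/auth endpoints.",
  [
    "Use jsonwebtoken library",
    "Tokens expire in 24 hours",
    "Store secret in environment variable",
    "Add tests for token generation and validation",
  ],
);
```

The resulting string matches the good example above line for line.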

Use Appropriate Permission Modes

// Analysis/planning - read-only
permissionMode: "plan"

// Interactive development - get approval
permissionMode: "default"

// Automated workflows - auto-approve (carefully!)
permissionMode: "acceptEdits"

// Never use in production:
permissionMode: "bypassPermissions" // Dangerous!
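
These choices can be encoded once so individual steps do not pick a mode ad hoc. A sketch under stated assumptions: Purpose, permissionModeFor, and assertSafeMode are hypothetical names, and the "production" environment string is an assumed convention rather than anything agentcmd defines.

```typescript
type PermissionMode = "default" | "plan" | "acceptEdits" | "bypassPermissions";
type Purpose = "analysis" | "interactive" | "automation";

// Maps each use case listed above to its permission mode.
function permissionModeFor(purpose: Purpose): PermissionMode {
  switch (purpose) {
    case "analysis":
      return "plan";
    case "interactive":
      return "default";
    case "automation":
      return "acceptEdits";
  }
}

// Rejects bypassPermissions when the (assumed) environment name is "production".
function assertSafeMode(mode: PermissionMode, environment: string): PermissionMode {
  if (mode === "bypassPermissions" && environment === "production") {
    throw new Error("bypassPermissions must not be used in production");
  }
  return mode;
}
```

Note that permissionModeFor never returns "bypassPermissions"; that mode has to be requested explicitly and still passes through the guard.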

Save Session IDs for Resumption

interface Context {
  sessions: {
    plan?: string;
    implement?: string;
  };
}

const ctx: Context = { sessions: {} };

const result = await step.agent("plan", { ... });
ctx.sessions.plan = result.sessionId; // Save it!

Match Tools to Tasks

  • Claude: Planning, architecture, complex reasoning
  • Codex: Pure code generation, implementation
  • Gemini: Experimental, cost-sensitive projects
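
The mapping above can be written down as a lookup table so the agent choice lives in one place. A sketch; TaskKind and pickAgent are hypothetical names, and the task categories are taken directly from the list:

```typescript
type Agent = "claude" | "codex" | "gemini";
type TaskKind =
  | "planning"
  | "architecture"
  | "complex-reasoning"
  | "code-generation"
  | "cost-sensitive";

// One entry per task category named in the list above.
const AGENT_FOR_TASK: Record<TaskKind, Agent> = {
  planning: "claude",
  architecture: "claude",
  "complex-reasoning": "claude",
  "code-generation": "codex",
  "cost-sensitive": "gemini",
};

function pickAgent(task: TaskKind): Agent {
  return AGENT_FOR_TASK[task];
}
```

Because the table is typed as Record<TaskKind, Agent>, adding a new task category without assigning it an agent is a compile-time error.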
