Skip to main content

Sub-Agents

Sub-agents are full agents that can be exposed to a parent agent as callable tools. A parent delegates with a natural-language message; the sub-agent then runs its own agent loop, prompt, tools, memory scope, knowledge-base assignments, model settings, and lifecycle events. The default parent is MainAgent, but the relationship is explicit. A sub-agent is callable only from the parent agents it is assigned to.

Built-in agents

System agents are seeded from code and registered with AgentFactory.register_implementation() during startup:
AgentRoleNotes
MainAgentRoot orchestratorReceives user turns and delegates domain work
EmailsAgentSub-agentEmail search, drafting, sending, replies, and thread work
MeetingsAgentSub-agentCalendar, scheduling, availability, meetings, and transcripts
RecordsAgentSub-agentCRM/account/contact/opportunity and pipeline work
TasksAgentSub-agentTask listing, creation planning, updates, and follow-ups
ResearchAgentSub-agentWeb and account intelligence research
SnowflakeAgentSub-agentSnowflake data analysis and query workflows
JediAgentInternal hidden sub-agentUsed by the jedi_council tool; not exposed to MainAgent because its parent list is empty
There is no built-in GeneralSubAgent. If the orchestrator cannot match a request to an available specialist, it should answer that it lacks the right capability instead of delegating to a catch-all.

Define a system sub-agent

Backend framework extensions use the @sub_agent decorator on a SubAgent subclass:
from config.settings import MAIN_AGENT_NAME
from framework.agents.sub_agent import SubAgent
from framework.decorators import sub_agent

@sub_agent(
    agent_id="3f26857d-3c31-46ee-aabb-c48cc1738f95",
    parents=[MAIN_AGENT_NAME],
    description="Handles web and account intelligence research tasks.",
    role="You are a specialized research sub-agent...",
    enable_reflection=False,
    llm_config={"model": "openai/gpt-5.4-mini"},
    reasoning_effort="low",
)
class ResearchAgent(SubAgent):
    """Handles web and account intelligence research tasks."""
The parents parameter stores parent agent names for bootstrap seeding. It does not immediately mutate an in-memory parent at decoration time. The startup flow imports agent classes, registers implementations in agents/__init__.py, then tenant bootstrap upserts agents and parent-child relationships into the tenant database.

Create a dynamic sub-agent

Tenant-managed sub-agents are created through the API or SDK and do not require Python code:
from agentflow import AsyncAgentFlow

async with AsyncAgentFlow.from_profile("prod") as client:
    parent_id = (await client.agents.by_name("MainAgent")).id

    support_agent = await client.agents.subagents.create(
        parent_id,
        name="SupportAgent",
        description="Escalated customer support specialist",
        system_prompt="You are a customer support specialist...",
        enable_planning=True,
        knowledge_bases=["550e8400-e29b-41d4-a716-446655440000"],
        tool_assignments=["lookup_ticket"],
    )
knowledge_bases are KB IDs. tool_assignments on create/update are tool names stored in sub-agent metadata; the dedicated assign/remove endpoints use tool UUIDs as tool_identifier.

Delegation modes

Sub-agent calls use a small tool schema:
FieldMeaning
messageThe task or question to delegate
modelOptional per-call model override
temperatureOptional per-call temperature override
blockingtrue waits in the current turn; false queues background work

Blocking delegation

With blocking=true, the parent waits for the sub-agent result in the same turn:
  1. The parent emits a tool call named after the sub-agent.
  2. AgentFlow creates a child execution with its own call_id, the parent’s call_id as parent_call_id, and the same root_call_id.
  3. The delegated query is persisted as a user part scoped under the sub-agent call.
  4. The sub-agent receives scoped prior history for that same sub-agent before the current invocation.
  5. The sub-agent runs normally, including its own tools and retrieval.
  6. Its final answer becomes the parent tool result, and the parent continues.

Non-blocking delegation

With blocking=false, the current turn receives a task handle instead of waiting:
{
  "status": "queued",
  "task_id": "sub_agent_call_abc123",
  "tool_call_id": "call_abc123",
  "mode": "non_blocking"
}
The background worker later executes the sub-agent, persists terminal output on the original execution part, saves batch results, and triggers a hidden root-level follow-up turn so the parent can summarize the completion for the user. Root end metrics also include task IDs under custom_metrics.background_tasks.

Parent and sub-agent communication

Blocking calls communicate through the returned tool result. Non-blocking calls add three helper tools on MainAgent:
ToolPurpose
get_background_sub_agent_resultPoll task status and optionally return terminal output
fetch_sub_agent_progressAsk a read-only progress question using an LLM pass over recent sub-agent history
message_sub_agentPersist a parent message for a running background sub-agent, optionally waiting for the next reply
fetch_sub_agent_progress is the non-interfering side channel: it does not mutate the child run, consume a child turn, or interrupt in-flight work. It asks a separate LLM pass to answer from the sub-agent’s recent persisted history. message_sub_agent writes a user part scoped to the sub-agent’s execution. The running sub-agent sees that message at its next turn boundary, replies when useful, and then continues the original delegated task unless the parent changes or cancels it. Set mode="send" to save the message and return immediately, or mode="send_and_wait" to save the message and wait for the next persisted child reply. Questions are a separate user-interaction mechanism. A tool can emit question_required; the run waits for POST /api/v1/questions/{question_id}/respond or the SDK’s client.questions.respond(...). Answers are validated against the emitted question IDs and option IDs. The active waiter is process-local, and pending question requests/responses are mirrored through shared Redis when configured so another worker can accept the response and wake the original run.

Event hierarchy

Every event carries call_id, parent_call_id, and root_call_id:
MainAgent start                call_id=call_root parent=null      root=call_root
ResearchAgent start            call_id=call_research parent=call_root root=call_root
web_search start               call_id=call_search parent=call_research root=call_root
web_search end                 call_id=call_search parent=call_research root=call_root
ResearchAgent delta/end        call_id=call_research parent=call_root root=call_root
MainAgent delta/end            call_id=call_root parent=null      root=call_root
verbose=false suppresses nested tool and sub-agent lifecycle events from the live stream, while root events and user-interaction events still surface. verbose=true emits the nested start, delta, end, and error events so timeline UIs can render the full tree. Depth is bounded by MAX_SUB_AGENT_DEPTH; exceeding it raises an error to prevent recursive delegation loops.

Scoped memory and history

Sub-agents do not inherit the parent’s entire scratchpad. On delegated runs, AgentFlow rebuilds scoped history from persisted conversation parts for the same sub-agent/tool name before the current invocation. This lets TasksAgent remember earlier task-specific delegations without mixing in unrelated EmailsAgent or parent-only reasoning. Context blocks and user memory are also rendered for the sub-agent as its own agent, so cache keys are agent-scoped.

Direct sub-agent runs

You can run a sub-agent directly without asking the parent to delegate:
from agentflow import AsyncAgentFlow, RunOptions

async with AsyncAgentFlow.from_profile("local") as client:
    parent_id = {agent.name: agent.id for agent in await client.agents.list()}["MainAgent"]
    subagents = await client.agents.subagents.list(parent_id)
    research_id = next(agent.id for agent in subagents if agent.name == "ResearchAgent")

    result = await client.agents.subagents.run(
        parent_id,
        research_id,
        message="Research Acme's latest product announcements",
        conversation_id="conv_research_001",
        message_id="msg_research_001",
        options=RunOptions(verbose=True),
    )
REST endpoint:
POST /api/v1/agent/{parent_id}/subagents/{subagent_id}/run
Direct runs create or reuse a conversation, persist the user message, and execute the sub-agent as the root execution for that request. They are useful for diagnostics, playgrounds, and specialist-first workflows. Delegated runs remain the normal orchestration path because the parent decides when and how to combine specialist results.