Run Agent
Execute an agent and stream the response
POST
Runs the agent with the given input and returns a Server-Sent Events stream when stream is true.
Inline blocks supplied through prompt_blocks are wrapped and persisted with the hidden model input for this turn. They are not shown as the user-visible message in normal chat timelines.
All requests require Authorization: Bearer <token>.
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| id | string | The agent’s unique identifier |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| message | string | yes | The user’s input |
| conversation_id | string | yes | Conversation ID for multi-turn context |
| message_id | string | yes | Stable unique ID for this user message |
| stream | bool | no | Return SSE events when true; default is true |
| verbose | bool | no | Include nested tool/sub-agent events |
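
The required fields and the bearer token can be assembled into a request as in the minimal sketch below, which uses only Python's standard library and does not send anything. The function name build_run_request and the run_url argument are illustrative assumptions: this page does not state the endpoint path, so the caller supplies the full URL (including the agent id). The JSON Content-Type header is also an assumption.

```python
import json
import urllib.request


def build_run_request(run_url: str, token: str, message: str,
                      conversation_id: str, message_id: str,
                      stream: bool = True) -> urllib.request.Request:
    """Assemble (but do not send) the POST request for one agent run.

    run_url is a placeholder for the deployment's Run Agent URL; it is
    not defined on this page and must already contain the agent id.
    """
    body = {
        "message": message,
        "conversation_id": conversation_id,
        "message_id": message_id,
        "stream": stream,
    }
    return urllib.request.Request(
        run_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the request body is sent as JSON.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen, or the same body and headers can be handed to any other HTTP client.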
Generate message_id once per user action and reuse it when retrying that same submit after a timeout or reconnect. Do not generate a new message_id for every network retry.
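
The message_id retry rule can be sketched as follows. The helper names and the send callable are hypothetical; send stands in for whatever performs the POST and is assumed to raise TimeoutError when an attempt times out.

```python
import uuid
from typing import Any, Callable


def new_submission(message: str, conversation_id: str) -> dict:
    """Build the body once per user action; message_id is fixed here."""
    return {
        "message": message,
        "conversation_id": conversation_id,
        "message_id": str(uuid.uuid4()),
    }


def submit_with_retries(body: dict, send: Callable[[dict], Any],
                        attempts: int = 3) -> Any:
    """Retry the same submit, reusing the same body and message_id.

    send is a hypothetical callable that performs the POST and raises
    TimeoutError when the attempt times out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send(body)
        except TimeoutError:
            # Re-raise only after the final attempt; otherwise retry
            # with the unchanged body.
            if attempt == attempts - 1:
                raise
```

Because the body is built before the retry loop, every attempt carries the same message_id. A new ID is created only when the user submits again.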
Advanced request fields accepted by the backend include attachment_ids, knowledge_bases, retrieval_options, skills, artifacts, context_refs, prompt_blocks, memory_policy, model, temperature, max_tokens, top_p, frequency_penalty, presence_penalty, max_turns, reasoning_effort, reasoning_summary, image_detail, enable_planning, enable_reflection, enable_query_rewrite, selected_tools, client_timezone, session_context, source, source_id, source_metadata, and call_id.
knowledge_bases performs request-scoped KB search before the model call and injects matching snippets as hidden system_context. Model-callable retrieval is handled through the normal tool system when a retrieval tool is registered or assigned to the agent.
memory_policy controls prompt-time memory injection for this run:
archival_mode defaults to tool_only, which keeps archival memory out of the prompt unless the agent recalls it. Set it to auto to inject bounded archival matches for the current message as hidden current-turn system_reminders. Core user and agent memory are included by default and can be disabled with include_user_core / include_agent_core.
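
As an illustration, a memory_policy value might be built as below. This is a sketch that assumes memory_policy is a JSON object whose keys are the three option names mentioned above; the helper name build_memory_policy is hypothetical.

```python
def build_memory_policy(archival_mode: str = "tool_only",
                        include_user_core: bool = True,
                        include_agent_core: bool = True) -> dict:
    """Return a memory_policy object using the defaults described above.

    Assumption: the three options are sibling keys of one JSON object.
    """
    return {
        "archival_mode": archival_mode,
        "include_user_core": include_user_core,
        "include_agent_core": include_agent_core,
    }


# Inject bounded archival matches for this message and leave out
# agent core memory.
policy = build_memory_policy(archival_mode="auto", include_agent_core=False)
```

The resulting dictionary would be sent as the memory_policy field of the request body.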
call_id is a client correlation value stored with the conversation run. The server still generates the canonical runtime call_id / root_call_id used for streaming and cancellation.
prompt_blocks supplies request-time, model-only context for the run.
Response (SSE Stream)
Text is streamed through delta events, and the final response arrives in the root end event. There is no content event type.
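
A client might assemble the streamed text roughly as in the sketch below. The payload keys type, content, and response are assumptions made for illustration, since this page does not show the payload shape. The sketch also assumes verbose is off, so the first end event seen is the root one.

```python
from typing import Iterable, Optional, Tuple


def collect_text(events: Iterable[dict]) -> Tuple[str, Optional[str]]:
    """Accumulate delta text and capture the final response.

    Assumed (hypothetical) payload keys: "type", "content", "response".
    Returns (text built from deltas, final response from the end event).
    """
    parts = []
    final = None
    for event in events:
        kind = event.get("type")
        if kind == "delta":
            parts.append(event.get("content", ""))
        elif kind == "end":
            final = event.get("response")
            break  # the end event closes the run
        # Other event types are ignored here.
    return "".join(parts), final
```

The accumulated deltas are useful for live rendering. Once the end event arrives, its final response is the one to keep.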
Tools and artifacts may also stream live delta events with exact content types: tool_progress, tool_result_delta, artifact_progress, and artifact_result_delta. Progress describes what is happening; result deltas are partial output that may be replaced by the final result. Do not use UI hints such as show_in_panel as stream classifiers.
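
One way to apply this rule is to classify events by the exact content type string and nothing else. The sketch below is illustrative; the function name and the returned labels are assumptions, while the four content types are the ones listed above.

```python
PROGRESS_TYPES = frozenset({"tool_progress", "artifact_progress"})
RESULT_DELTA_TYPES = frozenset({"tool_result_delta", "artifact_result_delta"})


def classify_content_type(content_type: str) -> str:
    """Map an exact content type to a coarse handling category.

    "progress" describes what is happening; "partial_result" is
    provisional output that the final result may replace.
    """
    if content_type in PROGRESS_TYPES:
        return "progress"
    if content_type in RESULT_DELTA_TYPES:
        return "partial_result"
    return "other"
```

The function deliberately takes only the content type, so UI hints such as show_in_panel cannot influence the classification.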
Artifacts stream as structured lifecycle events (artifact_started, artifact_progress, artifact_completed, artifact_error). Do not parse inline artifact markers from text deltas; fetch the persisted artifact by artifact_id.
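
The lifecycle events can drive a small per-artifact state table, as in this sketch. The state labels and the type payload key are assumptions made for illustration; artifact_id and the four event names come from the text above.

```python
ARTIFACT_STATES = {
    "artifact_started": "running",
    "artifact_progress": "running",
    "artifact_completed": "completed",
    "artifact_error": "failed",
}


def apply_artifact_event(states: dict, event: dict) -> dict:
    """Update states (artifact_id -> state) from one lifecycle event.

    Assumption: the event name is carried in a "type" key.
    Events that are not artifact lifecycle events are ignored.
    """
    state = ARTIFACT_STATES.get(event.get("type"))
    artifact_id = event.get("artifact_id")
    if state is not None and artifact_id is not None:
        states[artifact_id] = state
    return states


def completed_artifact_ids(states: dict) -> list:
    """Return the ids whose persisted artifact can now be fetched."""
    return [aid for aid, state in states.items() if state == "completed"]
```

Once an ID appears in completed_artifact_ids, the client fetches the persisted artifact by that artifact_id instead of reading anything out of the text deltas.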
seq is the JSON payload ordering field. AgentFlow also emits SSE id: lines when an event has event_id, sse_id, or seq; persist the latest SSE id for reconnect hints and recover UI state from the durable conversation snapshot stream.
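
Tracking the latest SSE id can be done with a small parser over the standard SSE line format (id: and data: fields, with a blank line ending each event). The sketch below assumes each event's data is a JSON object; the function name parse_sse is hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], dict]]:
    """Yield (latest SSE id, decoded JSON payload) for each event.

    The id persists across events until a new id: line replaces it,
    so the first element is always the value to store for reconnects.
    """
    last_id = None
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if any.
            if data_lines:
                yield last_id, json.loads("\n".join(data_lines))
            data_lines = []
            continue
        if line.startswith(":"):
            continue  # comment line, e.g. a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "id":
            last_id = value
        elif field == "data":
            data_lines.append(value)
```

A client would store the first element of the most recent tuple as its reconnect hint, and use seq from the payload when it needs to order events.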
Error Events
If an error occurs during execution, an error event is streamed.
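
A client can surface a streamed error as an exception, as sketched below. The error payload shape is not shown on this page, so the sketch assumes only a type key and keeps the whole payload on the exception. The names AgentRunError and raise_on_error are hypothetical.

```python
class AgentRunError(RuntimeError):
    """Raised when the stream delivers an error event."""

    def __init__(self, payload: dict):
        super().__init__(f"agent run failed: {payload!r}")
        self.payload = payload  # the undecoded error event, kept whole


def raise_on_error(event: dict) -> dict:
    """Return the event unchanged, or raise if it is an error event.

    Assumption: the event name is carried in a "type" key.
    """
    if event.get("type") == "error":
        raise AgentRunError(event)
    return event
```

Passing every decoded event through raise_on_error before further handling keeps the error path in one place.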

