High-Level Overview
AgentFlow follows a layered architecture with clear separation between the API surface, the core framework, and the persistence layer.
Executable Protocol
Agents, tools, and sub-agents all implement the Executable base class. This gives every component a uniform streaming interface built on Event objects with types like START, DELTA, END, and ERROR. As a result, the streaming infrastructure works identically whether content comes from an LLM, a tool, or a sub-agent.
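
The source names the Executable base class and the Event types but does not show their definitions. The following is a minimal sketch of how such a protocol could look, assuming an async-generator method called `execute` and an `Event` dataclass; `EventType`, `EchoTool`, and `collect` are hypothetical names introduced only for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import AsyncIterator, List


class EventType(Enum):
    START = "start"
    DELTA = "delta"
    END = "end"
    ERROR = "error"


@dataclass
class Event:
    type: EventType
    content: str = ""


class Executable(ABC):
    """Assumed shape of the base class: every component streams Event objects."""

    @abstractmethod
    def execute(self, payload: str) -> AsyncIterator[Event]:
        ...


class EchoTool(Executable):
    """Toy component that streams its input back one word at a time."""

    async def execute(self, payload: str) -> AsyncIterator[Event]:
        yield Event(EventType.START)
        for word in payload.split():
            yield Event(EventType.DELTA, word)
        yield Event(EventType.END)


async def collect(component: Executable, payload: str) -> List[Event]:
    # The consumer drains any Executable the same way, whatever its kind.
    return [event async for event in component.execute(payload)]


events = asyncio.run(collect(EchoTool(), "hello world"))
print([e.type.name for e in events])
```

Because the consumer depends only on `execute` and `Event`, an agent or sub-agent written against the same base class could be passed to `collect` unchanged.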
Agent Execution Flow
- Request arrives at a FastAPI endpoint
- Conversation memory is loaded and pruned to fit token limits
- Agent loop begins: the agent builds a prompt, calls the LLM, and streams DELTA events
- If the LLM requests a tool call, the tool is resolved, validated, and executed (with optional approval gating)
- If the LLM requests a sub-agent, a new execution context is created and delegated
- If retrieval is enabled, relevant knowledge base documents are injected into context
- The agent loops until the LLM produces a final response or hits max_turns
- The response and metadata are persisted, and the SSE stream closes
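
The core of the flow above (call the LLM, run a requested tool, repeat until a final response or max_turns) can be sketched as a small generator. This is an illustration under stated assumptions, not AgentFlow's actual loop: `LLMReply`, `run_agent`, and `scripted_llm` are hypothetical names, events are plain tuples, and memory pruning, retrieval, sub-agents, approval gating, and persistence are left out. Reporting the max_turns limit as an ERROR event is also a choice made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List, Optional, Tuple


@dataclass
class LLMReply:
    text: str = ""
    tool_name: Optional[str] = None  # set when the model asks for a tool
    tool_args: dict = field(default_factory=dict)


def run_agent(
    llm: Callable[[List[str]], LLMReply],
    tools: Dict[str, Callable[..., str]],
    history: List[str],
    max_turns: int = 5,
) -> Iterator[Tuple[str, str]]:
    """Yield (event_type, content) pairs until a final answer or max_turns."""
    yield ("START", "")
    for _ in range(max_turns):
        reply = llm(history)
        if reply.tool_name is None:
            # No tool requested: treat the text as the final response.
            yield ("DELTA", reply.text)
            history.append(reply.text)
            break
        tool = tools.get(reply.tool_name)
        if tool is None:
            yield ("ERROR", f"unknown tool: {reply.tool_name}")
            return
        # Feed the tool result back so the next turn can use it.
        history.append(f"{reply.tool_name} -> {tool(**reply.tool_args)}")
    else:
        # Loop ended without a final response (assumption: surfaced as ERROR).
        yield ("ERROR", "max_turns reached")
        return
    yield ("END", "")


def scripted_llm(history: List[str]) -> LLMReply:
    # Stand-in for a real model: ask for a tool once, then answer.
    if not any(entry.startswith("add ->") for entry in history):
        return LLMReply(tool_name="add", tool_args={"a": 2, "b": 3})
    return LLMReply(text="The sum is 5.")


events = list(
    run_agent(scripted_llm, {"add": lambda a, b: str(a + b)}, ["What is 2 + 3?"])
)
print(events)
```

Using a generator keeps the loop compatible with a streaming transport: each yielded pair could be serialized and written to the SSE stream as it is produced.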
Mixin-Based Agent Design
The Agent class is composed from focused mixins:
| Mixin | Responsibility |
|---|---|
| PromptMixin | Prompt building and context aggregation |
| CapabilityMixin | Planning, reflection, and retrieval features |
| PersistenceMixin | Database read/write for agent state |
| Executable | The streaming execution protocol |
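
As a rough illustration of the composition in the table, the sketch below builds an Agent class from the four named bases. Only the class names come from the source; every method (`build_prompt`, `retrieve`, `save_state`, `execute`) and attribute is an assumption made for the example, and the in-memory dict stands in for a real database.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


class PromptMixin:
    def build_prompt(self, user_message: str) -> str:
        return f"{self.system_prompt}\n\nUser: {user_message}"


class CapabilityMixin:
    def retrieve(self, query: str) -> List[str]:
        # Placeholder retrieval: a real version would query a knowledge base.
        return [doc for doc in self.documents if query.lower() in doc.lower()]


class PersistenceMixin:
    def save_state(self, store: Dict[str, dict]) -> None:
        # A dict stands in for the database in this sketch.
        store[self.name] = {"system_prompt": self.system_prompt}


class Executable:
    def execute(self, user_message: str) -> Iterator[Tuple[str, str]]:
        raise NotImplementedError


class Agent(PromptMixin, CapabilityMixin, PersistenceMixin, Executable):
    def __init__(self, name: str, system_prompt: str, documents: Sequence[str] = ()):
        self.name = name
        self.system_prompt = system_prompt
        self.documents = list(documents)

    def execute(self, user_message: str) -> Iterator[Tuple[str, str]]:
        # Each mixin contributes one concern; Agent only wires them together.
        yield ("START", "")
        yield ("DELTA", self.build_prompt(user_message))
        yield ("END", "")


agent = Agent("helper", "Be brief.", ["Refund policy: 30 days"])
store: Dict[str, dict] = {}
agent.save_state(store)
print(agent.retrieve("refund"), store)
```

Keeping each concern in its own mixin means a mixin can be tested or replaced in isolation, at the cost of the mixins sharing attributes such as `system_prompt` implicitly through `self`.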
Service Registry
Framework services are accessed through a central ServiceRegistry.
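
The registry's interface is not shown in the source. One common way to build such a registry is to map names to factories and cache the instances they produce; the sketch below takes that approach. The `register` and `get` methods and the `"llm"` service name are hypothetical.

```python
from typing import Any, Callable, Dict


class ServiceRegistry:
    """Illustrative registry: lazily builds and caches named services."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Any]] = {}
        self._instances: Dict[str, Any] = {}

    def register(self, name: str, factory: Callable[[], Any]) -> None:
        self._factories[name] = factory
        # Drop any cached instance so a re-registration takes effect.
        self._instances.pop(name, None)

    def get(self, name: str) -> Any:
        if name not in self._instances:
            try:
                factory = self._factories[name]
            except KeyError:
                raise LookupError(f"no service registered as {name!r}") from None
            self._instances[name] = factory()
        return self._instances[name]


registry = ServiceRegistry()
built = []


def make_llm_client() -> dict:
    # Record each construction so the caching behaviour is visible.
    built.append("llm")
    return {"kind": "llm-client"}


registry.register("llm", make_llm_client)
first = registry.get("llm")
second = registry.get("llm")
print(first is second, len(built))
```

Resolving services by name at call time, rather than importing them directly, makes it straightforward to swap in fakes during tests.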
Multi-Tenant Isolation
Each tenant gets isolated:
- Database schema (or separate database)
- Tool registry state — agents can have different tool sets per tenant
- LLM configuration — model, temperature, and token limits per tenant
- Knowledge bases — documents and embeddings are tenant-scoped
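
One way to picture the per-tenant state listed above is a context object that is looked up once per request. The sketch below is an assumption-laden illustration: `TenantContext`, `LLMConfig`, `TenantRegistry`, the `tenant_<id>` schema naming, and the model names are all hypothetical, and knowledge bases are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class LLMConfig:
    model: str
    temperature: float = 0.7
    max_tokens: int = 1024


@dataclass
class TenantContext:
    tenant_id: str
    llm: LLMConfig
    tools: Set[str] = field(default_factory=set)

    @property
    def schema(self) -> str:
        # Assumed naming scheme for per-tenant database schemas.
        return f"tenant_{self.tenant_id}"


class TenantRegistry:
    def __init__(self) -> None:
        self._tenants: Dict[str, TenantContext] = {}

    def add(self, context: TenantContext) -> None:
        self._tenants[context.tenant_id] = context

    def get(self, tenant_id: str) -> TenantContext:
        # Unknown tenants fail loudly instead of falling back to shared state.
        try:
            return self._tenants[tenant_id]
        except KeyError:
            raise LookupError(f"unknown tenant: {tenant_id!r}") from None


tenants = TenantRegistry()
tenants.add(TenantContext("acme", LLMConfig("model-a", temperature=0.2), {"search"}))
tenants.add(TenantContext("globex", LLMConfig("model-b", max_tokens=512)))

# Changing one tenant's tool set leaves the other untouched.
tenants.get("acme").tools.add("calculator")
print(tenants.get("acme").schema, sorted(tenants.get("acme").tools))
print(tenants.get("globex").schema, sorted(tenants.get("globex").tools))
```

Giving each context its own tool set (via `default_factory`) rather than a shared default avoids one tenant's configuration leaking into another's.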