Batch Processing

AgentFlow provides a batch processing pipeline for running LLM operations across large datasets. Classify, extract, summarize, or enrich hundreds of items — synchronously for small batches or asynchronously with downloadable results for large runs. Each operation has a dedicated endpoint with purpose-built request models, built-in system prompts, and structured output enforcement — so you don’t need to write prompt engineering boilerplate.

Operations

Classification / Labeling

Categorize items against a defined label set. Supports both single-label (exactly one category) and multi-label (all applicable categories) modes.
POST /api/completions/label
{
  "items": [
    { "content": "Customer says product is amazing but delivery was slow" },
    { "content": "Terrible experience, nothing works as advertised" },
    { "content": "Good value for the price, would recommend" }
  ],
  "categories": ["positive", "negative", "mixed"],
  "multi_label": false
}
Response (single-label):
{
  "results": [
    { "label": "mixed", "confidence": 0.85 },
    { "label": "negative", "confidence": 0.95 },
    { "label": "positive", "confidence": 0.90 }
  ],
  "model": "openai/gpt-4.1-mini"
}
Response (multi-label with "multi_label": true):
{
  "results": [
    { "labels": ["positive", "mixed"], "confidence": 0.82 }
  ]
}
The system prompt, JSON schema, and output format are all generated automatically from your categories list. Override them with system_prompt and user_prompt_template for full control.
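A minimal Python client sketch for the label endpoint, using only the standard library. The base URL is a placeholder for your deployment, and any auth headers your instance requires are omitted; the helper names are our own, not part of AgentFlow.

```python
import json
import urllib.request

API_BASE = "https://api.example.com"  # placeholder; substitute your deployment


def build_label_request(texts, categories, multi_label=False):
    """Assemble the JSON body for POST /api/completions/label."""
    return {
        "items": [{"content": t} for t in texts],
        "categories": categories,
        "multi_label": multi_label,
    }


def classify(texts, categories, multi_label=False):
    """POST the request and return the parsed response (needs a live server)."""
    body = json.dumps(build_label_request(texts, categories, multi_label)).encode()
    req = urllib.request.Request(
        f"{API_BASE}/api/completions/label",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_label_request(
    ["Good value for the price"], ["positive", "negative", "mixed"]
)
```

The same request/response shape applies to the other dedicated endpoints; only the operation-specific fields change.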

Summarization

Generate concise summaries with customizable instructions.
POST /api/completions/summarize
{
  "items": [
    { "content": "Full transcript of 45-minute customer call..." }
  ],
  "summary_instructions": "Summarize the key decisions, action items, and any risk signals. Keep under 200 words.",
  "temperature": 0.3
}
Response:
{
  "results": [
    { "result": "The call covered three main topics: 1) Renewal pricing..." }
  ],
  "model": "openai/gpt-4.1-mini"
}
By default, summaries are returned as plain text. Pass response_format: {"type": "json_object"} for structured JSON summaries.

Extraction

Pull structured fields from unstructured text. Returns JSON by default.
POST /api/completions/extract
{
  "items": [
    { "content": "Hi, I'm Sarah Chen, VP of Engineering at Acme Corp. We're looking to migrate 500 seats by Q3." }
  ],
  "extraction_instructions": "Extract: person_name, title, company, seat_count, timeline",
  "temperature": 0.0
}
Response:
{
  "results": [
    {
      "person_name": "Sarah Chen",
      "title": "VP of Engineering",
      "company": "Acme Corp",
      "seat_count": 500,
      "timeline": "Q3"
    }
  ]
}

Enrichment

Add computed fields to existing data using a typed field schema. Each field defines its name, type, description, and optional constraints (valid values, examples).
POST /api/completions/enrich
{
  "items": [
    {
      "content": "Enterprise SaaS company, 500 employees, Series C",
      "metadata": { "company": "Acme Corp", "industry": "Technology" }
    }
  ],
  "fields": [
    {
      "name": "company_segment",
      "type": "categorical",
      "description": "Market segment",
      "valid_values": ["enterprise", "mid-market", "smb"]
    },
    {
      "name": "growth_stage",
      "type": "categorical",
      "description": "Company growth stage",
      "valid_values": ["seed", "early", "growth", "mature"]
    },
    {
      "name": "estimated_arr",
      "type": "numeric",
      "description": "Estimated ARR range in millions"
    }
  ]
}
Response:
{
  "results": [
    {
      "company_segment": "enterprise",
      "growth_stage": "growth",
      "estimated_arr": 25
    }
  ]
}
Enrichment supports runtime_categories for dynamic valid-value overrides and include_context to inject item metadata into the prompt.
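Because enrichment field schemas are repetitive to write by hand, a small builder helps keep them consistent. This is a sketch of our own (the `field` helper is not an AgentFlow API); it emits the same dict shape shown in the request above.

```python
def field(name, type_, description, valid_values=None):
    """Build one entry for the `fields` array of /api/completions/enrich.
    Optional keys (valid_values) are included only when provided."""
    spec = {"name": name, "type": type_, "description": description}
    if valid_values is not None:
        spec["valid_values"] = valid_values
    return spec


fields = [
    field("company_segment", "categorical", "Market segment",
          valid_values=["enterprise", "mid-market", "smb"]),
    field("growth_stage", "categorical", "Company growth stage",
          valid_values=["seed", "early", "growth", "mature"]),
    field("estimated_arr", "numeric", "Estimated ARR range in millions"),
]
```

Numeric fields simply omit valid_values, matching the estimated_arr example above.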

General-purpose completions

The /run endpoint accepts any combination of system prompt, user template, and response format — use it for custom operations that don’t fit the dedicated endpoints.
POST /api/completions/run
{
  "items": [
    { "content": "Customer feedback text..." }
  ],
  "system_prompt": "You analyze customer feedback for product teams.",
  "user_prompt_template": "Analyze this feedback:\n\n{content}\n\nIdentify: 1) Feature requests 2) Pain points 3) Competitive mentions",
  "response_format": { "type": "json_object" },
  "model": "openai/gpt-4.1-mini",
  "temperature": 0.2
}
Each item can include a metadata dict with arbitrary fields accessible in the prompt template via {field_name}.
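To preview what the server will likely send to the model, you can render the template locally. This sketch assumes the documented semantics — {content} plus one placeholder per metadata key — using Python's str.format; the actual server-side templating may differ in edge cases.

```python
def render_template(template, item):
    """Fill {placeholders} from an item's content and metadata fields,
    mirroring the documented behavior of user_prompt_template."""
    values = {"content": item.get("content", ""), **item.get("metadata", {})}
    return template.format(**values)


item = {
    "content": "Great product",
    "metadata": {"customer_tier": "enterprise"},
}
rendered = render_template(
    "Feedback from {customer_tier} customer:\n{content}", item
)
```

This is handy for catching missing metadata keys (a KeyError here means the template references a field the item doesn't carry).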

Asynchronous batches

For larger datasets (over 50 items), submit async jobs and poll for results.

Submit a batch

POST /api/batches
{
  "items": [...],
  "operation": "extract",
  "extraction_instructions": "Extract company name, role, and sentiment",
  "model": "openai/gpt-4.1-mini"
}
{ "batch_id": "batch_abc123", "status": "queued", "total_items": 500 }

Check progress

GET /api/batches/{batch_id}
{
  "batch_id": "batch_abc123",
  "status": "processing",
  "total_items": 500,
  "completed_items": 342,
  "progress_percentage": 68.4
}
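A polling loop over the status endpoint can be sketched as follows. The terminal status names are assumptions (the doc shows only "queued" and "processing"); the fetcher is injected so any HTTP client can be used.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}  # assumed terminal states


def wait_for_batch(fetch_status, batch_id, interval=5.0, timeout=3600.0):
    """Poll GET /api/batches/{batch_id} via the injected fetch_status callable
    until the batch reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(batch_id)
        if status["status"] in TERMINAL_STATUSES:
            return status
        time.sleep(interval)
    raise TimeoutError(f"batch {batch_id} still running after {timeout}s")
```

Injecting fetch_status also makes the loop trivial to test with canned responses instead of a live server.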

Download results

GET /api/batches/{batch_id}/output?format=json
Supported formats: json, jsonl, csv. Results can be downloaded for batches of up to 10,000 items.
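If you download in json format but need CSV downstream, the flat result rows convert directly. A sketch using the stdlib csv module, assuming every row shares the same keys (as with extraction and enrichment output):

```python
import csv
import io


def results_to_csv(results):
    """Flatten a list of flat result dicts (as returned by ?format=json)
    into CSV text, using the first row's keys as the header."""
    if not results:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(results[0]))
    writer.writeheader()
    writer.writerows(results)
    return buf.getvalue()
```

For rows with varying keys (e.g. mixed operations), collect the union of keys first and pass restval="" to DictWriter.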

Cancel a batch

POST /api/batches/{batch_id}/cancel

File upload

Upload files directly for batch processing — CSV, JSON, or plain text up to 100MB (up to 1M items):
POST /api/files
Content-Type: multipart/form-data

file=@customers.csv
operation=classify
categories=enterprise,mid-market,smb
The file is parsed, each row becomes an item, and the batch runs asynchronously with the same output/polling flow as /api/batches.

API summary

Dedicated operation endpoints (sync, up to 50 items)

Method  Path                          Operation
POST    /api/completions/label        Classification (single or multi-label)
POST    /api/completions/summarize    Summarization with custom instructions
POST    /api/completions/extract      Structured extraction to JSON
POST    /api/completions/enrich       Schema-driven field enrichment
POST    /api/completions/run          General-purpose (any prompt template)

Async batch endpoints

Method  Path                             Description
POST    /api/batches                     Submit async batch job
GET     /api/batches/{batch_id}          Get batch status and progress
GET     /api/batches/{batch_id}/output   Download results (JSON/JSONL/CSV)
POST    /api/batches/{batch_id}/cancel   Cancel a running batch
POST    /api/files                       Upload file for batch processing