## Overview
The V2 pipeline is the core data flow in Contox. It captures raw events from your AI coding tools, stores them, and processes them through an AI-powered enrichment pipeline to extract structured memory items.
## Pipeline diagram

```mermaid
sequenceDiagram
    participant Tool as MCP / CLI / VSCode
    participant Ingest as POST /api/v2/ingest
    participant Store as Event Store
    participant Blob as Blob Storage
    participant Session as Session Manager
    participant User as Dashboard User
    participant Enrich as Enrichment Pipeline
    participant Brain as Project Brain
    Tool->>Ingest: Send event (HMAC signed)
    Ingest->>Store: Store raw event
    Ingest->>Blob: Upload blobs (diffs, content)
    Ingest->>Session: Create or reuse session (4h window)
    Ingest-->>Tool: 202 Accepted (eventId, sessionId)
    Note over Session: Events accumulate in session
    User->>Enrich: Click "Generate Memory"
    Enrich->>Store: Fetch session events
    Enrich->>Enrich: Chunk events (10 per chunk)
    Enrich->>Enrich: AI extracts memory items
    Enrich->>Enrich: Quote verification
    Enrich->>Enrich: Dedup against existing items
    Enrich->>Brain: Store approved items
    Enrich-->>User: Enrichment complete
    Note over Brain: Brain document updated
    Tool->>Brain: GET /api/v2/brain
    Brain-->>Tool: Assembled markdown + metadata
```
## Stage 1: Event capture
Events flow into Contox from three client-side sources:
| Source | Transport | Events |
|---|---|---|
| MCP server | V2 ingest (HMAC) | Session saves, context updates, memory operations |
| CLI | V2 ingest (HMAC) | Session saves, scan results, git digests |
| VS Code extension | V2 ingest (HMAC) | Session saves, file changes, git activity |
All events are sent to `POST /api/v2/ingest` using HMAC-SHA256 authentication. See HMAC Signing for details.
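A minimal signing sketch in Python. The exact header name and body canonicalization are defined on the HMAC Signing page; `X-Signature` below is a placeholder, not the real header.

```python
import hashlib
import hmac
import json

def sign_event(secret: str, body: bytes) -> str:
    """Compute the HMAC-SHA256 hex digest of the request body."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

def build_request(secret: str, event: dict) -> dict:
    # Serialize once so the signed bytes match the bytes sent on the wire.
    body = json.dumps(event, separators=(",", ":")).encode()
    return {
        "url": "/api/v2/ingest",
        "body": body,
        # Placeholder header name -- see the HMAC Signing page for the real scheme.
        "headers": {"X-Signature": sign_event(secret, body)},
    }
```

Signing the serialized bytes (rather than re-serializing on both ends) avoids signature mismatches caused by key ordering or whitespace differences.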
### Event structure

```json
{
  "event": "session_save",
  "payload": {
    "summary": "Implemented JWT authentication",
    "changes": [
      {
        "category": "implementation",
        "title": "JWT auth middleware",
        "content": "Added auth middleware at src/middleware/auth.ts"
      }
    ]
  }
}
```
## Stage 2: Storage and session management
When an event is ingested:

- **Raw event stored** -- The complete event payload is persisted immediately.
- **Blobs uploaded** -- Large data (diffs, file contents) is stored in blob storage.
- **Session association** -- The event is linked to an existing session, or a new session is created.
### Session windowing
Sessions use a 4-hour window. If an event arrives within 4 hours of the last event in an active session, it is added to that session. Otherwise, a new session is created. This groups related work together naturally.
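The windowing rule can be sketched as a single comparison. This is an illustration of the 4-hour rule only; the actual service also scopes sessions by project and client.

```python
from datetime import datetime, timedelta
from typing import Optional

WINDOW = timedelta(hours=4)

def assign_session(last_event_at: Optional[datetime], now: datetime) -> str:
    """Decide whether an incoming event joins the active session or opens a new one.

    Sketch of the 4-hour windowing rule: reuse the session if the gap since
    its last event is within the window, otherwise create a fresh session.
    """
    if last_event_at is not None and now - last_event_at <= WINDOW:
        return "reuse"
    return "create"
```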
## Stage 3: Enrichment
Enrichment is user-triggered. When you click **Generate Memory** in the dashboard (or call `POST /api/v2/sessions/[id]/enrich`), the pipeline begins:
### 3a. Chunking
Events are grouped into chunks of 10 for processing. This keeps each AI call focused and manageable.
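The grouping itself is a plain fixed-size split; a sketch:

```python
def chunk_events(events: list, size: int = 10) -> list:
    """Split a session's events into fixed-size chunks for enrichment.

    The final chunk may be smaller than `size` when the event count
    is not an exact multiple of it.
    """
    return [events[i:i + size] for i in range(0, len(events), size)]
```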
### 3b. AI extraction
Each chunk is processed by an AI model that extracts structured memory items. The model receives:
- The event data (summaries, changes, diffs)
- The existing project brain (for context)
- A V16 JSON schema defining the expected output format
The AI model tier depends on your plan:
| Plan | Model |
|---|---|
| Free / Personal | Small |
| Team / Business / Enterprise | Medium |
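The V16 schema itself is not reproduced here. Purely as an illustration, an extracted memory item might carry fields like the following -- every field name in this sketch is an assumption, not the actual schema:

```python
# Hypothetical item shape -- field names are illustrative, not the real V16 schema.
REQUIRED_FIELDS = {"category", "title", "content", "evidence"}

example_item = {
    "category": "implementation",
    "title": "JWT auth middleware",
    "content": "Auth middleware lives at src/middleware/auth.ts",
    "evidence": "Added auth middleware at src/middleware/auth.ts",
}

def is_well_formed(item: dict) -> bool:
    """Check that an extracted item carries every required field."""
    return REQUIRED_FIELDS <= item.keys()
```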
### 3c. Quote verification
Every extracted item is verified against its source evidence: each claim in a memory item must be traceable to actual content in the session's events. Items whose quotes cannot be found in the source are rejected, keeping hallucinated claims out of the brain.
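In its simplest form, this check is a verbatim containment test. A sketch, assuming evidence is compared case-insensitively against raw event text (the production check may be more sophisticated):

```python
def verify_quotes(item_evidence: str, source_events: list) -> bool:
    """Accept an item only if its quoted evidence appears verbatim
    (ignoring case and surrounding whitespace) in some source event."""
    needle = item_evidence.strip().lower()
    return any(needle in event.lower() for event in source_events)
```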
### 3d. Deduplication
New items are compared against existing memory items to detect duplicates. Duplicate items are merged, preserving the most confident and most recent information.
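A minimal sketch of the merge, keyed on a normalized title. The last-write-wins rule here is a stand-in for the real confidence- and recency-aware merge:

```python
def dedup(existing: list, new_items: list) -> list:
    """Merge new items into existing ones, collapsing duplicates by title.

    Sketch only: on a collision the newer item replaces the older one,
    approximating the "most confident, most recent" merge rule.
    """
    by_key = {item["title"].strip().lower(): item for item in existing}
    for item in new_items:
        by_key[item["title"].strip().lower()] = item
    return list(by_key.values())
```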
### 3e. Drift check
Existing brain items are checked for consistency with new evidence. If new events contradict an existing memory item, the item is flagged for review.
## Stage 4: Brain assembly
After enrichment, approved items are assembled into the project brain document. The brain is served via `GET /api/v2/brain` with ETag caching and token budgeting.
See Brain Assembly for the complete assembly process.
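A sketch of the conditional-GET behavior, assuming the ETag is derived by hashing the assembled document (the actual derivation and the token budgeting are covered in Brain Assembly):

```python
import hashlib
from typing import Optional

def etag_for(brain_markdown: str) -> str:
    """Derive a strong ETag from the assembled brain content (assumed scheme)."""
    return '"' + hashlib.sha256(brain_markdown.encode()).hexdigest()[:16] + '"'

def respond(brain_markdown: str, if_none_match: Optional[str]) -> int:
    """Return 304 Not Modified when the client's cached copy is current."""
    return 304 if if_none_match == etag_for(brain_markdown) else 200
```

Clients that send `If-None-Match` with their cached ETag skip re-downloading an unchanged brain.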
## Monitoring the pipeline
Track pipeline progress in the dashboard:

- Go to **Sessions**
- Click the session being enriched
- View the **Jobs** tab for stage-by-stage progress
Each stage shows its status (queued, processing, completed, failed) and duration.
## Next steps
- **Enrichment** -- Deep dive into the enrichment process
- **Brain Assembly** -- How items become the brain document
- **V2 Ingest API** -- Ingest endpoint reference