Overview
The codebase scanner walks your project filesystem, extracts structural information, and generates approximately 15-20 hierarchical sub-contexts that give the AI an instant understanding of your project. This saves significant time compared to the AI reading every file individually.
What it scans
The scanner extracts information from multiple sources:
| Category | What is extracted |
|---|---|
| Routes | API routes, page routes, middleware |
| Components | UI components, their props, and relationships |
| Hooks | Custom hooks with their parameters and return types |
| Stores | State management stores and their shapes |
| Libraries | Utility libraries and helper functions |
| Dependencies | Package dependencies from package.json |
| Configuration | Config files (tsconfig, eslint, tailwind, etc.) |
| Documentation | README files and inline documentation |
Running the scanner
Via CLI
contox scan
The scanner runs in the current directory by default. Specify a different directory:
contox scan --directory /path/to/project
Via MCP
Use the contox_scan tool in your MCP session:
contox_scan(directory: "/path/to/project")
Dry run
Preview what the scanner would generate without writing to the API:
contox scan --dry-run
Filesystem traversal
The scanner respects standard exclusion patterns:
Excluded directories
- node_modules
- .next
- .nuxt
- dist
- build
- .git
- coverage
- .cache
- __pycache__
File size limit
Files larger than 32KB are skipped. This avoids processing generated files, bundled assets, or lock files.
Gitignore support
The scanner respects .gitignore rules. Files and directories listed in your .gitignore are excluded from scanning.
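The traversal rules above (excluded directories, the 32KB size limit) can be sketched as follows. This is a simplified illustration, not the scanner's actual implementation, and it omits .gitignore handling, which requires full gitignore pattern matching:

```python
import os

# Directories the scanner skips entirely (from the list above)
EXCLUDED_DIRS = {
    "node_modules", ".next", ".nuxt", "dist", "build",
    ".git", "coverage", ".cache", "__pycache__",
}
MAX_FILE_SIZE = 32 * 1024  # 32KB file size limit

def walk_project(root):
    """Yield paths of files the scanner would consider."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip oversized files (generated files, bundled assets, lock files)
            if os.path.getsize(path) <= MAX_FILE_SIZE:
                yield path
```

Pruning `dirnames` in place is the key trick: excluded trees are never entered at all, which is what keeps scans fast even in large repositories.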
Generated sub-contexts
The scanner generates sub-contexts mapped to schema keys under root/scan/:
| Schema key | Content |
|---|---|
root/scan/routes | API and page routes with methods and paths |
root/scan/components | Component names, file paths, and prop types |
root/scan/hooks | Custom hooks with parameters and return types |
root/scan/stores | State stores and their data shapes |
root/scan/libs | Utility libraries and functions |
root/scan/deps | Package dependencies with versions |
root/scan/config | Configuration files and their key settings |
root/scan/docs | Documentation files and summaries |
Additional sub-contexts may be generated depending on the project structure (e.g., middleware, types, test structure).
Content-hash deduplication
The scanner uses content hashing to avoid creating duplicate sub-contexts:
- Each generated sub-context's content is hashed
- Before writing, the hash is compared to the existing sub-context's hash
- If the hashes match, the sub-context is skipped (no unnecessary update)
- If the hashes differ, the sub-context is updated with new content
This makes repeated scans efficient and safe to run frequently.
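The steps above can be sketched roughly like this. It is a minimal illustration; the actual hash function and storage used by the scanner are assumptions:

```python
import hashlib

def content_hash(text: str) -> str:
    """Stable hash of a sub-context's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def sync_sub_context(existing: dict, key: str, new_content: str) -> str:
    """Write a sub-context only when its content actually changed.

    `existing` maps schema keys (e.g. "root/scan/routes") to stored hashes.
    Returns "skipped" or "updated".
    """
    new_hash = content_hash(new_content)
    if existing.get(key) == new_hash:
        return "skipped"   # identical content: no unnecessary update
    existing[key] = new_hash
    return "updated"       # hash differs (or key is new): write new content
```

Because unchanged content always produces the same hash, re-running a scan on an unmodified project results in no writes at all.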
When to scan
- After initial project setup -- Run a scan to give the AI its first understanding of the project
- After major refactors -- When the project structure changes significantly
- After adding new features -- When new routes, components, or modules are added
- Periodically -- Run a scan weekly or after significant changes to keep the brain current
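For the periodic case, a scan can be scheduled, for example via cron. This is a sketch: it assumes contox is installed and on PATH, and that /path/to/project is your project root:

```
# Run a scan every Monday at 09:00
0 9 * * 1  cd /path/to/project && contox scan
```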
Example output
A scan of a Next.js project might generate:
## Routes (root/scan/routes)
- GET /api/auth/login
- POST /api/auth/register
- GET /api/projects
- POST /api/projects
- GET /api/projects/[id]
- PATCH /api/projects/[id]
- DELETE /api/projects/[id]
## Components (root/scan/components)
- LoginForm (src/components/auth/login-form.tsx)
- SignupForm (src/components/auth/signup-form.tsx)
- ProjectCard (src/components/projects/project-card.tsx)
- Sidebar (src/components/layout/sidebar.tsx)
## Hooks (root/scan/hooks)
- useAuth (src/hooks/use-auth.ts) - Authentication state and methods
- useProject (src/hooks/use-project.ts) - Project data fetching
Viewing scan results in the Dashboard
After running a scan, browse the generated sub-contexts from the Memory page:
- Switch to the Brain tab
- Look for schema keys prefixed with root/scan/ -- these are the scanner-generated contexts
- Expand each node (routes, components, hooks, etc.) to see the extracted information
- Each scan item shows its content, state (draft by default), and tier (Layer 3 / archive)
- Approve items you want included in the active brain, or leave them as draft for reference
Scan results are also visible in the Search tab via semantic search, so you can query things like "what components exist" or "which API endpoints are available".
Scan data is classified as Layer 3 (archive) by default, meaning it does not appear in the brain document. It is always accessible via semantic search (contox_search or contox_context_pack). If you want specific scan items in the brain, approve them from the dashboard.
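For example, following the call style shown for contox_scan above, a semantic query might look like this (the exact parameter name is an assumption):

```
contox_search(query: "which API endpoints are available")
```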
Genesis Scan vs CLI Scanner
Contox offers two ways to analyze your codebase: the CLI Scanner (contox scan) and Genesis Scan. They serve different purposes and complement each other.
Comparison
| | CLI Scanner (contox scan) | Genesis Scan |
|---|---|---|
| How it works | Walks the local filesystem, extracts structural information | Fetches repo from GitHub, runs AI analysis via Gemini |
| Analysis depth | Structural extraction (routes, components, hooks, deps) | Deep AI analysis across 7 specialized layers |
| Output | ~15--20 hierarchical sub-contexts | Dozens of scored findings with file references |
| AI cost | None (no AI calls) | Credits consumed for Gemini API calls |
| Speed | Seconds | Minutes (depends on repo size) |
| Security audit | No | Yes (OWASP-based, optional) |
| Importance scoring | No | Yes (1--5 stars per finding) |
| Selective mode | No (always full scan) | Yes (only re-analyze changed files) |
| Where it runs | Locally via CLI or MCP | Cloud via dashboard or API |
When to use the CLI Scanner
- Quick orientation -- You need a fast structural overview of a new project
- Frequent updates -- You want to re-scan after every feature without AI cost
- Local-only projects -- The codebase is not on GitHub
- CI/CD pipelines -- Automated scans in build pipelines where speed matters
When to use Genesis Scan
- Deep analysis -- You need the AI to understand business logic, conventions, and architecture
- Security review -- You want OWASP-based vulnerability detection
- Onboarding -- A new team member needs comprehensive project understanding
- Major milestones -- After significant refactors or before releases
Using them together
The CLI Scanner and Genesis Scan are designed to complement each other:
- CLI Scanner gives structure -- Run contox scan to populate the brain with routes, components, hooks, and dependencies. This gives the AI a structural map of the project.
- Genesis Scan gives understanding -- Run a Genesis Scan to add deep insights about business logic, architecture patterns, coding conventions, and security. This gives the AI the "why" behind the code.
- Keep both current -- Run CLI scans frequently (they are free and fast). Run Genesis scans at key milestones or when you need a fresh deep analysis.
Start with a CLI scan for immediate structure, then follow up with a Genesis Scan for deep analysis. The CLI scan data helps Genesis produce better findings because the brain already has structural context.
Next steps
- Genesis Scan -- How Genesis Scan works in detail
- Genesis Dashboard -- Using the Genesis UI
- Git Digest -- Complement scans with git history
- Brain Assembly -- How scan data appears in the brain
- Best Practices -- When and how often to scan