Overview
The codebase scanner walks your project filesystem, extracts structural information, and generates approximately 15-20 hierarchical sub-contexts that give the AI an instant understanding of your project. This saves significant time compared to the AI reading every file individually.
What it scans
The scanner extracts information from multiple sources:
| Category | What is extracted |
|---|---|
| Routes | API routes, page routes, middleware |
| Components | UI components, their props, and relationships |
| Hooks | Custom hooks with their parameters and return types |
| Stores | State management stores and their shapes |
| Libraries | Utility libraries and helper functions |
| Dependencies | Package dependencies from package.json |
| Configuration | Config files (tsconfig, eslint, tailwind, etc.) |
| Documentation | README files and inline documentation |
Running the scanner
Via CLI
contox scan
The scanner runs in the current directory by default. Specify a different directory:
contox scan --directory /path/to/project
Via MCP
Use the contox_scan tool in your MCP session:
contox_scan(directory: "/path/to/project")
Dry run
Preview what the scanner would generate without writing to the API:
contox scan --dry-run
Filesystem traversal
The scanner respects standard exclusion patterns:
Excluded directories
- node_modules
- .next
- .nuxt
- dist
- build
- .git
- coverage
- .cache
- __pycache__
File size limit
Files larger than 32KB are skipped. This avoids processing generated files, bundled assets, or lock files.
Gitignore support
The scanner respects .gitignore rules. Files and directories listed in your .gitignore are excluded from scanning.
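The traversal rules above (excluded directories, the 32KB size limit) can be sketched as follows. This is a simplified illustration, not the scanner's actual implementation, and it omits .gitignore handling, which requires full gitignore pattern matching:

```python
import os

# Directories the scanner skips entirely (from the list above)
EXCLUDED_DIRS = {
    "node_modules", ".next", ".nuxt", "dist", "build",
    ".git", "coverage", ".cache", "__pycache__",
}
MAX_FILE_SIZE = 32 * 1024  # 32KB file size limit

def walk_project(root):
    """Yield paths of files the scanner would consider."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip oversized files (generated files, bundled assets, lock files)
            if os.path.getsize(path) <= MAX_FILE_SIZE:
                yield path
```

Pruning `dirnames` in place is the key trick: excluded trees are never entered at all, which is what keeps scans fast even in large repositories.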
Generated sub-contexts
The scanner generates sub-contexts mapped to schema keys under root/scan/:
| Schema key | Content |
|---|---|
root/scan/routes | API and page routes with methods and paths |
root/scan/components | Component names, file paths, and prop types |
root/scan/hooks | Custom hooks with parameters and return types |
root/scan/stores | State stores and their data shapes |
root/scan/libs | Utility libraries and functions |
root/scan/deps | Package dependencies with versions |
root/scan/config | Configuration files and their key settings |
root/scan/docs | Documentation files and summaries |
Additional sub-contexts may be generated depending on the project structure (e.g., middleware, types, test structure).
Content-hash deduplication
The scanner uses content hashing to avoid creating duplicate sub-contexts:
- Each generated sub-context's content is hashed
- Before writing, the hash is compared to the existing sub-context's hash
- If the hashes match, the sub-context is skipped (no unnecessary update)
- If the hashes differ, the sub-context is updated with new content
This makes repeated scans efficient and safe to run frequently.
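The steps above can be sketched roughly like this. It is a minimal illustration; the actual hash function and storage used by the scanner are assumptions:

```python
import hashlib

def content_hash(text: str) -> str:
    """Stable hash of a sub-context's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def sync_sub_context(existing: dict, key: str, new_content: str) -> str:
    """Write a sub-context only when its content actually changed.

    `existing` maps schema keys (e.g. "root/scan/routes") to stored hashes.
    Returns "skipped" or "updated".
    """
    new_hash = content_hash(new_content)
    if existing.get(key) == new_hash:
        return "skipped"   # identical content: no unnecessary update
    existing[key] = new_hash
    return "updated"       # hash differs (or key is new): write new content
```

Because unchanged content always produces the same hash, re-running a scan on an unmodified project results in no writes at all.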
When to scan
- After initial project setup -- Run a scan to give the AI its first understanding of the project
- After major refactors -- When the project structure changes significantly
- After adding new features -- When new routes, components, or modules are added
- Periodically -- Run a scan weekly or after significant changes to keep the brain current
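For the periodic case, a scan can be scheduled, for example via cron. This is a sketch: it assumes contox is installed and on PATH, and that /path/to/project is your project root:

```
# Run a scan every Monday at 09:00
0 9 * * 1  cd /path/to/project && contox scan
```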
Example output
A scan of a Next.js project might generate:
## Routes (root/scan/routes)
- GET /api/auth/login
- POST /api/auth/register
- GET /api/projects
- POST /api/projects
- GET /api/projects/[id]
- PATCH /api/projects/[id]
- DELETE /api/projects/[id]
## Components (root/scan/components)
- LoginForm (src/components/auth/login-form.tsx)
- SignupForm (src/components/auth/signup-form.tsx)
- ProjectCard (src/components/projects/project-card.tsx)
- Sidebar (src/components/layout/sidebar.tsx)
## Hooks (root/scan/hooks)
- useAuth (src/hooks/use-auth.ts) - Authentication state and methods
- useProject (src/hooks/use-project.ts) - Project data fetching
Viewing scan results in the Dashboard
After running a scan, browse the generated sub-contexts from the Memory page:
- Switch to the Brain tab
- Look for schema keys prefixed with root/scan/ -- these are the scanner-generated contexts
- Expand each node (routes, components, hooks, etc.) to see the extracted information
- Each scan item shows its content, state (draft by default), and tier (Layer 3 / archive)
- Approve items you want included in the active brain, or leave them as draft for reference
Scan results are also visible in the Search tab via semantic search, so you can query things like "what components exist" or "which API endpoints are available".
Scan data is classified as Layer 3 (archive) by default, meaning it does not appear in the brain document. It is always accessible via semantic search (contox_search or contox_context_pack). If you want specific scan items in the brain, approve them from the dashboard.
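For example, following the call style shown for contox_scan above, a semantic query might look like this (the exact parameter name is an assumption):

```
contox_search(query: "which API endpoints are available")
```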
Genesis Scan vs CLI Scanner
Contox offers two ways to analyze your codebase: the CLI Scanner (contox scan) and Genesis Scan. They serve different purposes and complement each other.
Comparison
| | CLI Scanner (contox scan) | Genesis Scan |
|---|---|---|
| How it works | Walks the local filesystem, extracts structural information | Fetches repo from GitHub, runs AI analysis via Gemini |
| Analysis depth | Structural extraction (routes, components, hooks, deps) | Deep AI analysis across 7 specialized layers |
| Output | ~15--20 hierarchical sub-contexts | Dozens of scored findings with file references |
| AI cost | None (no AI calls) | Credits consumed for Gemini API calls |
| Speed | Seconds | Minutes (depends on repo size) |
| Security audit | No | Yes (OWASP-based, optional) |
| Importance scoring | No | Yes (1--5 stars per finding) |
| Selective mode | No (always full scan) | Yes (only re-analyze changed files) |
| Where it runs | Locally via CLI or MCP | Cloud via dashboard or API |
When to use the CLI Scanner
- Quick orientation -- You need a fast structural overview of a new project
- Frequent updates -- You want to re-scan after every feature without AI cost
- Local-only projects -- The codebase is not on GitHub
- CI/CD pipelines -- Automated scans in build pipelines where speed matters
When to use Genesis Scan
- Deep analysis -- You need the AI to understand business logic, conventions, and architecture
- Security review -- You want OWASP-based vulnerability detection
- Onboarding -- A new team member needs comprehensive project understanding
- Major milestones -- After significant refactors or before releases
Using them together
The CLI Scanner and Genesis Scan are designed to complement each other:
- CLI Scanner gives structure -- Run contox scan to populate the brain with routes, components, hooks, and dependencies. This gives the AI a structural map of the project.
- Genesis Scan gives understanding -- Run a Genesis Scan to add deep insights about business logic, architecture patterns, coding conventions, and security. This gives the AI the "why" behind the code.
- Keep both current -- Run CLI scans frequently (they are free and fast). Run Genesis scans at key milestones or when you need a fresh deep analysis.
Start with a CLI scan for immediate structure, then follow up with a Genesis Scan for deep analysis. The CLI scan data helps Genesis produce better findings because the brain already has structural context.
Next steps
- Genesis Scan -- How Genesis Scan works in detail
- Genesis Dashboard -- Using the Genesis UI
- Git Digest -- Complement scans with git history
- Brain Assembly -- How scan data appears in the brain
- Best Practices -- When and how often to scan