Overview
The git digest feature reads your repository's commit history and provides structured evidence for the enrichment pipeline. It tracks progress using SHA-based ranges, ensuring each digest captures only new commits since the last session.
How it works
The git digest reads commits from your local git repository and packages them as evidence for enrichment in four steps:
- Determine range -- Find the starting SHA from the last session's sourceRef field
- Read commits -- Walk the commit log from the starting SHA to HEAD
- Extract diffs -- Capture code changes for each commit (up to 3000 characters per commit)
- Package evidence -- Structure the data for the enrichment pipeline
SHA-based range tracking
Dates can be unreliable because of timezone differences and clock drift, so git digest uses SHA-based ranges instead:
- The base SHA is read from the last session entry's sourceRef field
- The head SHA is the current HEAD of the repository
- Only commits between these two SHAs are included
This ensures:
- No commits are missed between sessions
- No commits are processed twice
- The digest is deterministic and reproducible
If no previous session exists, the digest captures the most recent commits up to the configured limit.
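The range selection described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function name build_log_args is hypothetical, and applying the limit only when no base SHA exists is an assumption, since the page does not say how the limit interacts with an existing range. The sketch relies only on git's standard base..HEAD range syntax and the --max-count option of git log.

```python
from typing import List, Optional


def build_log_args(base_sha: Optional[str], limit: int = 20) -> List[str]:
    """Build a `git log` argument list for one digest run (hypothetical helper)."""
    args = ["git", "log", "--format=%H"]
    if base_sha:
        # `base..HEAD` selects commits reachable from HEAD but not from base,
        # so the base commit itself is excluded and nothing is processed twice.
        args.append(f"{base_sha}..HEAD")
    else:
        # Assumption: with no previous session, fall back to the newest
        # `limit` commits reachable from HEAD.
        args.append(f"--max-count={limit}")
        args.append("HEAD")
    return args


print(build_log_args("abc123def456"))
print(build_log_args(None, limit=5))
```

The resulting list could be passed to subprocess.run to obtain the commit SHAs in the range.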
Running git digest
Via CLI
contox git-digest
Via MCP
Use the contox_git_digest tool:
contox_git_digest(directory: "/path/to/repo")
Options
| Option | Default | Description |
|---|---|---|
| directory | Current directory | Path to the git repository root |
| limit | 20 | Maximum number of commits to return |
| mode | first-parent | Commit traversal mode |
Traversal modes
first-parent (default)
Follows only the first parent of merge commits. This produces a clean "shipping journal" showing the main branch history without merge noise.
contox git-digest --mode first-parent
all
Includes all commits, including those within merged branches. This provides an exhaustive history but can be noisier.
contox git-digest --mode all
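To make the difference between the two modes concrete, here is a small sketch that walks an in-memory commit graph. The graph, the commit names, and the walk function are all hypothetical, and the visiting order for the all mode is a simple depth-first order rather than the date-based order git itself would produce.

```python
from typing import Dict, List

# Hypothetical history: M merges feature commits F1 and F2 into main (A, B).
# Each commit maps to its parents; the first entry is the first parent.
PARENTS: Dict[str, List[str]] = {
    "M": ["B", "F2"],
    "B": ["A"],
    "F2": ["F1"],
    "F1": ["A"],
    "A": [],
}


def walk(parents: Dict[str, List[str]], head: str, mode: str = "first-parent") -> List[str]:
    """Return the commits visited from head under the given traversal mode."""
    if mode not in ("first-parent", "all"):
        raise ValueError(f"unknown mode: {mode}")
    seen = set()
    order: List[str] = []
    stack = [head]
    while stack:
        sha = stack.pop()
        if sha in seen:
            continue
        seen.add(sha)
        order.append(sha)
        next_parents = parents.get(sha, [])
        if mode == "first-parent":
            # Ignore everything a merge brought in from other branches.
            next_parents = next_parents[:1]
        # Push in reverse so the first parent is visited first.
        stack.extend(reversed(next_parents))
    return order


print(walk(PARENTS, "M", "first-parent"))
print(walk(PARENTS, "M", "all"))
```

In first-parent mode the feature commits F1 and F2 never appear; only the merge commit M stands in for them, which is what keeps the history free of merge noise.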
Diff capture
For each commit, the digest captures:
- SHA -- The full commit hash
- Message -- The commit message
- Author -- Who made the commit
- Timestamp -- When the commit was made
- Files changed -- List of modified files with change type (added, modified, deleted)
- Diff stats -- Lines added and removed per file
- Smart patches -- Actual code changes, truncated to 3000 characters per commit
The 3000-character diff limit per commit balances detail against token budget. The most significant changes are prioritized.
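A character budget like this can be applied in several ways. The sketch below shows one simple option, assuming plain truncation that prefers to end on a complete line; it does not model how the most significant changes are prioritized, which this page does not specify. The function name truncate_patch is hypothetical.

```python
def truncate_patch(patch: str, limit: int = 3000) -> str:
    """Cut a patch to at most `limit` characters (hypothetical helper)."""
    if len(patch) <= limit:
        return patch
    cut = patch[:limit]
    last_newline = cut.rfind("\n")
    # Prefer ending on a complete diff line when one fits inside the budget;
    # otherwise fall back to a hard cut at the limit.
    return cut[:last_newline] if last_newline > 0 else cut


long_patch = "+line\n" * 1000  # 6000 characters of added lines
short = truncate_patch(long_patch)
print(len(long_patch), len(short))
```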
WIP evidence
The git digest also captures work-in-progress evidence:
- Uncommitted changes -- Modified files that have not been committed
- Staged changes -- Files staged for the next commit
- Untracked files -- New files not yet added to git
This WIP evidence provides additional context about what the developer was working on during the session.
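One way to gather these three categories is to parse the output of git status --porcelain, where each line starts with a two-character status (index, then working tree) followed by the path. Whether the digest actually uses this command is an assumption; the sketch below only illustrates the classification, and the function name parse_porcelain is hypothetical. Renamed entries (reported as "old -> new") are not specially handled.

```python
from typing import Dict, List


def parse_porcelain(output: str) -> Dict[str, List[str]]:
    """Classify `git status --porcelain` (v1) lines into WIP categories."""
    evidence: Dict[str, List[str]] = {"modified": [], "staged": [], "untracked": []}
    for line in output.splitlines():
        if len(line) < 4:
            continue
        index_status, worktree_status, path = line[0], line[1], line[3:]
        if index_status == "?" and worktree_status == "?":
            evidence["untracked"].append(path)
            continue
        if index_status == "!":
            continue  # ignored files, only listed when --ignored is passed
        if index_status != " ":
            evidence["staged"].append(path)  # change recorded in the index
        if worktree_status != " ":
            evidence["modified"].append(path)  # change not yet staged
    return evidence


sample = (
    " M src/routes/api.ts\n"
    "?? src/utils/helpers.ts\n"
    "A  src/new.ts\n"
    "MM src/both.ts\n"
)
print(parse_porcelain(sample))
```

A file can appear under both staged and modified when part of its changes are staged and further edits were made afterwards, as with src/both.ts in the sample.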
Output structure
{
  "commits": [
    {
      "sha": "abc123def456",
      "message": "feat: add JWT authentication middleware",
      "author": "Jane Doe",
      "timestamp": "2025-01-20T14:00:00Z",
      "files": [
        { "path": "src/middleware/auth.ts", "status": "added", "additions": 45, "deletions": 0 },
        { "path": "src/lib/jwt.ts", "status": "added", "additions": 32, "deletions": 0 }
      ],
      "patch": "diff --git a/src/middleware/auth.ts b/src/middleware/auth.ts\n..."
    }
  ],
  "headSha": "abc123def456",
  "baseSha": "789ghi012jkl",
  "wipEvidence": {
    "modified": ["src/routes/api.ts"],
    "staged": [],
    "untracked": ["src/utils/helpers.ts"]
  }
}
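As an illustration of consuming this structure, the following sketch loads a shortened copy of the example above and computes a few totals. The field names come from the example; the summarize function and the shape of its result are hypothetical.

```python
import json
from typing import Any, Dict

SAMPLE = """
{
  "commits": [
    {
      "sha": "abc123def456",
      "message": "feat: add JWT authentication middleware",
      "files": [
        {"path": "src/middleware/auth.ts", "status": "added", "additions": 45, "deletions": 0},
        {"path": "src/lib/jwt.ts", "status": "added", "additions": 32, "deletions": 0}
      ]
    }
  ],
  "headSha": "abc123def456",
  "baseSha": "789ghi012jkl",
  "wipEvidence": {
    "modified": ["src/routes/api.ts"],
    "staged": [],
    "untracked": ["src/utils/helpers.ts"]
  }
}
"""


def summarize(digest: Dict[str, Any]) -> Dict[str, int]:
    """Reduce a digest to headline counts (hypothetical helper)."""
    # One entry per file per commit, so a file touched by two commits counts twice.
    files = [f for commit in digest["commits"] for f in commit["files"]]
    wip = digest["wipEvidence"]
    return {
        "commits": len(digest["commits"]),
        "files_changed": len(files),
        "additions": sum(f["additions"] for f in files),
        "deletions": sum(f["deletions"] for f in files),
        "wip_files": sum(len(paths) for paths in wip.values()),
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```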
Next steps
- V2 Pipeline -- How git digest fits into the event flow
- Enrichment -- How digest evidence is processed
- Codebase Scanner -- Complement git history with project structure