Add corgea deps offline inventory (scan/graph/explain/diff/sbom/policy) by juangaitanv · Pull Request #93 · Corgea/cli

juangaitanv · 2026-05-28T16:29:14Z

Summary

Chunk 2 of the dependency tooling split (#89). Stacked on #92 (chunk1/harness) — review/merge that first; base will retarget to main once #92 lands.

Adds the fully offline corgea deps inventory surface for npm, Python, and Java:

Detects manifests + lockfiles, builds the resolved dependency graph.
Evaluates a pinning policy (DEP rules) with table / JSON / SARIF / CycloneDX output.
Subcommands: scan, graph, explain, diff, sbom, policy.
No token, no config, no network — wired into main.rs with no auth/token check.

Out of scope (chunk 3)

The network surface stays deferred: deps verify, registry freshness, --check-cve / vuln-api client, the vuln-api-stub binary, and the npm/pip/etc. install wrappers.

Coverage

83 unit + integration tests. Overall line coverage 13% → 36% (well above the 13% floor from chunk 1).

Test plan

CI green once Add ./harness quality contract + CI coverage gate #92 merges and this retargets to main
Local: ./harness ci passes (clippy strict, fmt, tests + coverage)
corgea deps scan / graph / explain / diff / sbom / policy run with no token or config present
Offline contract holds: no network calls in any deps subcommand

Offline dependency inventory for npm, Python, and Java: detects manifests and lockfiles, builds the resolved graph, and evaluates a pinning policy (DEP rules) with table/JSON/SARIF/CycloneDX output. Fully offline — no token or network. Carved out of #89 as chunk2 (stacks on the chunk1 harness branch). Excludes the network surface deferred to chunk3: `deps verify`, registry freshness, --check-cve / vuln-api, the vuln-api-stub binary, and the npm/pip/etc. install wrappers. Wires `corgea deps <subcommand>` into main.rs with no auth/token check. Adds 83 unit + integration tests; overall line coverage 13% -> 36%.

cursor

Stale comment

cursor · 2026-05-28T16:43:23Z

+    inv.findings.iter().any(|f| f.severity.at_least(sev))
+}
+
+fn scan_base_ref(_path: &str, _base: &str) -> Result<crate::deps::Inventory, DepsError> {


This cannot ship as the implementation for corgea deps diff: both _path and _base are ignored and the base inventory is always empty. Evidence: the diff command scans HEAD, then calls this stub, so even corgea deps diff --base HEAD reports every current dependency as added and never reports real removals/changes. Impact: CI/policy checks based on dependency diffs are unusable and --fail-on-new will be evaluated against the wrong baseline. Fix by materializing/scanning the requested git ref for the requested path (for example via a temporary worktree or reading files from the git object database), then add CLI tests where comparing against the same ref yields an empty diff and a real base/head change yields only the actual additions/removals/updates.

cursor · 2026-05-28T16:43:23Z

+            out_format,
+            out_file,
+        } => {
+            let inv = scan(Path::new(&path), &Policy::default())?;


The CLI never loads the policy file that corgea deps policy init creates. Evidence: every command path calls scan(..., &Policy::default()), so a user can set .corgea/deps.yml to disable fail_on_wildcard or adjust other policy settings and corgea deps scan will still use the hard-coded defaults. Impact: the advertised policy feature produces false positives/false negatives in CI and users cannot rely on the generated config. Fix by loading .corgea/deps.yml from the scan root when present, failing closed on invalid YAML, and passing that Policy through scan/graph/explain/diff/sbom; add CLI coverage proving a generated/edited policy changes scan findings.

cursor · 2026-05-28T16:43:23Z

+            let output = match format {
+                "json" => to_json(&inv).to_string(),
+                "sarif" => to_sarif(&inv).to_string(),
+                _ => {


Unknown output formats are silently accepted as table output. Evidence: out_format is an unrestricted Option<String>, and this wildcard arm handles every value other than json/sarif; with --out-format typo --out-file out, the code prints the table to stdout, writes the empty output string to the file, and exits successfully. Impact: automation expecting JSON/SARIF can receive empty or human-formatted output without any failure signal. Fix by making the match exhaustive for table, json, and sarif, returning an error for anything else (or use clap value_parser), and add negative CLI tests for invalid --out-format.

cursor · 2026-05-28T16:43:23Z

+    }
+}
+
+fn should_fail(inv: &crate::deps::Inventory, threshold: &str) -> bool {


Invalid severity thresholds disable the failure gate instead of failing the command. Evidence: Severity::parse(threshold) returning None makes should_fail return false, so corgea deps scan --fail-on hihg exits 0 even when high findings exist. Impact: a typo in CI silently bypasses the intended dependency policy gate. Fix by validating --fail-on during argument handling (or returning Result<bool, DepsError> here) and exiting non-zero with a clear error for unsupported severities; add a CLI test for an invalid threshold.

cursor · 2026-05-28T16:43:23Z

+            for c in &diff.changed {
+                println!("  ~ {} {} -> {}", c.name, c.from, c.to);
+            }
+            if fail_on_new.is_some() && !head.findings.is_empty() {


--fail-on-new ignores both "new" and the supplied threshold. Evidence: the value is only checked with is_some(), and the exit decision uses !head.findings.is_empty() instead of findings introduced by the diff and at/above the requested severity. Impact: once the base diff is implemented, this will still fail builds for pre-existing findings below the requested threshold and will also accept invalid threshold strings. Fix by parsing fail_on_new as a severity, comparing base vs head findings to identify newly introduced findings, and failing only for new findings meeting that threshold.

cursor · 2026-05-28T16:43:23Z

+                }
+            }
+        }
+        if let Some(express) = packages.get("node_modules/express") {


The npm graph/explain implementation is fixture-specific: it only reads dependency declarations from the root package and node_modules/express. Evidence: dependencies of any other package in package-lock.json never get declared or parent set, and the transitive edge builder only adds edges when lp.parent is present. Impact: corgea deps graph and corgea deps explain are wrong for almost every npm project whose transitive dependencies are not under Express; affected transitive packages appear as disconnected or as fallback root paths. Fix by iterating every lockfile package entry, resolving each entry's dependencies map to node_modules/<dep> keys, and adding edges/declared constraints generically; add a fixture where a non-Express direct dependency has a transitive dependency and assert explain shows the real path.

cursor · 2026-05-28T16:43:23Z

+            },
+            lock_integrity: None,
+        });
+        if name == "urllib3" {


The Poetry graph has the same hard-coded fixture behavior: only urllib3 is connected as a transitive dependency of requests. Evidence: parse_poetry_lock returns only package names/versions, and this branch creates an edge only when name == "urllib3"; every other [package.dependencies] relationship in a real poetry.lock is ignored. Impact: corgea deps graph/explain cannot accurately explain Python dependency paths, and policy/path reporting will be misleading outside the one test fixture. Fix by parsing dependency tables for every package in poetry.lock and building edges from each package to its declared dependencies; add tests with at least two different parent packages and transitive dependencies.

cursor · 2026-05-28T16:48:45Z

+            out_format,
+            out_file,
+        } => {
+            let inv = scan(Path::new(&path), &Policy::default())?;


Blocking: scans never load the policy file that policy init creates. This path passes Policy::default() directly, and the same pattern is used for graph, explain, diff, and sbom, while Policy::from_yaml is never called by the CLI.

Impact: edits to .corgea/deps.yml have no effect, so users cannot disable or tune DEP rules and CI can fail on findings that policy explicitly allowed.

Fix: load <scan path>/.corgea/deps.yml when present, pass that Policy to every scan command, return an error for invalid YAML, and add a CLI test where disabling DEP004 changes the scan output/exit code.

cursor · 2026-05-28T16:48:45Z

+            for c in &diff.changed {
+                println!("  ~ {} {} -> {}", c.name, c.from, c.to);
+            }
+            if fail_on_new.is_some() && !head.findings.is_empty() {


Blocking: --fail-on-new does not compare against the base inventory and ignores the supplied severity threshold. As written, any non-empty head.findings exits 1, including pre-existing findings and findings below the requested threshold.

Impact: this cannot be used as a CI gate for “new high findings only”; existing low/medium findings in HEAD will break builds, while the option value is effectively unused.

Fix: parse the option into Severity, diff stable finding keys against base_inv.findings, fail only for new findings at or above the threshold, and reject invalid threshold values.

cursor · 2026-05-28T16:48:45Z

+}
+
+fn should_fail(inv: &crate::deps::Inventory, threshold: &str) -> bool {
+    let Some(sev) = Severity::parse(threshold) else {


Blocking: an invalid --fail-on value silently disables the gate. If Severity::parse returns None, should_fail returns false, so corgea deps scan --fail-on hihg exits 0 even when high/critical findings exist.

Impact: a typo in CI configuration bypasses the dependency policy gate.

Fix: validate --fail-on in run_inner and return an error/non-zero exit for unknown severities; add a CLI test for an invalid threshold.

cursor · 2026-05-28T16:48:45Z

+    inv.findings.iter().any(|f| f.severity.at_least(sev))
+}
+
+fn scan_base_ref(_path: &str, _base: &str) -> Result<crate::deps::Inventory, DepsError> {


Blocking: deps diff --base never scans the requested base ref. scan_base_ref ignores both arguments and returns an empty Inventory, so the diff always compares HEAD against an empty graph.

Impact: every current dependency is reported as added, removals and version changes from the real base are missed, and any CI behavior built on this diff is incorrect.

Fix: materialize the requested git ref in a temporary tree/worktree and scan that path, returning an error if the ref cannot be loaded; add a CLI test where a dependency changes or is removed between base and HEAD.

cursor · 2026-05-28T16:48:45Z

+    rel_manifest: &str,
+) -> Result<(), DepsError> {
+    let pkg = read_json(manifest_path)?;
+    let lock_path = file_in_dir(ctx.detected, dir, DepFileKind::NpmLockfile);


Blocking: detected Yarn and pnpm lockfiles are ignored. detect.rs classifies yarn.lock and pnpm-lock.yaml, but npm scanning only looks for DepFileKind::NpmLockfile here.

Impact: projects using Yarn or pnpm are scanned as if they have no usable lockfile: resolved versions/integrity are lost and direct dependencies can be falsely reported as stale/missing.

Fix: either parse YarnLockfile/PnpmLockfile as lock candidates or treat them as explicitly unsupported with a clear finding/error instead of silently ignoring them; add fixtures for both formats.

cursor · 2026-05-28T16:48:45Z

+        if !key.starts_with("node_modules/") {
+            continue;
+        }
+        let name = key


Blocking: scoped npm package names are corrupted for transitive lockfile entries. For a key like node_modules/@types/node, this extraction keeps only node, which then creates pkg:npm/node@... and collides with the unscoped node package.

Impact: the inventory, SBOM, findings lookup, and graph can all point at the wrong package for scoped dependencies.

Fix: when stripping node_modules/, preserve two path segments for scoped packages (@scope/name), and mirror that behavior in PackageId::name(); add a lockfile fixture containing a scoped transitive package.

cursor · 2026-05-28T16:48:45Z

+                }
+            }
+        }
+        if let Some(express) = packages.get("node_modules/express") {


Blocking: npm dependency edges are hardcoded to express. parse_npm_lock only copies dependency declarations from the root package and node_modules/express; every other package’s dependencies object is ignored, leaving lp.parent unset and no edge emitted.

Impact: deps explain paths and CycloneDX dependency relationships are wrong for almost all real npm lockfiles.

Fix: iterate every packages entry, read its dependencies, resolve each child package key, and emit parent-to-child edges generically; remove the express special case and test with a non-Express transitive dependency.

cursor · 2026-05-28T16:48:45Z

+            let dir = parent_dir(&f.path);
+            let has_lock = ctx.detected.iter().any(|x| {
+                parent_dir(&x.path) == dir
+                    && matches!(x.kind, DepFileKind::PoetryLock | DepFileKind::UvLock)


Blocking: uv.lock suppresses requirements scanning but is never parsed. This branch treats UvLock as a lockfile, so requirements.txt is skipped, and there is no scanner elsewhere that consumes DepFileKind::UvLock.

Impact: Python projects with uv.lock plus requirements can produce an empty or severely incomplete inventory and miss DEP findings entirely.

Fix: implement uv.lock parsing before treating it as handled, or stop including UvLock in this has_lock check until support exists; add a fixture for pyproject.toml + uv.lock + requirements.txt.

juangaitanv closed this May 28, 2026

juangaitanv reopened this May 28, 2026

cursor Bot requested changes May 28, 2026

View reviewed changes

Conversation

juangaitanv commented May 28, 2026

Summary

Out of scope (chunk 3)

Coverage

Test plan

Uh oh!

cursor Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cursor Bot left a comment •

edited

Loading