fix(sandbox): canonicalize HTTP request-targets before L7 policy evaluation by johntmyers · Pull Request #878 · NVIDIA/OpenShell

johntmyers · 2026-04-17T20:15:43Z

Summary

The L7 REST proxy evaluates OPA path rules against the raw HTTP request-target and forwards the raw header block byte-for-byte to the upstream. The upstream normalizes .., %2e%2e, and // before dispatching, so a compromised agent inside the sandbox can bypass any path-based allow rule — e.g. GET /public/../secret matches /public/** at the proxy but the upstream serves /secret.
The inference-routing detection (normalize_inference_path) had the same normalization gap: it only stripped scheme+authority and preserved dot-segments verbatim, so the same class of payload could also dodge inference routing.
This PR introduces a single URL-path canonicalizer used at every L7 enforcement site. The canonical path is both the input to OPA and the bytes written onto the wire — closing the parser-differential between the policy engine and the upstream.

Related Issue

OS-99

Changes

crates/openshell-sandbox/src/l7/path.rs (new)
- canonicalize_request_target(target, opts) -> Result<(CanonicalPath, Option<String> /* raw query */), CanonicalizeError>.
- Percent-decodes (uppercase hex on re-emit), resolves . / .. per RFC 3986 5.2.4 (rejects underflow rather than silently clamping to /), collapses doubled slashes, strips trailing ;params, handles origin-form and absolute-form targets, enforces a 4 KiB length bound, rejects fragments / raw non-ASCII / control bytes / encoded slashes (unless explicitly opted in per-endpoint via allow_encoded_slash).
- Returns a rewritten flag so callers know whether the outbound request line needs to be rebuilt.
crates/openshell-sandbox/src/l7/rest.rs
- parse_http_request canonicalizes the request-target before returning the L7Request. If the canonical form differs from the raw input, the request line in raw_header is rebuilt via rewrite_request_line_target so the upstream dispatches on the same bytes the policy evaluated.
- parse_query_params is now pub(crate) so proxy.rs can reuse it.
crates/openshell-sandbox/src/proxy.rs
- The forward (plain HTTP proxy) L7 evaluation site canonicalizes the request-target. Non-canonical input is denied with 400 Bad Request and an OCSF NetworkActivity Fail event instead of being silently passed through.
- normalize_inference_path is now a thin wrapper over the canonicalizer (falls back to the raw path on canonicalization error so the normal forward path can reject properly).
crates/openshell-sandbox/data/sandbox-policy.rego
- Comment documenting the invariant: input.request.path is pre-canonicalized, so rules must not attempt defensive matching against .. or %2e%2e.
architecture/sandbox.md
- Adds the new l7/path.rs module to the component table.

Testing

cargo test -p openshell-sandbox --lib — 513 tests pass, including 24 new tests in l7::path::tests (dot segments, percent-encoded dot segments, double-slash collapse, encoded-slash reject/opt-in, null/control byte rejection, fragment rejection, absolute-form stripping, legitimate %20 / %40 / %C3%A9 round-tripping, mixed-case percent normalization, length bound, non-ASCII rejection, query-suffix separation, rewritten-flag semantics).
cargo test -p openshell-sandbox — 518 tests across all targets pass.
mise run pre-commit — passes (lint, format, license headers, rust tests).
Regression tests for the documented bypass payloads:
- /public/../secret → /secret
- /public/%2e%2e/secret → /secret
- /public/%2E%2E%2Fsecret → EncodedSlashNotAllowed
- //public/../secret → /secret
- /public/./../secret → /secret
- /public/%00/secret → NullOrControlByte
- /.. → TraversalAboveRoot
Legitimate traffic continues to pass: /files/hello%20world.txt, /users/caf%C3%A9, /v1/chat/completions?stream=true, paths with : and @ in segments.

Trade-offs

%2F inside a segment is rejected by default. Operators who need it for artifact-registry-style APIs can opt in per-endpoint via allow_encoded_slash: true (the plumbing is in place — endpoint config wire-up can land in a follow-up once there's a consumer).
Non-canonical requests are always hard-rejected regardless of EnforcementMode::Audit — audit mode is for policy-tuning, not for tolerating protocol violations.
%20 and similar legitimate percent-encoded bytes are preserved in the canonical form (only unreserved pchars get decoded), so policy patterns that currently match /files/hello%20world.txt continue to work.

Checklist

Conventional Commits format
No secrets or credentials in diff
Scoped to the issue at hand (no unrelated refactors)
Architecture and rego documentation updated

…uation The L7 REST proxy evaluated OPA path rules against the raw request-target and forwarded the raw header block byte-for-byte to the upstream. The upstream normalized `..`, `%2e%2e`, and `//` before dispatching, so a compromised agent inside the sandbox could bypass any path-based allow rule by prefixing the blocked path with an allowed one (e.g. `GET /public/../secret` matches `/public/**` at the proxy but the upstream serves `/secret`). The inference-routing detection had the same normalization gap via `normalize_inference_path`, which only stripped scheme+authority and preserved dot-segments verbatim. - Add `crates/openshell-sandbox/src/l7/path.rs` with `canonicalize_request_target`. Percent-decodes (with uppercase hex re-emission), resolves `.` / `..` segments per RFC 3986 5.2.4, collapses doubled slashes, strips trailing `;params`, rejects fragments / raw non-ASCII / control bytes / encoded slashes (unless explicitly enabled per-endpoint), and returns a `rewritten` flag so callers know when to rewrite the outbound request line. - Wire the canonicalizer into `rest.rs::parse_http_request`. The canonical path is the `target` on `L7Request` that OPA sees, and when the canonical form differs from the raw input the request line is rebuilt in `raw_header` so the upstream dispatches on the exact bytes the policy evaluated. - Canonicalize the forward (plain HTTP proxy) path at `proxy.rs`'s second L7 evaluation site. A non-canonical request-target is rejected with 400 Bad Request and an OCSF `NetworkActivity` Fail event rather than silently passed through. - Replace the old `normalize_inference_path` body with a call to the canonicalizer. On canonicalization error the raw path is returned (so inference detection simply misses and the normal forward path handles and rejects). - Document the invariant in `sandbox-policy.rego`: `input.request.path` is pre-canonicalized so rules must not attempt defensive matching against `..` or `%2e%2e`. - Architecture doc (`architecture/sandbox.md`) lists the new module. - 24 new unit tests cover dot segments, percent-encoded dot segments, double-slash collapse, encoded-slash reject/opt-in, null/control byte rejection, legitimate percent-encoded bytes round-tripping, mixed-case percent normalization, fragment rejection, absolute-form stripping, empty / length-bounded / non-ASCII handling, and regression payloads for the specific bypasses above. Tracks OS-99.

copy-pr-bot · 2026-04-17T20:15:46Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

mjamiv · 2026-04-19T03:30:32Z

+1 on security merit. Path-based OPA rules are load-bearing for binary-scoped network policies in real deployments, and .. / %2e%2e / // bypass on the L7 path is an active concern for anyone using path-level egress allowlisting on inference proxies. Worth landing.

johntmyers requested a review from a team as a code owner April 17, 2026 20:15

johntmyers added topic:l7 Application-layer policy and inspection work topic:security Security issues labels Apr 17, 2026

johntmyers self-assigned this Apr 17, 2026

johntmyers added the test:e2e Requires end-to-end coverage label Apr 17, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sandbox): canonicalize HTTP request-targets before L7 policy evaluation#878

fix(sandbox): canonicalize HTTP request-targets before L7 policy evaluation#878
johntmyers wants to merge 1 commit intomainfrom
fix/l7-path-canonicalization

johntmyers commented Apr 17, 2026

Uh oh!

copy-pr-bot bot commented Apr 17, 2026

Uh oh!

mjamiv commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

johntmyers commented Apr 17, 2026

Summary

Related Issue

Changes

Testing

Trade-offs

Checklist

Uh oh!

copy-pr-bot bot commented Apr 17, 2026

Uh oh!

mjamiv commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants