fix(sandbox): canonicalize HTTP request-targets before L7 policy evaluation#878
Open
johntmyers wants to merge 1 commit intomainfrom
Open
fix(sandbox): canonicalize HTTP request-targets before L7 policy evaluation#878johntmyers wants to merge 1 commit intomainfrom
johntmyers wants to merge 1 commit intomainfrom
Conversation
…uation The L7 REST proxy evaluated OPA path rules against the raw request-target and forwarded the raw header block byte-for-byte to the upstream. The upstream normalized `..`, `%2e%2e`, and `//` before dispatching, so a compromised agent inside the sandbox could bypass any path-based allow rule by prefixing the blocked path with an allowed one (e.g. `GET /public/../secret` matches `/public/**` at the proxy but the upstream serves `/secret`). The inference-routing detection had the same normalization gap via `normalize_inference_path`, which only stripped scheme+authority and preserved dot-segments verbatim. - Add `crates/openshell-sandbox/src/l7/path.rs` with `canonicalize_request_target`. Percent-decodes (with uppercase hex re-emission), resolves `.` / `..` segments per RFC 3986 5.2.4, collapses doubled slashes, strips trailing `;params`, rejects fragments / raw non-ASCII / control bytes / encoded slashes (unless explicitly enabled per-endpoint), and returns a `rewritten` flag so callers know when to rewrite the outbound request line. - Wire the canonicalizer into `rest.rs::parse_http_request`. The canonical path is the `target` on `L7Request` that OPA sees, and when the canonical form differs from the raw input the request line is rebuilt in `raw_header` so the upstream dispatches on the exact bytes the policy evaluated. - Canonicalize the forward (plain HTTP proxy) path at `proxy.rs`'s second L7 evaluation site. A non-canonical request-target is rejected with 400 Bad Request and an OCSF `NetworkActivity` Fail event rather than silently passed through. - Replace the old `normalize_inference_path` body with a call to the canonicalizer. On canonicalization error the raw path is returned (so inference detection simply misses and the normal forward path handles and rejects). - Document the invariant in `sandbox-policy.rego`: `input.request.path` is pre-canonicalized so rules must not attempt defensive matching against `..` or `%2e%2e`. - Architecture doc (`architecture/sandbox.md`) lists the new module. - 24 new unit tests cover dot segments, percent-encoded dot segments, double-slash collapse, encoded-slash reject/opt-in, null/control byte rejection, legitimate percent-encoded bytes round-tripping, mixed-case percent normalization, fragment rejection, absolute-form stripping, empty / length-bounded / non-ASCII handling, and regression payloads for the specific bypasses above. Tracks OS-99.
Contributor
|
+1 on security merit. Path-based OPA rules are load-bearing for binary-scoped network policies in real deployments, and |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
..,%2e%2e, and//before dispatching, so a compromised agent inside the sandbox can bypass any path-based allow rule — e.g.GET /public/../secretmatches/public/**at the proxy but the upstream serves/secret.normalize_inference_path) had the same normalization gap: it only stripped scheme+authority and preserved dot-segments verbatim, so the same class of payload could also dodge inference routing.Related Issue
OS-99
Changes
crates/openshell-sandbox/src/l7/path.rs(new)canonicalize_request_target(target, opts) -> Result<(CanonicalPath, Option<String> /* raw query */), CanonicalizeError>../..per RFC 3986 5.2.4 (rejects underflow rather than silently clamping to/), collapses doubled slashes, strips trailing;params, handles origin-form and absolute-form targets, enforces a 4 KiB length bound, rejects fragments / raw non-ASCII / control bytes / encoded slashes (unless explicitly opted in per-endpoint viaallow_encoded_slash).rewrittenflag so callers know whether the outbound request line needs to be rebuilt.crates/openshell-sandbox/src/l7/rest.rsparse_http_requestcanonicalizes the request-target before returning theL7Request. If the canonical form differs from the raw input, the request line inraw_headeris rebuilt viarewrite_request_line_targetso the upstream dispatches on the same bytes the policy evaluated.parse_query_paramsis nowpub(crate)soproxy.rscan reuse it.crates/openshell-sandbox/src/proxy.rsNetworkActivityFail event instead of being silently passed through.normalize_inference_pathis now a thin wrapper over the canonicalizer (falls back to the raw path on canonicalization error so the normal forward path can reject properly).crates/openshell-sandbox/data/sandbox-policy.regoinput.request.pathis pre-canonicalized, so rules must not attempt defensive matching against..or%2e%2e.architecture/sandbox.mdl7/path.rsmodule to the component table.Testing
cargo test -p openshell-sandbox --lib— 513 tests pass, including 24 new tests inl7::path::tests(dot segments, percent-encoded dot segments, double-slash collapse, encoded-slash reject/opt-in, null/control byte rejection, fragment rejection, absolute-form stripping, legitimate%20/%40/%C3%A9round-tripping, mixed-case percent normalization, length bound, non-ASCII rejection, query-suffix separation, rewritten-flag semantics).cargo test -p openshell-sandbox— 518 tests across all targets pass.mise run pre-commit— passes (lint, format, license headers, rust tests)./public/../secret→/secret/public/%2e%2e/secret→/secret/public/%2E%2E%2Fsecret→EncodedSlashNotAllowed//public/../secret→/secret/public/./../secret→/secret/public/%00/secret→NullOrControlByte/..→TraversalAboveRoot/files/hello%20world.txt,/users/caf%C3%A9,/v1/chat/completions?stream=true, paths with:and@in segments.Trade-offs
%2Finside a segment is rejected by default. Operators who need it for artifact-registry-style APIs can opt in per-endpoint viaallow_encoded_slash: true(the plumbing is in place — endpoint config wire-up can land in a follow-up once there's a consumer).EnforcementMode::Audit— audit mode is for policy-tuning, not for tolerating protocol violations.%20and similar legitimate percent-encoded bytes are preserved in the canonical form (only unreserved pchars get decoded), so policy patterns that currently match/files/hello%20world.txtcontinue to work.Checklist