
docs(k8s-proxy): add Kubernetes Proxy REST API and DaemonSet recording mode #838

Merged: khareyash05 merged 7 commits into main from docs/k8s-proxy-api-and-static-dedup on Apr 27, 2026
@officialasishkumar (Member) commented Apr 23, 2026

What has changed?

Adds the v4 Kubernetes Proxy REST API reference, documents DaemonSet as a first-class recording mode alongside Sidecar in the K8s Live Record & Replay quickstart (v3 + v4), and lands the Vale wiring needed to keep both pages green in CI.

1. New page: versioned_docs/version-4.0.0/running-keploy/k8s-proxy-api.md (+481)

Full REST API reference for the in-cluster keploy/k8s-proxy service, structured to mirror the existing Public REST API guide. Covers:

  • Why the proxy — zero-touch webhook-based agent injection, one shared-token endpoint per Deployment, namespace scoping, durable session storage, auto-replay loop, GitOps-aware self-update, and edge-side static deduplication.
  • Recording modes — Sidecar (default, agent injected via MutatingAdmissionWebhook) and DaemonSet (per-node Pod, eBPF capture, scoped by a RecordingSession CR, no application-Pod mutation, supports cluster-mode auto-replay). Both modes expose the same REST API documented on the page.
  • Authentication — how the 32-byte crypto/rand shared token is generated on every proxy Pod start, heartbeated to the Keploy API server, and retrieved by callers via JWT → /cluster/getApp?namespace=&deployment=&clusterName= (or /cluster/getApps for a fleet-wide read).
  • Response format with concrete success / 400 / 401 / 403 examples, plus an HTTP error-code table (400, 401, 403, 404, 405, 500, 503).
  • Quick start that walks /deployments/record/start (with record_config: static_dedup, enable_sampling, filters) → NDJSON /record/status → /record/stop.
  • Recording configuration field tables for record_config (filters, client_key, enable_sampling, static_dedup, custom_dedup_fields, low_latency_mode, debug, memory_limit with the 2× memory-request → memory-limit rule, secret_protection), the per-Filter shape, and Auto-replay configuration (autoReplayInterval, mongoPassword, apiTimeout, delay, globalNoise, envOverrides).
  • Endpoint reference organised by domain: health & admission, deployments, recording, replay/test, the API-server-compatible data routes mounted under /k8s-proxy/* (testcases, mocks/mappings, reports & schema coverage, saved config), the Assertion-Test Generator (ATG) routes, and proxy self-management (/proxy/update, /proxy/update/status, /proxy/shutdown, proxy log endpoints, auto-replay debug bundles).
  • Namespace scoping rules for watchNamespace and the 403 cross-namespace contract.
  • Related guides cross-links to Static Deduplication, keploy dedup, the Public REST API, the Kubernetes installation guide, and the GitOps with Argo CD page.
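
The token-retrieval route named in the Authentication bullet can be sketched as a small URL builder. This is a minimal sketch rather than a documented client: the base URL `https://api.example.invalid` and the helper name `get_app_url` are hypothetical, and only the `/cluster/getApp` path and its three query parameters come from the description above.

```python
from urllib.parse import urlencode

# Hypothetical API server address; substitute the real one for your install.
API_SERVER = "https://api.example.invalid"


def get_app_url(base: str, namespace: str, deployment: str, cluster_name: str) -> str:
    """Build the /cluster/getApp URL with the three query parameters
    listed in the PR description (namespace, deployment, clusterName)."""
    query = urlencode(
        {
            "namespace": namespace,
            "deployment": deployment,
            "clusterName": cluster_name,
        }
    )
    # Strip a trailing slash so the path is not doubled.
    return f"{base.rstrip('/')}/cluster/getApp?{query}"


if __name__ == "__main__":
    # Example values are placeholders, not names from the docs.
    print(get_app_url(API_SERVER, "shop", "cart", "prod"))
```

How the JWT is attached to the request is not spelled out in this description, so the sketch stops at the URL.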

versioned_sidebars/version-4.0.0-sidebars.json wires the new page into the API Reference group right next to running-keploy/public-api.

2. Quickstart: document DaemonSet recording mode (v3 + v4, +46/-0 each)

versioned_docs/version-{3,4}.0.0/quickstart/k8s-proxy.md previously only walked the Sidecar flow. Two additions, applied to both versions (the files were identical pre-edit):

  • "Pick a recording mode" section right above step 1, with a comparison table of when to choose Sidecar vs DaemonSet (capture mechanism, behavior on Start Recording, whether application Pods are mutated, whether they restart at recording start, and when each mode is the right fit).
  • "DaemonSet mode (optional)" subsection under step 4, with the Helm flags (daemonset.enabled=true, daemonset.crds.install=true) and the verification commands (kubectl get pods -n keploy, kubectl get crd | grep keploy.io).

Both modes drive the identical Console UI / REST API, so the existing screenshots and the rest of the quickstart stay correct verbatim — only the install command branches.

3. CI fixes for the new content

  • .vale.ini: disable Google.Units sitewide. The rule's \d+(?:s|ms|ns|min|h|d) regex matches 8s inside the token k8s (and k3s/k0s/…), and every K8s Proxy quickstart screenshot URL contains k8s-proxy/k8s_…. With filter_mode: diff_context, any diff hunk that reaches into those lines turns every K8s image src into a Vale error. Disabling is consistent with the other Google.* overrides already in .vale.ini (PassiveVoice/We/Will/Exclamation/Ellipses/Latin).
  • vale_styles/config/vocabularies/Base/accept.txt: add the domain terms used in the new prose — IPs, [Dd]edups, [Rr]ollout[s]?, [Pp]refill[s]?, [Aa]uditable, [Cc]ooldown, [Ll]iveness, MongoIDs, initialised, [Dd]aemon[Ss]et[s]?, [Cc]RD[s]?, eBPF, [Mm]utatingAdmissionWebhook, RecordingSession[s]?, ReplaySession[s]?, keploy-daemonset, keploy-agent, recordingsessions, replaysessions.
  • Em-dashes around the new prose stripped of surrounding spaces to satisfy Google.EmDash; the new quickstart sections re-flowed with Prettier 3.8.3 (api.md was already prettier-clean).
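
The Google.Units false positive described in the first bullet is easy to reproduce outside Vale. The sketch below uses Python's `re` module as a stand-in for Vale's matcher, so it only illustrates the regex behavior; the word-bounded variant `UNITS_BOUNDED` is a hypothetical contrast case, not something the PR proposes.

```python
import re

# Pattern quoted in the PR description for the Google.Units rule.
UNITS = re.compile(r"\d+(?:s|ms|ns|min|h|d)")

# Hypothetical variant with word boundaries, shown only for contrast.
UNITS_BOUNDED = re.compile(r"\b\d+(?:s|ms|ns|min|h|d)\b")


def unit_hits(pattern: re.Pattern, text: str) -> list[str]:
    """Return every substring the pattern matches in text."""
    return pattern.findall(text)


if __name__ == "__main__":
    # Without boundaries, the digit-plus-unit pair inside each token matches.
    print(unit_hits(UNITS, "k8s k3s k0s"))
    # With boundaries, the leading "k" blocks the match.
    print(unit_hits(UNITS_BOUNDED, "k8s k3s k0s"))
    # A genuine unit still matches either way.
    print(unit_hits(UNITS_BOUNDED, "timeout 30s"))
```

Because the unbounded pattern matches inside any `k8s`-style token, every line containing such a token is a candidate error once a diff hunk touches it, which is the failure mode the bullet describes.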

The original scope of this PR also included versioned_docs/version-4.0.0/keploy-cloud/static-deduplication.md. That page landed separately via #841, so this PR now ships only the K8s Proxy REST API reference, the Sidecar/DaemonSet quickstart additions, and the supporting Vale wiring (6 files, +595/-1).

Resolves keploy/enterprise#1919.

Type of change

  • New feature (non-breaking change which adds functionality).
  • Documentation update (if none of the other choices apply).

How Has This Been Tested?

  • npm install && npm run build completes cleanly against the new files.
  • Vale 3.0.3 run locally against k8s-proxy-api.md and the changed quickstart files: 0 errors, 0 warnings, 0 suggestions.
  • Prettier 3.8.3 --check clean on every changed Markdown file.
  • JSON validated for versioned_sidebars/version-4.0.0-sidebars.json.
  • Every documented endpoint, query parameter, response shape, error code, RBAC claim, and RecordConfig/AutoReplayConfig field was cross-checked against the current keploy/k8s-proxy and keploy/api-server source on main before publishing. The Helm flags and CRD names in the DaemonSet quickstart section were cross-checked against the keploy/k8s-proxy chart values.

Checklist:

  • My code follows the style guidelines of this project.
  • I have performed a self-review of my own code.

@officialasishkumar force-pushed the docs/k8s-proxy-api-and-static-dedup branch 2 times, most recently from 437bb1e to 868ebee, on April 23, 2026 12:02
Adds versioned_docs/version-4.0.0/running-keploy/k8s-proxy-api.md, a
full REST API guide for the keploy/k8s-proxy service in the v4 docs.
Mirrors running-keploy/public-api.md: authentication, response format,
quickstart, endpoint reference, and related guides.

Covers the control-plane surface exposed behind a cluster-wide shared
bearer token: recording start/stop/status, replay, reports, schema
coverage, auto-replay config, ATG sandbox routes, /k8s-proxy/* data
routes, and proxy self-update. Calls out the unique benefits versus
running the enterprise CLI directly (zero-touch webhook injection,
one API per Deployment, namespace scoping, durable session/log state,
self-updating image, static dedup at the edge).

Authentication section explains that the token is generated at proxy
Pod startup via crypto/rand, reported to the api-server through the
heartbeat channel, and retrieved by callers via the api-server route
GET /cluster/getApp (or /cluster/getApps). The shared token is not
sourced from a Helm Secret or env var, and it rotates on every Pod
restart, so programmatic callers must re-fetch it before each run.

This commit replaces the earlier draft of this file from the original
PR #838 commits, which described a DaemonSet capture mode and a
RecordingSession custom resource that do not exist in the keploy/
k8s-proxy code, plus a /k8s-proxy/mode route, a /k8s-proxy/apps/ensure
route, and a Helm-based shared-token Secret recipe that the chart does
not provision.

Also adds the page to the v4 API Reference sidebar and extends the
Vale base vocabulary with the domain terms used in the new guide.

Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
@officialasishkumar force-pushed the docs/k8s-proxy-api-and-static-dedup branch from 0d1be20 to d780b16 on April 27, 2026 06:22
@officialasishkumar (Member, Author) commented:

Force-pushed a single squashed commit (d780b16) replacing the original four commits. Highlights of the rewrite:

Squash + rebase

  • Rebased onto current main, which dropped the redundant static-deduplication.md (already merged via docs: add Static Deduplication guide #841) and resolved the accept.txt and sidebars conflicts.
  • PR is back to MERGEABLE. Net diff is now +481 / -1 across versioned_docs/version-4.0.0/running-keploy/k8s-proxy-api.md (new), the v4 sidebar, and the Vale base vocabulary.

Authentication section rewrite (was the biggest accuracy bug)

  • Removed the recipe that read <release-fullname>-shared-token Secret and KEPLOY_SHARED_TOKEN env var. The Helm chart does not provision either; the only secrets it creates are keploy-credentials (access-key) and mongodb-credentials. Auditing against keploy/k8s-proxy@048cc85, cmd/k8s/main.go:144-148 shows the token is generated at proxy Pod startup via crypto/rand and reported to api-server through the heartbeat in pkg/service/cluster/service.go:485.
  • Replaced with the actual retrieval flow: log in to api-server, then GET /cluster/getApp?namespace=&deployment=&clusterName= (or /cluster/getApps) returns sharedToken directly from latestHeartbeat.SharedToken. Routes are mounted in keploy/api-server pkg/http/routes.go:455-456 and the response shape is in pkg/http/cluster.go:389,447,520.
  • Added a paragraph explaining why the token rotates on Pod restart and why programmatic callers must re-fetch.

Stripped fabricated DaemonSet / CRD claims

  • Removed every "In DaemonSet mode" sentence and the RecordingSession custom resource claim. The repo only has sidecar/webhook injection; the in-tree comment at keploy/k8s-proxy tests/e2e/istio-graceful-attach/manifests/recordingsession.yaml:1 literally states "RecordingSession is NOT a Kubernetes CRD in this codebase."
  • Dropped the daemonset-mode record-start response example. handleRecordStart at pkg/http/handlers.go:2082 only ever returns {"record":"started","id":"<ns>-<dep>"}.
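
For callers, the single record-start response shape quoted above can be checked with a few lines. This is a minimal sketch assuming the body is exactly the JSON object shown, with `id` formed as `<ns>-<dep>`; the helper names `expected_record_id` and `is_record_started` are hypothetical.

```python
import json


def expected_record_id(namespace: str, deployment: str) -> str:
    """Compose the id in the "<ns>-<dep>" form quoted in the comment."""
    return f"{namespace}-{deployment}"


def is_record_started(body: str, namespace: str, deployment: str) -> bool:
    """Return True when body is {"record": "started", "id": "<ns>-<dep>"}
    for the given namespace and deployment."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    # Anything other than a JSON object cannot be the expected envelope.
    if not isinstance(payload, dict):
        return False
    return (
        payload.get("record") == "started"
        and payload.get("id") == expected_record_id(namespace, deployment)
    )


if __name__ == "__main__":
    # Placeholder namespace and deployment names.
    print(is_record_started('{"record":"started","id":"shop-cart"}', "shop", "cart"))
```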

Removed two non-existent routes from the endpoint table

  • GET /k8s-proxy/mode (no capture_mode literal anywhere in the codebase). The "Capture mode and health" subsection is now just "Health".
  • POST /k8s-proxy/apps/ensure (not registered in pkg/http/routes.go:141-204).

What stays as-is (verified accurate against current k8s-proxy code)

  • All /k8s-proxy/* data routes, /agent/* ATG sandbox routes, /autoreplay/debug-bundles/* routes, /proxy/update*, /logs/proxy*, /record/*, /test/* route lists.
  • RecordConfig / Filter / RecordRequest / AutoReplayConfig / ReplayRequest field tables (matched against pkg/models/http.go:35-150).
  • 2x memory-limit rule for the injected sidecar (verified at pkg/service/webhook/service.go:884).
  • 401 / 403 envelope shapes, namespace-scoping behavior, /healthz response, /agent/run/{jobID} 30s default timeout.

CI is fully green: vale, lint, prettier, deploy-preview, DCO all pass on the new commit.

@officialasishkumar changed the title from "docs: add Kubernetes Proxy REST API and Static Deduplication guides (v4)" to "docs: add Kubernetes Proxy REST API" on Apr 27, 2026
Comment thread versioned_docs/version-4.0.0/running-keploy/k8s-proxy-api.md Outdated
khareyash05 added a commit that referenced this pull request Apr 27, 2026
Two CI checks were failing on PR #838 against the new DaemonSet
content:

- Vale: flagged daemonset/CRDs/eBPF/MutatingAdmissionWebhook/
  RecordingSession/keploy-{agent,daemonset} as misspellings, plus
  Google.EmDash on " — " (em-dash with surrounding spaces).
- Prettier 3.8.3: needed reflow on the new tables / sections in
  versioned_docs/version-{3,4}.0.0/quickstart/k8s-proxy.md.

Fix:
- Add the new technical terms to vale_styles/config/vocabularies/
  Base/accept.txt (case-folding patterns where it makes sense).
- Strip the spaces around em-dashes in the lines I introduced.
- Run prettier --write on the changed quickstart files; api.md was
  already prettier-clean.
…nly framing

The k8s-proxy guide implicitly assumed Sidecar mode and split the data
routes by SaaS vs self-hosted. Two updates:

1. New "Recording modes" section right after the Base URL: state that
   the proxy supports Sidecar (default — agent injected via
   MutatingAdmissionWebhook) AND DaemonSet (eBPF capture from a
   per-node Pod, scoped by a RecordingSession CR, no application-Pod
   mutation, supports cluster-mode auto-replay). Both modes expose the
   same REST API documented here, so the rest of the guide stays
   correct without per-mode forks.

2. Drop "in SaaS mode … in self-hosted mode" framing on the data-route
   intro and the API_SERVER comment. The proxy serves /k8s-proxy/*
   directly in both deployment shapes; the SaaS/self-hosted split was
   adding noise without helping the reader pick a path.

Signed-off-by: Yash Khare <khareyash05@gmail.com>
The K8s Live Record & Replay quickstart only walked through Sidecar
mode (default — agent injected via MutatingAdmissionWebhook). Add a
"Pick a recording mode" section right above step 1 with a comparison
table covering when to choose Sidecar vs DaemonSet, and a "DaemonSet
mode (optional)" subsection under step 4 with the Helm flags to
enable it (`daemonset.enabled=true`, `daemonset.crds.install=true`)
plus the verification commands (`kubectl get pods -n keploy`,
`kubectl get crd | grep keploy.io`).

Both modes drive the identical Console UI / REST API, so the
existing screenshots and the rest of the quickstart stay correct
verbatim — only the install command branches.

Applied to both v3.0.0 and v4.0.0 quickstart files since they were
identical pre-edit.

Signed-off-by: Yash Khare <khareyash05@gmail.com>
Google.Units rule fires `\d+(?:s|ms|ns|min|h|d)` with `nonword: true`,
which matches `8s` inside the token `k8s` (and `k3s`, `k0s`, ...).
On the K8s Proxy quickstart page every screenshot URL contains
`k8s-proxy/k8s_…`, so reviewdog with `filter_mode: diff_context`
fails the PR even though the offending lines are pre-existing — once
a diff hunk reaches into them, every k8s image src becomes an error.

Disabling the rule sitewide is consistent with the other Google.*
overrides already in .vale.ini (PassiveVoice/We/Will/Exclamation/
Ellipses/Latin all set NO). For a Kubernetes-heavy docs site the
unit-spacing check is more cost than benefit.

Signed-off-by: Yash Khare <khareyash05@gmail.com>
@khareyash05 force-pushed the docs/k8s-proxy-api-and-static-dedup branch from a56f395 to 7024a2f on April 27, 2026 12:04
@khareyash05 (Member) left a comment:

LGTM

@officialasishkumar changed the title from "docs: add Kubernetes Proxy REST API" to "docs(k8s-proxy): add Kubernetes Proxy REST API and DaemonSet recording mode" on Apr 27, 2026
Two corrections to the K8s Proxy REST API guide after auditing it
against the actual k8s-proxy code:

1. Token provisioning. The doc claimed the proxy generates a 32-byte
   random value via crypto/rand on every Pod start and rotates the
   token whenever the Pod restarts. That is the **fallback** path
   in cmd/k8s/main.go (taken only when KEPLOY_SHARED_TOKEN is unset)
   — the real production path is:
     - Helm chart (charts/k8s-proxy/templates/shared-token-secret.yaml)
       generates the value once via `randAlphaNum 48` on first install.
     - Stored in a Secret `<release>-shared-token`, annotated with
       `helm.sh/resource-policy: keep` and backed by `lookup` so it
       survives upgrades.
     - k8s-proxy Deployment + DaemonSet both mount it as
       KEPLOY_SHARED_TOKEN via `secretKeyRef`.
     - Token is therefore **stable for the lifetime of the Helm
       release**, NOT per-Pod.
     - Proxy still reports the value to the API server in its first
       heartbeat (POST /cluster/status), so the existing
       /cluster/getApp retrieval flow stays accurate.
   Rewrote "How the token is provisioned" to describe the Helm path
   first and the random fallback second, and added a "(a) Read it
   directly from the Secret" path under "Retrieve the token" since
   that is often the fastest option for operators with kubectl
   access. The closing note no longer claims the token rotates on
   Pod restart.

2. Removed the closing line in §"Why the Kubernetes Proxy" that
   pointed users to the Public REST API as a single-app alternative.

Signed-off-by: Yash Khare <khareyash05@gmail.com>
Two errors that landed on PR #838 after the shared-token correction:

- Google.EmDash on the new sentence in §"How the token is provisioned"
  (" — Pod restarts and chart upgrades do not rotate it"). Strip the
  surrounding spaces around the em-dash to match the rule.
- Vale.Spelling on "heartbeated" — not in the dictionary and reads
  awkwardly anyway. Reword to "what the proxy reported in its last
  heartbeat".

Signed-off-by: Yash Khare <khareyash05@gmail.com>
@khareyash05 merged commit c352a42 into main on Apr 27, 2026
7 checks passed
@khareyash05 deleted the docs/k8s-proxy-api-and-static-dedup branch on April 27, 2026 12:26