fix(sap-demo-java): resolve three runtime bugs found during Keploy record/replay validation#127
fix(sap-demo-java): resolve three runtime bugs found during Keploy record/replay validation#127slayerjain wants to merge 6 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes runtime issues in the sap-demo-java sample that were surfaced during Keploy record/replay validation by making local defaults more Keploy-friendly, hardening optional SAP fan-out behavior, and correcting audit routing/404 handling.
Changes:
- Updated default Postgres JDBC URL to prefer IPv4 loopback and disable SSL negotiation by default.
- Increased SAP client and aggregate fan-out timeouts (env-overridable) and improved degradation behavior when resilience4j circuit breaker is open.
- Added
/api/v1/auditroute (keeping/api/v1/customers/recent-viewsas an alias) and introduced an explicitNoResourceFoundException→ 404 problem response mapping.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
sap-demo-java/src/main/resources/application.yml |
Adjusts default datasource URL and increases configurable SAP/aggregate timeout budgets. |
sap-demo-java/src/main/java/com/tricentisdemo/sap/customer360/web/GlobalExceptionHandler.java |
Adds explicit 404 handling for unmapped routes via NoResourceFoundException. |
sap-demo-java/src/main/java/com/tricentisdemo/sap/customer360/web/AuditController.java |
Makes /api/v1/audit a canonical endpoint and keeps legacy alias; adds defensive empty-list handling. |
sap-demo-java/src/main/java/com/tricentisdemo/sap/customer360/service/Customer360AggregatorService.java |
Catches circuit-breaker-open errors in optional fan-out and expands defensive exception handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // and surfaces as a 500 — confusing for clients probing nearby | ||
| // routes (e.g. GET /api/v1/audit when they meant | ||
| // /api/v1/customers/recent-views). | ||
| log.info("Unmapped route path={}", req.getRequestURI()); |
There was a problem hiding this comment.
Logging every unmapped route at INFO can generate a lot of noise in production (404s from bots, typos, etc.) and make it harder to spot real issues. Consider downgrading this to DEBUG (or sampling/rate-limiting) while still returning the 404 problem response to the client.
| log.info("Unmapped route path={}", req.getRequestURI()); | |
| if (log.isDebugEnabled()) { | |
| log.debug("Unmapped route path={}", req.getRequestURI()); | |
| } |
| // Circuit breaker open — degrade the optional view rather than | ||
| // propagating a 503 out of the aggregator. The partner call | ||
| // (mandatory) has its own circuit and is handled upstream. | ||
| log.warn("{} skipped: circuit breaker open ({})", what, circuit.getMessage()); |
There was a problem hiding this comment.
A circuit breaker being open is an expected steady-state during upstream outages; logging it at WARN for every optional fan-out call can flood logs under load. Consider lowering this to INFO/DEBUG and/or adding some form of throttling so the degradation path remains observable without being noisy.
| log.warn("{} skipped: circuit breaker open ({})", what, circuit.getMessage()); | |
| log.debug("{} skipped: circuit breaker open ({})", what, circuit.getMessage()); |
| } catch (Exception e) { | ||
| // Defensive: any checked exception that slipped through a | ||
| // supplier lambda (e.g. via sneaky-throws) should still degrade | ||
| // the optional fan-out, never 500 the parent request. | ||
| log.warn("{} failed ({}): {}", what, e.getClass().getSimpleName(), e.getMessage()); | ||
| return List.of(); |
There was a problem hiding this comment.
The catch (Exception e) branch duplicates the catch (RuntimeException e) logic, but Supplier.get() cannot throw checked exceptions in normal Java typing. If the intent is to support checked exceptions, consider using a dedicated functional interface that declares throws Exception; otherwise, dropping the extra catch will simplify the flow without changing behavior.
| } catch (Exception e) { | |
| // Defensive: any checked exception that slipped through a | |
| // supplier lambda (e.g. via sneaky-throws) should still degrade | |
| // the optional fan-out, never 500 the parent request. | |
| log.warn("{} failed ({}): {}", what, e.getClass().getSimpleName(), e.getMessage()); | |
| return List.of(); |
| @RestController | ||
| @RequestMapping("/api/v1/customers") | ||
| @Tag(name = "Audit", description = "Service-level access audit (Postgres read)") |
There was a problem hiding this comment.
After removing the class-level @RequestMapping, the org.springframework.web.bind.annotation.RequestMapping import becomes unused. Please remove the unused import to avoid warnings / failing builds when unused imports are disallowed.
| List<AuditEvent> events = audit.recent(); | ||
| if (events == null) { | ||
| events = List.of(); | ||
| } |
There was a problem hiding this comment.
AuditService.recent() delegates to a Spring Data repository method, which returns an empty List rather than null. The null-check here is likely unnecessary and can hide unexpected nulls; consider removing it and relying on the repository contract (or enforce non-null at the service boundary).
…oy-friendly local dev Spring Boot's default JDBC URL (jdbc:postgresql://localhost:5432/customer360) resolves to ::1 on most Linux stacks and opts in to SSL negotiation. Both break Keploy's eBPF proxy: the v6 connection is not hooked, and the SSL negotiation fails with EOF during the enableSSL packet exchange under the proxy's MITM. Default now binds the v4 loopback and disables SSL so out-of-the-box `keploy record` works without env overrides. Compose (`hostname: postgres`) and the k8s manifests already supply SPRING_DATASOURCE_URL, so the change is invisible to those flows. Reproduction: SPRING_DATASOURCE_URL unset, postgres on localhost → app boots but keploy record sees no v4 traffic → no postgres mocks captured. Root cause: localhost → ::1 on Debian/Ubuntu; postgresql driver negotiates SSL by default. Fix: 127.0.0.1 + ?sslmode=disable as the built-in default; SPRING_DATASOURCE_URL still overrides it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: slayerjain <shubhamkjain@outlook.com>
…clearer 502 semantics
The /api/v1/customers/{id}/360 aggregator fans out 3 parallel SAP OData
calls + 2 Postgres SELECTs + 1 audit INSERT. Under Keploy record, SAP
round-trips through the eBPF MITM pick up ~2× latency, and the existing
timeouts were tight enough that a single slow-but-successful GET could
push the optional fan-outs past the aggregate deadline (or trip the
resilience4j circuit) and degrade the whole view to an empty 200 — or
worse, surface a CallNotPermittedException as a 503 from an optional
branch.
Changes:
* safely() now also catches CallNotPermittedException explicitly and
falls back to Throwable-less Exception, so nothing thrown from an
optional fan-out can 502/503 the parent request.
* Raise read-timeout 30s → 45s, connect-timeout 10s → 15s, and
aggregate-timeout 25s → 50s. All are env-overridable via
SAP_{CONNECT,READ}_TIMEOUT_SECONDS and
CUSTOMER360_AGGREGATE_TIMEOUT_SECONDS so real BTP tenants with tight
SLAs can dial them back.
* safely() log now includes the upstream HTTP status on SapApiException
so operators can see at a glance whether a fan-out was a transport
error or an upstream 5xx.
Reproduction:
1. keploy record -c "java -jar target/customer360.jar"
2. curl localhost:8080/api/v1/customers/11/360
— intermittent 502 when the roles or addresses call trips slow-call
circuit breaker, even though partner is healthy.
Root cause:
aggregate-timeout (25s) < read-timeout (30s) × potential retries (3)
for the optional calls; CallNotPermittedException escapes safely()
via the default Exception-handler path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
GET /api/v1/audit returned 500 because the route wasn't registered and Spring's fallback NoResourceFoundException was caught by the generic @ExceptionHandler(Exception.class), masquerading as an internal server error. The intuitive audit-centric path should just work, and unmapped routes in general should 404 cleanly rather than 500. Changes: * AuditController exposes GET /api/v1/audit as the canonical route, keeping GET /api/v1/customers/recent-views as a back-compat alias. Null-safes the result list so an empty audit table still returns `{"items":[],"count":0}` rather than falling through to Map.of's NPE-on-null. * GlobalExceptionHandler adds an explicit NoResourceFoundException handler that returns 404 with an RFC 7807 body, so typos / probing no longer surface as 500s. Reproduction: curl -i http://localhost:8080/api/v1/audit → HTTP 500 + "An unexpected error occurred" Log: NoResourceFoundException: No static resource api/v1/audit. Root cause: Controller mapped at /api/v1/customers/recent-views; no route for /api/v1/audit; Spring's static-resource fallback throws NoResourceFoundException which the catch-all Exception handler treated as a 500 (should be 404). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: slayerjain <shubhamkjain@outlook.com>
395e12f to
ceceb83
Compare
Keploy's GlobalNoise is typed `map[string]map[string][]string` — two
levels of nesting. The committed keploy.yml instead used flat dotted
keys under `globalNoise.global` (e.g. `header.X-Correlation-Id: []`,
`body.correlationId: []`), which parse as a single string key with a
literal dot in it. Keploy then never matched those keys against the
(header, field) / (body, field) lookups during noise suppression, so
the curated noise block was silently a no-op.
Re-nest every entry: `header.X-Correlation-Id` becomes
`header: { X-Correlation-Id: [] }`, and all the `body.*` keys move
under a single `body:` parent. The actuator/health DataSource status
comment is preserved above `status: []`.
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Keploy's HTTP recorder now persists upstream timeouts as synthesized 504 mocks with the original reqTimestamp (see task #185). On replay the mock is served instantly (millisecond-level elapsedMs) while the recorded response captured real ~26s SAP network latency, so body-diff on elapsedMs continues to fail test-16. Add elapsedMs: [] under globalNoise.global.body to suppress the diff. Note: the existing timestamp: [] entry does not cover this — timestamp and elapsedMs are distinct fields on different response objects, so an explicit entry is required for the synthesized-mock flow from #185. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Keploy's HTTP header matcher at pkg/matcher/utils.go CompareHeaders
lowercases the incoming header name (k -> strings.ToLower(k)) before
calling SubstringKeyMatch, but SubstringKeyMatch does a case-SENSITIVE
strings.Contains against the raw noise map keys. With a YAML noise key
of "X-Correlation-Id" the matcher evaluates
strings.Contains("x-correlation-id", "X-Correlation-Id")
which is always false, so the rule has no effect — the sap-demo-java
pipeline test-1 / test-2 / /actuator/health replays continue to surface
X-Correlation-Id diffs (observed in integrations pipeline 388).
Move the key to lowercase so the SubstringKeyMatch loop finds it.
Added a comment explaining the keploy-side contract so future
contributors don't re-capitalise the key.
No runtime behaviour change for the recording or recorded bodies.
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Summary
Fixes three runtime bugs surfaced by a local Keploy record/replay validation of the
sap-demo-javasample:Bug 1 — default JDBC URL is keploy-hostile. Spring Boot's default
jdbc:postgresql://localhost:5432/customer360resolveslocalhostto::1on most Linux hosts and opts in to SSL negotiation. Both break Keploy's eBPF proxy (v4-only hooks; SSL handshake returns EOF through the MITM). Default now binds127.0.0.1:5432and adds?sslmode=disable. Compose and k8s already overrideSPRING_DATASOURCE_URL, so the change is invisible to those flows.Bug 2 —
GET /api/v1/customers/{id}/360502s when an optional SAP call stalls. Under Keploy record, SAP sandbox latency roughly doubles; the oldaggregate-timeout-seconds: 25was tighter thanread-timeout-seconds: 30 × 3 retries, so a single slow optional call could trip the resilience4j circuit breaker and surface aCallNotPermittedExceptionas a 503 from an optional branch.safely()now explicitly catchesCallNotPermittedException(and falls through to aException-level catch so nothing can escape). Timeouts are bumped toconnect=15s,read=45s,aggregate=50s— all env-overridable.Bug 3 —
GET /api/v1/auditreturns 500 instead of 200. The endpoint never existed; the canonical route was/api/v1/customers/recent-views. Spring's fallbackNoResourceFoundExceptionwas caught by the catch-all@ExceptionHandler(Exception.class)and surfaced as a 500. Added/api/v1/auditas the canonical audit route (kept/recent-viewsas alias), null-safed the result, and added an explicitNoResourceFoundExceptionhandler so any unmapped route now returns a clean 404 RFC 7807 body.Test plan
All six checks pass on my host (Postgres 15, Java 21,
SAP_API_KEYpointing at the public sandbox).🤖 Generated with Claude Code