Skip to content

fix(sap-demo-java): resolve three runtime bugs found during Keploy record/replay validation#127

Open
slayerjain wants to merge 6 commits intomainfrom
fix/sap-demo-sample-bugs
Open

fix(sap-demo-java): resolve three runtime bugs found during Keploy record/replay validation#127
slayerjain wants to merge 6 commits intomainfrom
fix/sap-demo-sample-bugs

Conversation

@slayerjain
Copy link
Copy Markdown
Member

Summary

Fixes three runtime bugs surfaced by a local Keploy record/replay validation of the sap-demo-java sample:

  • Bug 1 — default JDBC URL is keploy-hostile. Spring Boot's default jdbc:postgresql://localhost:5432/customer360 resolves localhost to ::1 on most Linux hosts and opts in to SSL negotiation. Both break Keploy's eBPF proxy (v4-only hooks; SSL handshake returns EOF through the MITM). Default now binds 127.0.0.1:5432 and adds ?sslmode=disable. Compose and k8s already override SPRING_DATASOURCE_URL, so the change is invisible to those flows.

  • Bug 2 — GET /api/v1/customers/{id}/360 502s when an optional SAP call stalls. Under Keploy record, SAP sandbox latency roughly doubles; the old aggregate-timeout-seconds: 25 was tighter than read-timeout-seconds: 30 × 3 retries, so a single slow optional call could trip the resilience4j circuit breaker and surface a CallNotPermittedException as a 503 from an optional branch. safely() now explicitly catches CallNotPermittedException (and falls through to a Exception-level catch so nothing can escape). Timeouts are bumped to connect=15s, read=45s, aggregate=50s — all env-overridable.

  • Bug 3 — GET /api/v1/audit returns 500 instead of 200. The endpoint never existed; the canonical route was /api/v1/customers/recent-views. Spring's fallback NoResourceFoundException was caught by the catch-all @ExceptionHandler(Exception.class) and surfaced as a 500. Added /api/v1/audit as the canonical audit route (kept /recent-views as alias), null-safed the result, and added an explicit NoResourceFoundException handler so any unmapped route now returns a clean 404 RFC 7807 body.

Test plan

# build
cd sap-demo-java && mvn -B -DskipTests package

# bring up postgres (docker-compose or existing container)
docker compose up -d postgres

# start the app (no env overrides — exercises the new defaults)
java -jar target/customer360.jar &

# Bug 1: verify the app binds v4 loopback out of the box
curl -sf http://localhost:8080/actuator/health | jq .status  # "UP"

# Bug 2: aggregate fan-out
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/api/v1/customers/202/360  # 200
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/api/v1/customers/11/360   # 200

# Bug 3: audit endpoint
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/api/v1/audit              # 200
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/api/v1/customers/recent-views  # 200 (back-compat)
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/api/v1/does-not-exist     # 404 (clean, not 500)

All six checks pass on my host (Postgres 15, Java 21, SAP_API_KEY pointing at the public sandbox).

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 22, 2026 13:05
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes runtime issues in the sap-demo-java sample that were surfaced during Keploy record/replay validation by making local defaults more Keploy-friendly, hardening optional SAP fan-out behavior, and correcting audit routing/404 handling.

Changes:

  • Updated default Postgres JDBC URL to prefer IPv4 loopback and disable SSL negotiation by default.
  • Increased SAP client and aggregate fan-out timeouts (env-overridable) and improved degradation behavior when resilience4j circuit breaker is open.
  • Added /api/v1/audit route (keeping /api/v1/customers/recent-views as an alias) and introduced an explicit NoResourceFoundException → 404 problem response mapping.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
sap-demo-java/src/main/resources/application.yml Adjusts default datasource URL and increases configurable SAP/aggregate timeout budgets.
sap-demo-java/src/main/java/com/tricentisdemo/sap/customer360/web/GlobalExceptionHandler.java Adds explicit 404 handling for unmapped routes via NoResourceFoundException.
sap-demo-java/src/main/java/com/tricentisdemo/sap/customer360/web/AuditController.java Makes /api/v1/audit a canonical endpoint and keeps legacy alias; adds defensive empty-list handling.
sap-demo-java/src/main/java/com/tricentisdemo/sap/customer360/service/Customer360AggregatorService.java Catches circuit-breaker-open errors in optional fan-out and expands defensive exception handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// and surfaces as a 500 — confusing for clients probing nearby
// routes (e.g. GET /api/v1/audit when they meant
// /api/v1/customers/recent-views).
log.info("Unmapped route path={}", req.getRequestURI());
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging every unmapped route at INFO can generate a lot of noise in production (404s from bots, typos, etc.) and make it harder to spot real issues. Consider downgrading this to DEBUG (or sampling/rate-limiting) while still returning the 404 problem response to the client.

Suggested change
log.info("Unmapped route path={}", req.getRequestURI());
if (log.isDebugEnabled()) {
log.debug("Unmapped route path={}", req.getRequestURI());
}

Copilot uses AI. Check for mistakes.
// Circuit breaker open — degrade the optional view rather than
// propagating a 503 out of the aggregator. The partner call
// (mandatory) has its own circuit and is handled upstream.
log.warn("{} skipped: circuit breaker open ({})", what, circuit.getMessage());
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A circuit breaker being open is an expected steady-state during upstream outages; logging it at WARN for every optional fan-out call can flood logs under load. Consider lowering this to INFO/DEBUG and/or adding some form of throttling so the degradation path remains observable without being noisy.

Suggested change
log.warn("{} skipped: circuit breaker open ({})", what, circuit.getMessage());
log.debug("{} skipped: circuit breaker open ({})", what, circuit.getMessage());

Copilot uses AI. Check for mistakes.
Comment on lines +195 to +200
} catch (Exception e) {
// Defensive: any checked exception that slipped through a
// supplier lambda (e.g. via sneaky-throws) should still degrade
// the optional fan-out, never 500 the parent request.
log.warn("{} failed ({}): {}", what, e.getClass().getSimpleName(), e.getMessage());
return List.of();
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The catch (Exception e) branch duplicates the catch (RuntimeException e) logic, but Supplier.get() cannot throw checked exceptions in normal Java typing. If the intent is to support checked exceptions, consider using a dedicated functional interface that declares throws Exception; otherwise, dropping the extra catch will simplify the flow without changing behavior.

Suggested change
} catch (Exception e) {
// Defensive: any checked exception that slipped through a
// supplier lambda (e.g. via sneaky-throws) should still degrade
// the optional fan-out, never 500 the parent request.
log.warn("{} failed ({}): {}", what, e.getClass().getSimpleName(), e.getMessage());
return List.of();

Copilot uses AI. Check for mistakes.
Comment on lines 25 to 26
@RestController
@RequestMapping("/api/v1/customers")
@Tag(name = "Audit", description = "Service-level access audit (Postgres read)")
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After removing the class-level @RequestMapping, the org.springframework.web.bind.annotation.RequestMapping import becomes unused. Please remove the unused import to avoid warnings / failing builds when unused imports are disallowed.

Copilot uses AI. Check for mistakes.
Comment on lines 42 to +45
List<AuditEvent> events = audit.recent();
if (events == null) {
events = List.of();
}
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AuditService.recent() delegates to a Spring Data repository method, which returns an empty List rather than null. The null-check here is likely unnecessary and can hide unexpected nulls; consider removing it and relying on the repository contract (or enforce non-null at the service boundary).

Copilot uses AI. Check for mistakes.
slayerjain and others added 3 commits April 23, 2026 03:51
…oy-friendly local dev

Spring Boot's default JDBC URL (jdbc:postgresql://localhost:5432/customer360)
resolves to ::1 on most Linux stacks and opts in to SSL negotiation. Both
break Keploy's eBPF proxy: the v6 connection is not hooked, and the SSL
negotiation fails with EOF during the enableSSL packet exchange under the
proxy's MITM.

Default now binds the v4 loopback and disables SSL so out-of-the-box
`keploy record` works without env overrides. Compose (`hostname: postgres`)
and the k8s manifests already supply SPRING_DATASOURCE_URL, so the change
is invisible to those flows.

Reproduction:
  SPRING_DATASOURCE_URL unset, postgres on localhost → app boots but
  keploy record sees no v4 traffic → no postgres mocks captured.

Root cause:
  localhost → ::1 on Debian/Ubuntu; postgresql driver negotiates SSL by
  default.

Fix:
  127.0.0.1 + ?sslmode=disable as the built-in default; SPRING_DATASOURCE_URL
  still overrides it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
…clearer 502 semantics

The /api/v1/customers/{id}/360 aggregator fans out 3 parallel SAP OData
calls + 2 Postgres SELECTs + 1 audit INSERT. Under Keploy record, SAP
round-trips through the eBPF MITM pick up ~2× latency, and the existing
timeouts were tight enough that a single slow-but-successful GET could
push the optional fan-outs past the aggregate deadline (or trip the
resilience4j circuit) and degrade the whole view to an empty 200 — or
worse, surface a CallNotPermittedException as a 503 from an optional
branch.

Changes:
  * safely() now also catches CallNotPermittedException explicitly and
    falls back to Throwable-less Exception, so nothing thrown from an
    optional fan-out can 502/503 the parent request.
  * Raise read-timeout 30s → 45s, connect-timeout 10s → 15s, and
    aggregate-timeout 25s → 50s. All are env-overridable via
    SAP_{CONNECT,READ}_TIMEOUT_SECONDS and
    CUSTOMER360_AGGREGATE_TIMEOUT_SECONDS so real BTP tenants with tight
    SLAs can dial them back.
  * safely() log now includes the upstream HTTP status on SapApiException
    so operators can see at a glance whether a fan-out was a transport
    error or an upstream 5xx.

Reproduction:
  1. keploy record -c "java -jar target/customer360.jar"
  2. curl localhost:8080/api/v1/customers/11/360
     — intermittent 502 when the roles or addresses call trips slow-call
     circuit breaker, even though partner is healthy.

Root cause:
  aggregate-timeout (25s) < read-timeout (30s) × potential retries (3)
  for the optional calls; CallNotPermittedException escapes safely()
  via the default Exception-handler path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
GET /api/v1/audit returned 500 because the route wasn't registered and
Spring's fallback NoResourceFoundException was caught by the generic
@ExceptionHandler(Exception.class), masquerading as an internal server
error. The intuitive audit-centric path should just work, and unmapped
routes in general should 404 cleanly rather than 500.

Changes:
  * AuditController exposes GET /api/v1/audit as the canonical route,
    keeping GET /api/v1/customers/recent-views as a back-compat alias.
    Null-safes the result list so an empty audit table still returns
    `{"items":[],"count":0}` rather than falling through to Map.of's
    NPE-on-null.
  * GlobalExceptionHandler adds an explicit NoResourceFoundException
    handler that returns 404 with an RFC 7807 body, so typos / probing
    no longer surface as 500s.

Reproduction:
  curl -i http://localhost:8080/api/v1/audit
  → HTTP 500 + "An unexpected error occurred"
  Log: NoResourceFoundException: No static resource api/v1/audit.

Root cause:
  Controller mapped at /api/v1/customers/recent-views; no route for
  /api/v1/audit; Spring's static-resource fallback throws
  NoResourceFoundException which the catch-all Exception handler treated
  as a 500 (should be 404).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
@slayerjain slayerjain force-pushed the fix/sap-demo-sample-bugs branch from 395e12f to ceceb83 Compare April 22, 2026 22:22
slayerjain and others added 3 commits April 23, 2026 10:25
Keploy's GlobalNoise is typed `map[string]map[string][]string` — two
levels of nesting. The committed keploy.yml instead used flat dotted
keys under `globalNoise.global` (e.g. `header.X-Correlation-Id: []`,
`body.correlationId: []`), which parse as a single string key with a
literal dot in it. Keploy then never matched those keys against the
(header, field) / (body, field) lookups during noise suppression, so
the curated noise block was silently a no-op.

Re-nest every entry: `header.X-Correlation-Id` becomes
`header: { X-Correlation-Id: [] }`, and all the `body.*` keys move
under a single `body:` parent. The actuator/health DataSource status
comment is preserved above `status: []`.

Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Keploy's HTTP recorder now persists upstream timeouts as synthesized 504
mocks with the original reqTimestamp (see task #185). On replay the mock
is served instantly (millisecond-level elapsedMs) while the recorded
response captured real ~26s SAP network latency, so body-diff on
elapsedMs continues to fail test-16.

Add elapsedMs: [] under globalNoise.global.body to suppress the diff.
Note: the existing timestamp: [] entry does not cover this — timestamp
and elapsedMs are distinct fields on different response objects, so an
explicit entry is required for the synthesized-mock flow from #185.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Keploy's HTTP header matcher at pkg/matcher/utils.go CompareHeaders
lowercases the incoming header name (k -> strings.ToLower(k)) before
calling SubstringKeyMatch, but SubstringKeyMatch does a case-SENSITIVE
strings.Contains against the raw noise map keys. With a YAML noise key
of "X-Correlation-Id" the matcher evaluates

    strings.Contains("x-correlation-id", "X-Correlation-Id")

which is always false, so the rule has no effect — the sap-demo-java
pipeline test-1 / test-2 / /actuator/health replays continue to surface
X-Correlation-Id diffs (observed in integrations pipeline 388).

Move the key to lowercase so the SubstringKeyMatch loop finds it.
Added a comment explaining the keploy-side contract so future
contributors don't re-capitalise the key.

No runtime behaviour change for the recording or recorded bodies.

Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants