Retry KMS requests on transient errors #1953

Draft
rozza wants to merge 1 commit into mongodb:main from rozza:JAVA-5391

rozza commented Apr 29, 2026

Add libmongocrypt CAPI bindings for KMS retry support and wire retry logic through the sync and reactive driver stacks. Transient KMS HTTP and network errors are retried with backoff delays managed by libmongocrypt; retry is enabled unconditionally.

  • Add native bindings: mongocrypt_setopt_retry_kms, mongocrypt_kms_ctx_usleep, mongocrypt_kms_ctx_feed_with_retry, mongocrypt_kms_ctx_fail
  • Add sleepMicroseconds(), feedAndRetry(), fail() to MongoKeyDecryptor
  • Enable KMS retry unconditionally in MongoCryptImpl
  • Rewrite sync Crypt.decryptKey() with a timeout-aware retry loop
  • Add retry logic to reactive KeyManagementService.decryptKey()
  • Fix TlsChannelImpl.read() to preserve bytes delivered alongside close_notify (already fixed upstream in marianobarrios/tls-channel)
  • Add spec Section 24 KMS retry integration tests (sync + reactive)
  • Add Evergreen CI task for KMS retry tests

JAVA-5391
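
The sync retry loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `KmsRequest`, `attempt()`, and `runWithRetry` are hypothetical names, and the assumed semantics (a per-attempt backoff in microseconds, a boolean retry signal after a reply is fed, and a boolean from `fail()` after a network error) are modeled on the libmongocrypt functions listed above rather than on the actual `MongoKeyDecryptor` signatures. The clock and sleeper are injected so the loop can be exercised without real delays.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

final class KmsRetrySketch {

    /** Hypothetical view of one KMS request; the real MongoKeyDecryptor API may differ. */
    interface KmsRequest {
        /** Backoff requested before the next attempt, in microseconds (assumed 0 on the first attempt). */
        long sleepMicroseconds();

        /** Sends the request and feeds the reply; returns true if the reply was a retryable HTTP error. */
        boolean attempt() throws IOException;

        /** Reports a network error; returns true if another attempt is allowed. */
        boolean fail();
    }

    /** Runs the request until it succeeds, retries are exhausted, or the timeout would be exceeded. */
    static int runWithRetry(KmsRequest request, long timeoutMicros,
                            LongSupplier clockMicros, LongConsumer sleeper)
            throws IOException, TimeoutException {
        long deadline = clockMicros.getAsLong() + timeoutMicros;
        int attempts = 0;
        while (true) {
            long delay = request.sleepMicroseconds();
            // Give up before sleeping if the backoff would carry us past the deadline.
            if (clockMicros.getAsLong() + delay >= deadline) {
                throw new TimeoutException("KMS request timed out after " + attempts + " attempt(s)");
            }
            if (delay > 0) {
                sleeper.accept(delay);
            }
            attempts++;
            try {
                if (!request.attempt()) {
                    return attempts; // reply accepted, nothing left to retry
                }
                // Retryable HTTP error: loop again and honor the next backoff.
            } catch (IOException e) {
                if (!request.fail()) {
                    throw e; // retries exhausted, surface the network error
                }
            }
        }
    }
}
```

In this sketch the timeout check happens before each sleep, so a long backoff fails fast instead of sleeping past the deadline; whether the driver checks at the same point is not stated in the description.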

Copilot AI left a comment

Pull request overview

Adds libmongocrypt-backed retry support for KMS requests and wires it through the sync and reactive driver encryption flows, including a small TLS-channel fix needed for correct KMS response handling and new CI coverage for the retry prose tests.

Changes:

  • Introduce new libmongocrypt CAPI/JNA bindings and surface them via MongoKeyDecryptor to drive sleep/backoff and retry decisions.
  • Implement retry loops in sync Crypt.decryptKey() and reactive KeyManagementService.decryptKey() using libmongocrypt’s retry signals and operation-timeout awareness.
  • Add KMS retry prose tests (sync + reactive) and an Evergreen task/script to run them.
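
A non-blocking variant of the same loop might look like the sketch below. It deliberately swaps in `CompletableFuture` and a `ScheduledExecutorService` from the JDK in place of the Reactive Streams types the driver uses, so it only illustrates the idea of scheduling the next attempt after the library-provided delay instead of sleeping on a thread. `AsyncKmsRequest`, `attempt()`, and `runWithRetry` are hypothetical names, and the operation-timeout check is omitted for brevity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class AsyncKmsRetrySketch {

    /** Hypothetical async view of one KMS request; not the driver's actual API. */
    interface AsyncKmsRequest {
        /** Backoff requested before the next attempt, in microseconds. */
        long sleepMicroseconds();

        /** Completes with true for a retryable HTTP error, false on success, exceptionally on network error. */
        CompletableFuture<Boolean> attempt();

        /** Reports a network error; returns true if another attempt is allowed. */
        boolean fail();
    }

    /** Completes with the number of attempts made, or exceptionally once retries are exhausted. */
    static CompletableFuture<Integer> runWithRetry(AsyncKmsRequest request,
                                                   ScheduledExecutorService scheduler) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        scheduleAttempt(request, scheduler, result, 1);
        return result;
    }

    private static void scheduleAttempt(AsyncKmsRequest request, ScheduledExecutorService scheduler,
                                        CompletableFuture<Integer> result, int attemptNumber) {
        long delay = request.sleepMicroseconds();
        // Schedule instead of sleeping so no thread is blocked during the backoff.
        scheduler.schedule(() -> {
            request.attempt().whenComplete((shouldRetry, error) -> {
                boolean retry = (error != null) ? request.fail() : Boolean.TRUE.equals(shouldRetry);
                if (retry) {
                    scheduleAttempt(request, scheduler, result, attemptNumber + 1);
                } else if (error != null) {
                    result.completeExceptionally(error);
                } else {
                    result.complete(attemptNumber);
                }
            });
        }, delay, TimeUnit.MICROSECONDS);
    }
}
```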

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| mongodb-crypt/src/main/com/mongodb/internal/crypt/capi/MongoKeyDecryptorImpl.java | Implements new retry-related native calls (usleep, feed_with_retry, fail) on the KMS ctx wrapper. |
| mongodb-crypt/src/main/com/mongodb/internal/crypt/capi/MongoKeyDecryptor.java | Extends decryptor API with retry hooks and a default initial KMS read size constant. |
| mongodb-crypt/src/main/com/mongodb/internal/crypt/capi/MongoCryptImpl.java | Enables KMS retry option in libmongocrypt during initialization. |
| mongodb-crypt/src/main/com/mongodb/internal/crypt/capi/CAPI.java | Adds JNA native bindings for KMS retry APIs. |
| driver-sync/src/main/com/mongodb/client/internal/Crypt.java | Reworks KMS decryption to loop retries with backoff and timeout checks. |
| driver-reactive-streams/src/main/com/mongodb/reactivestreams/client/internal/crypt/KeyManagementService.java | Adds reactive retry flow with libmongocrypt-provided delay and retry decisions. |
| driver-core/src/main/com/mongodb/internal/connection/tlschannel/impl/TlsChannelImpl.java | Preserves bytes produced alongside TLS close_notify instead of immediately returning -1. |
| driver-sync/src/test/functional/com/mongodb/client/AbstractClientSideEncryptionKmsRetryProseTest.java | Adds shared prose tests for KMS retry behaviors (TCP/HTTP retry, exhausted retries, timeout mid-retry). |
| driver-sync/src/test/functional/com/mongodb/client/ClientSideEncryptionKmsRetryProseTest.java | Sync concrete test wiring for the shared retry prose tests. |
| driver-reactive-streams/src/test/functional/com/mongodb/reactivestreams/client/ClientSideEncryptionKmsRetryProseTest.java | Reactive concrete test wiring (via sync adapter) for the shared retry prose tests. |
| .evergreen/run-kms-retry-tests.sh | Adds CI script to run sync + reactive KMS retry prose tests with required trust material. |
| .evergreen/.evg.yml | Adds Evergreen function/task/buildvariant wiring to execute the KMS retry test script. |
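
The TlsChannelImpl change summarized above (return bytes that arrive together with close_notify, and report end-of-stream only afterwards) can be illustrated with a toy model. This is not the tls-channel implementation: `CloseNotifyReadSketch`, `Unwrapped`, and `enqueue` are hypothetical, and the "record" here is just a plaintext array plus a flag standing in for what a real `SSLEngine` unwrap would report.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

final class CloseNotifyReadSketch {

    /** Hypothetical result of decrypting one batch of TLS data. */
    static final class Unwrapped {
        final byte[] plaintext;
        final boolean closeNotify;

        Unwrapped(byte[] plaintext, boolean closeNotify) {
            this.plaintext = plaintext;
            this.closeNotify = closeNotify;
        }
    }

    private final Deque<Unwrapped> incoming = new ArrayDeque<>();
    private boolean closed;

    void enqueue(Unwrapped unwrapped) {
        incoming.add(unwrapped);
    }

    /** Returns the number of bytes copied into dst, 0 if nothing is available, or -1 at end of stream. */
    int read(ByteBuffer dst) {
        if (closed) {
            return -1;
        }
        Unwrapped next = incoming.poll();
        if (next == null) {
            return 0;
        }
        if (next.closeNotify) {
            closed = true; // remembered so the following call reports end of stream
        }
        if (next.plaintext.length > 0) {
            dst.put(next.plaintext); // assumes dst has enough remaining space
            return next.plaintext.length; // deliver the data even though the peer also closed
        }
        return closed ? -1 : 0;
    }
}
```

With this shape, a KMS response whose final bytes share a flight with close_notify is still handed to the caller, and the -1 is deferred to the next `read` call.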

Comment thread .evergreen/run-kms-retry-tests.sh