auth: wire secure-storage cache into CLI#5013
Merged
simonfaltum merged 13 commits intomainfrom Apr 22, 2026
Merged
Conversation
661420f to
1fc0c22
Compare
ea3802a to
4dd3b58
Compare
b04e515 to
0bd9c0e
Compare
This was referenced Apr 22, 2026
4dd3b58 to
014a6a6
Compare
a8c822e to
4d6f006
Compare
renaudhartert-db
approved these changes
Apr 22, 2026
7fc4937 to
f50a274
Compare
4d6f006 to
ca2c821
Compare
jamesbroadhead
pushed a commit
to jamesbroadhead/cli
that referenced
this pull request
Apr 22, 2026
) ## Stack Entry point for the opt-in secure token storage work. Review and merge top-to-bottom: 1. **databricks#5056 auth: import FileTokenCache into CLI and wrap cache for dual-write (this PR)** 2. databricks#5008 libs/auth/storage: add dormant secure-storage foundation 3. databricks#5013 auth: wire secure-storage cache into CLI This PR is also the first of a separate 3-PR sequence that moves file token cache ownership from the SDK to the CLI. That sequence (SDK PR removing the internal dual-write, then SDK bump in the CLI) can proceed in parallel with databricks#5008 and databricks#5013. ## Why First of 3 PRs moving file-based OAuth token cache ownership from the Go SDK to the CLI. Today the SDK owns both the cache interface and the file-backed implementation, including the dual-write-under-host-key convention. Long-term we want the SDK to stop owning persistence: the OAuth flow and cache interface stay, but file format and host-key conventions move to the CLI. This unblocks secure storage backends and Renaud's Session model on a cleaner foundation. This PR imports the cache into the CLI and wires it everywhere. Nothing is deleted from the SDK yet. PRs 2 and 3 (SDK PR, then SDK bump) finish the move. ## Changes **Before:** CLI relied on the SDK's default `FileTokenCache`. Dual-write to the legacy host-based key happened inside `PersistentAuth.Challenge()` and `refresh()` via the SDK's internal `dualWrite`. **Now:** CLI owns its own `FileTokenCache` at `libs/auth/storage/file_cache.go`, a near-verbatim copy from the SDK (same JSON schema, same path `~/.databricks/token-cache.json`, same permissions). A new `storage.DualWritingTokenCache` wraps the file cache so that every write through it under the primary key is also mirrored under the legacy host key. Every `u2m.NewPersistentAuth` call site in the CLI now passes `u2m.WithTokenCache(storage.NewDualWritingTokenCache(fileCache, arg))`. Because mirroring happens inside the cache's `Store` method, every SDK-internal write (Challenge, refresh, discovery) dual-writes automatically. No call site needs to remember to invoke a helper, so refresh paths (`Token()`, `ForceRefreshToken()`) preserve cross-SDK compatibility just like login paths do. The SDK is unchanged. It still dual-writes internally, so the two writes hit the same file with the same keys and bytes, i.e. idempotent. Zero user-visible behavior change. Files touched: - `libs/auth/storage/file_cache.go` + test (new) - `libs/auth/storage/dual_writing_cache.go` + test (new) - `cmd/auth/login.go`, `cmd/auth/token.go`, `cmd/auth/logout.go` - `libs/auth/credentials.go` - `NEXT_CHANGELOG.md` **Lint-driven deltas from SDK:** - `os.UserHomeDir()` is forbidden in the CLI, uses `env.UserHomeDir(ctx)` instead. Required threading `ctx` through `NewFileTokenCache`. - `os.IsNotExist(err)` is forbidden, uses `errors.Is(err, fs.ErrNotExist)`. **Known edge case:** Tokens that exist only under the legacy host key (users who logged in before profile-keyed writes existed and never re-ran `auth login --profile`) keep working for now because the SDK's internal dualWrite still runs. After PR 2 (SDK stops dual-writing), re-login will be required for those users. Minimal impact. ## Test plan - [x] Unit tests for `file_cache_test.go` (port of SDK tests) - [x] Unit tests for `dual_writing_cache_test.go` covering primary-key mirroring, non-primary passthrough, no host-key, host-key-equals-primary, discovery with populated/empty `GetDiscoveredHost`, and Lookup delegation - [x] `make checks` and `make test` pass - [ ] Manual smoke test of `databricks auth login`, `auth token`, `auth logout` on a live profile before merging
Base automatically changed from
simonfaltum/cli-ga-secure-storage-foundation
to
main
April 22, 2026 18:26
Introduces a thin helper that resolves the CLI's token storage mode via libs/auth/storage.ResolveStorageMode and returns a cache.TokenCache. The helper is split into a public convenience form and an injectable core (newAuthCacheWith) so unit tests exercise the secure path with a fake cache without touching the real OS keyring. Follow-up commits wire this into the auth command call sites.
cache.NewFileTokenCache has a variadic signature (opts ...FileTokenCacheOption), which cannot be assigned directly to a non-variadic field. Adds a one-line explanation so future readers do not 'simplify' the closure back into a direct function reference and break the build.
Adds a hidden --secure-storage BoolVar to auth login and routes the resolved token cache into both NewPersistentAuth call sites (main flow and discoveryLogin). Users opt into OS-native secure storage either per invocation via the flag or for the whole shell via DATABRICKS_AUTH_STORAGE=secure. The default storage mode remains legacy. Flag is MarkHidden because MS1 is experimental; release notes document the env var for discovery.
Resolves the token cache via newAuthCache at the two NewPersistentAuth call sites inside cmd/auth/token.go (loadToken via newTokenCommand and runInlineLogin). No flag is added: per the rollout contract, auth token reads from whichever mode the resolver selects via DATABRICKS_AUTH_STORAGE or [__settings__].auth_storage.
Replaces the direct cache.NewFileTokenCache() call with newAuthCache so that logout targets whichever backend ResolveStorageMode selects. The resolver still defaults to legacy, so existing flows are unchanged; once a user opts into secure storage, logout clears the keyring entry instead of the file cache.
Adds three acceptance tests under acceptance/cmd/auth/storage-modes/: - invalid-env: DATABRICKS_AUTH_STORAGE=bogus surfaces a clear error - legacy-env-default: explicit legacy mode behaves like the default path - hidden-flag: --secure-storage is hidden from auth login --help Secure-mode end-to-end behavior is covered by unit tests in cmd/auth/cache_test.go because acceptance runners cannot safely exercise the real OS keyring on macOS (would write to Keychain) or Linux (no D-Bus session in CI).
Storage backend is a per-machine setting, not a per-invocation choice. Keeping a write-time flag on login while token/logout rely on the resolver creates an asymmetry: login --secure-storage writes to the keyring, but a later auth token without the env var defaults to the legacy file cache and can't find the token. Resolution is now always env -> [__settings__].auth_storage -> default. login.go matches token.go and logout.go: newAuthCache(ctx, "") with no per-call override from a flag. Also drops the hidden-flag acceptance test (no flag to hide) and updates NEXT_CHANGELOG to advertise the env var and config key instead of the flag.
Drop MS1/MS3 comment markers from cache.go and convert discoveryLogin to take a discoveryLoginInputs struct for readability (8 positional args was over the threshold). No behavior change. Co-authored-by: Isaac
The file factory was calling the SDK's cache.NewFileTokenCache(), which is being phased out in favor of the CLI's own storage.NewFileTokenCache(ctx) imported in #5056. Route ResolveCache through it so that legacy and plaintext modes share a single file cache implementation owned by the CLI. Co-authored-by: Isaac
Configure already resolves the correct cache via storage.ResolveCache and passes it via u2m.WithTokenCache. The persistentAuth helper opened a second storage.FileTokenCache and prepended its own WithTokenCache to opts. Because u2m options are last-one-wins, the keyring still won, but we were opening a file cache on every workspace client just to have it immediately overridden. Remove the hardcoded file cache and rely on the caller to supply one. Now there is a single place that manages the token cache (ResolveCache in Configure) and the helper is a thin wrapper around u2m.NewPersistentAuth. Co-authored-by: Isaac
ca2c821 to
e5d1e3c
Compare
… code Co-authored-by: Isaac
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stack
Final PR in the opt-in secure token storage stack. Review and merge top-to-bottom:
Depends on #5008 for the
StorageModeresolver andKeyringCache, and on #5056 for the CLI-ownedFileTokenCachethatResolveCachenow returns in legacy mode.Why
Opt-in OS-native secure token storage for every CLI command that authenticates via the
databricks-clistrategy. Users turn it on by settingDATABRICKS_AUTH_STORAGE=secure(per-shell) or[__settings__].auth_storage = securein.databrickscfg(per-machine). Everyone else stays on the legacy file-backed cache. The default does not change.Storage backend is a per-machine setting, not a per-invocation choice. This PR deliberately ships no
--secure-storageflag so that login, token, logout, and every other command can never drift apart on the same machine.Changes
Before:
auth login,auth token,auth logout, and every workspace client built throughCLICredentialsalways went through the SDK's default file-backedTokenCache.Now: every path that talks to OAuth runs through a single resolver.
libs/auth/storage/cache.go:ResolveCache(ctx, override)resolves the mode viaResolveStorageModeand returns the correspondingcache.TokenCache. Split into a public form and an injectable core (resolveCacheWith) so unit tests exercise the secure path with a fake cache. Production always passesoverride = ""and relies on env -> config -> default. Legacy and plaintext modes return the CLI-ownedstorage.FileTokenCachefrom auth: import FileTokenCache into CLI and wire DualWrite #5056, not the SDK's file cache.auth login,auth token(vianewTokenCommand+runInlineLogin), andauth logoutcallstorage.ResolveCache(ctx, "")and plumb the resolved cache intou2m.WithTokenCache(...)at everyNewPersistentAuthcall site.libs/auth.CLICredentials.Configurealso routes throughResolveCacheso every workspace client built through the CLI credentials strategy reads from the same backend login writes to. This coversauth profiles,jobs list,clusters list,bundle deploy, and every other non-auth command. Without this, secure-storage users would hit "cache: token not found" on the first non-auth command.dualWriteLegacyHostKeyhelper centralises the post-Challenge mirror to the legacy host-based key. Runs only when mode islegacyso secure-mode users do not end up with duplicate keyring entries.discoveryLogintakes adiscoveryLoginInputsstruct after review feedback (8 positional args was over the threshold).CLICredentials.persistentAuthno longer opens its own file cache; it relies entirely on the caller to passWithTokenCache. Previously it opened a file cache and prepended aWithTokenCacheoption, only to have it immediately overridden by the keyring one appended byConfigure(last-one-wins). Single source of truth now.acceptance/cmd/auth/storage-modes/cover the two CLI-visible behaviors that do not require an OS keyring: invalid env var surfaces a clear error, and explicit legacy mode behaves identically to the default path.Out of scope:
plaintextstorage implementation. The resolver acceptsplaintext; a follow-up will route it to a non-dual-write file backend.[__settings__].auth_storage. Users hand-edit the config for now.token-cache.jsonentries. Users re-login after upgrading.secure.Known limitation: duplicate keyring entries until SDK bump
The CLI currently pins
databricks-sdk-go v0.128.0, which still containsPersistentAuth.dualWriteand calls it from bothChallenge()andrefresh(). dualWrite stores the token underGetCacheKey()(profile name) ANDhostCacheKey()(the host URL). In legacy mode this mirrors into the file cache, which is the desired backward-compat behavior. In secure mode it mirrors into the keyring, producing two entries per login:databricks-cli/<profile>-> tokendatabricks-cli/https://<host>-> same tokendatabricks/databricks-sdk-go#1646 removes the SDK-side
dualWriteand the host-key fallback inloadToken. It is already merged to SDK main but blocked on #5056 shipping in a CLI release before it can be released itself. Once the SDK ships a new version and the CLI bumps to it, secure mode will produce a single keyring entry per login and thedualWriteLegacyHostKeyhelper in this PR remains the only source of the legacy host-key mirror (file cache only, by design).No code change needed in this PR; the resolver, helper gating, and CLICredentials wiring are already correct for the post-bump world.
Test plan
ResolveCachecovering default legacy, explicit override, env-var selection, plaintext fallback, invalid override, invalid env, and file-factory error propagation. Secure path uses a fakecache.TokenCacheso CI never touches the OS keyring.CLICredentials.ConfigureconfirmingResolveCacheis invoked and the resulting cache is passed toNewPersistentAuthviaWithTokenCache.acceptance/cmd/auth/storage-modes/:invalid-env(bogusDATABRICKS_AUTH_STORAGEsurfaces a clear error viaauth token) andlegacy-env-default(explicit legacy mode clears the file cache viaauth logout, matching default behavior).make checks,make test,make lintpass.cmd/auth/login,cmd/auth/token,cmd/auth/logout,cmd/auth/profilesacceptance test suites still pass (regression).DATABRICKS_AUTH_STORAGE=secure, runauth login, thenauth profiles,auth token,jobs list,auth logout; confirm the token goes to the keyring and reads from there.