[ES-1804970] Fix CloudFetch returning stale column names from cached results by sreekanth-db · Pull Request #351 · databricks/databricks-sql-go

sreekanth-db · 2026-04-21T08:17:32Z

Summary

Fixes a bug where arrow.Record.Schema() returns stale column aliases when CloudFetch serves cached Arrow IPC files from a structurally identical prior query with different AS aliases.

Root cause: NewCloudBatchIterator was not receiving the authoritative schema bytes from GetResultSetMetadata, unlike the local batch path which already had this. CloudFetch Arrow IPC files have column names baked in from the original query, and the driver was reading them as-is.
Fix: Pass arrowSchemaBytes (the authoritative schema from GetResultSetMetadata) into NewCloudBatchIterator. After records are deserialized from the IPC stream, replace the stale schema with the authoritative one using array.NewRecord(). This is zero-copy: the new record shares the underlying column data and only the metadata is swapped.
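For readers who want the shape of the swap without pulling in Arrow, here is a minimal sketch of the idea. `Field`, `Schema`, `Column`, `Record`, and `withSchema` are hypothetical stand-ins written for this illustration; they are not Arrow or driver APIs, and the actual fix uses array.NewRecord() on real Arrow records.

```go
package main

// Field, Schema, Column and Record are simplified stand-ins used only for
// this illustration. They are NOT the Arrow Go types.
type Field struct{ Name string }

type Schema struct{ Fields []Field }

type Column struct{ Values []int64 }

type Record struct {
	Schema *Schema
	Cols   []*Column
}

// withSchema returns a record that carries the override schema but points at
// the same column objects as rec, so no column data is copied.
// A nil override returns rec unchanged, mirroring the local (non-CloudFetch) path.
func withSchema(rec *Record, override *Schema) *Record {
	if override == nil {
		return rec
	}
	return &Record{Schema: override, Cols: rec.Cols}
}
```

The original record is left untouched, so a caller that still holds it keeps seeing the embedded names; only the record handed to the consumer carries the authoritative ones.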

Changes

- arrowRecordIterator.go: Pass ri.arrowSchemaBytes to NewCloudBatchIterator in newBatchIterator()
- arrowRows.go: Pass schemaBytes to NewCloudBatchIterator in NewArrowRowScanner()
- batchloader.go: Core fix:
  - NewCloudBatchIterator accepts arrowSchemaBytes, parses into *arrow.Schema, stores on batchIterator
  - batchIterator.Next() applies override schema to CloudFetch records only (local path is untouched, overrideSchema is nil)
  - Added schemaFromIPCBytes() helper
  - Field count validation guard to prevent panics on schema mismatch
  - Schema parse failure logged at Warn level
- batchloader_test.go: Added TestCloudFetchSchemaOverride with two subtests:
  - Verifies stale column names ["id","name"] are overridden to ["x","y"]
  - Verifies nil schema bytes pass through original names unchanged
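The control flow described for batchloader.go can be sketched roughly as follows. This is a standard-library model, not the driver's code: `schema`, `record`, `parseSchemaFunc`, `newCloudBatchIterator`, and `next` are hypothetical names for illustration, and the parser is injected because the signature of schemaFromIPCBytes() is not shown here. Returning the original record on a field count mismatch is an assumption; the description above only says the guard prevents panics.

```go
package main

import (
	"errors"
	"log"
)

// schema and record are simplified stand-ins, not Arrow types.
type schema struct{ names []string }

type record struct {
	schema *schema
	cols   [][]int64
}

// parseSchemaFunc stands in for a helper like schemaFromIPCBytes (hypothetical signature).
type parseSchemaFunc func([]byte) (*schema, error)

type batchIterator struct {
	records        []*record
	pos            int
	overrideSchema *schema // nil means "keep embedded names"
}

var errDone = errors.New("no more records")

// newCloudBatchIterator parses the authoritative schema bytes once. Empty
// bytes or a parse failure leave overrideSchema nil; the failure is logged
// as a warning rather than returned.
func newCloudBatchIterator(records []*record, schemaBytes []byte, parse parseSchemaFunc) *batchIterator {
	it := &batchIterator{records: records}
	if len(schemaBytes) == 0 {
		return it
	}
	s, err := parse(schemaBytes)
	if err != nil {
		log.Printf("WARN: could not parse authoritative schema, keeping embedded names: %v", err)
		return it
	}
	it.overrideSchema = s
	return it
}

// next returns the next record, swapping in the override schema when one is
// set and its field count matches the record's column count.
func (it *batchIterator) next() (*record, error) {
	if it.pos >= len(it.records) {
		return nil, errDone
	}
	rec := it.records[it.pos]
	it.pos++
	// Assumption: on mismatch the record passes through unchanged.
	if it.overrideSchema == nil || len(it.overrideSchema.names) != len(rec.cols) {
		return rec, nil
	}
	return &record{schema: it.overrideSchema, cols: rec.cols}, nil
}
```

Parsing once in the constructor keeps the per-record work in `next` down to a length check and a small struct allocation.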

Who is affected

Go driver users with CloudFetch enabled (WithCloudFetch(true)) who read arrow.Record.Schema() directly. Python, ODBC, and JDBC drivers are not affected.

Test plan

- All existing unit tests pass (37 tests in internal/rows/arrowbased/)
- New unit test TestCloudFetchSchemaOverride covers the override and no-override paths
- Verified end-to-end against a real Databricks warehouse using samples.tpch.lineitem (~30M rows) with two queries differing only in column aliases; confirmed arrow.Record.Schema() now returns correct aliases

This pull request was AI-assisted by Isaac.

…results

When the server result cache serves Arrow IPC files from a prior query, the embedded schema contains stale column aliases. The Go driver's CloudFetch path read these stale names directly, while the local path already used the authoritative schema from GetResultSetMetadata.

Pass the authoritative schema bytes into NewCloudBatchIterator and replace stale column names on deserialized records using array.NewRecord, which is zero-copy (shares underlying column data).

Co-authored-by: Isaac
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>

…dfetch-stale-column-names

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>

# Conflicts:
#   internal/rows/arrowbased/arrowRecordIterator.go
#   internal/rows/arrowbased/arrowRows.go
#   internal/rows/arrowbased/batchloader_test.go

## Summary

Bump `DriverVersion` to `1.11.0` and add the v1.11.0 section to `CHANGELOG.md`.

### Changes since v1.10.0

- Enable telemetry by default with DSN-controlled priority (#320, #321, #322, #349)
- Add SPOG (Custom URL) routing support via `x-databricks-org-id` header (#347)
- Add statement-level query tag support (#341)
- Add AI coding agent detection to User-Agent header (#326)
- Fix CloudFetch returning stale column names from cached results (#351)
- Fix resource leak: close staging Rows in execStagingOperation (#325)

Internal/infra-only changes are omitted from the user-facing notes (CI hardening, dependabot bumps, CODEOWNERS).

## Test plan

- [x] `go build ./...` clean
- [x] `go test ./... -count=1 -short` passes locally

## Next steps after merge

1. Tag the merge commit as `v1.11.0` and push the tag
2. Trigger `peco-databricks-sql-go` in secure-public-registry-releases-eng with `ref=v1.11.0`, `dry-run=true` to verify
3. Re-run with `dry-run=false` for the actual release

NO_CHANGELOG=true

This pull request was AI-assisted by Isaac.

Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

sreekanth-db added 4 commits April 10, 2026 18:35

Fix lint: check error return values in test handler

de6c69e

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>

Merge branch 'main' into fix/ES-1804970-cloudfetch-stale-column-names

24aac51

gopalldb approved these changes Apr 21, 2026


Merge branch 'main' into fix/ES-1804970-cloudfetch-stale-column-names

117e8da

sreekanth-db enabled auto-merge (squash) April 21, 2026 09:24

sreekanth-db merged commit 3c0f7e4 into main Apr 21, 2026
3 checks passed

sreekanth-db deleted the fix/ES-1804970-cloudfetch-stale-column-names branch April 21, 2026 09:29

vikrantpuppala mentioned this pull request Apr 21, 2026

Prepare for v1.11.0 release #352

Merged

2 tasks

sreekanth-db merged 5 commits into main from fix/ES-1804970-cloudfetch-stale-column-names
