
[spark] Reduce paimon-spark-4.0 shadows via SparkShim copy factories #7721

Open

kerwin-zk wants to merge 1 commit into apache:master from kerwin-zk:spark-shim-cleanup

Conversation

Contributor

@kerwin-zk kerwin-zk commented Apr 28, 2026

Purpose

Follow-up to #7648 (Spark 4.1 module). After the reverse-shim layout landed, three of the files copied into paimon-spark-4.0/src/main only differed across versions because of case class .copy(...) calls on Spark types whose arity changed between 4.0.2 and 4.1.1:

  • DataSourceV2Relation gained an Option[TimeTravelSpec] field (8 → 9 fields): relation.copy(table = ...) compiled against 4.1.1 emits copy$default$9, which crashes on 4.0 with NoSuchMethodError.
  • TableSpec gained a Seq[Constraint] field (8 → 9 fields): the same problem applies to spec.copy(location = ...) and spec.copy(properties = ...).

Per-version scalac is the only thing that knows the right copy$default$N to emit, so we route those three calls through new SparkShim factories (one per call site). The implementations live in Spark3Shim / Spark4Shim (plus the 4.0 override), and the cross-version source files no longer need to be physically duplicated.
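For illustration, a minimal sketch of what one such copy factory can look like is below. The method name copyDataSourceV2Relation comes from this PR, while the trait layout, imports, and the shape of the Spark4Shim class are assumptions rather than the actual paimon-spark code:

```scala
import org.apache.spark.sql.connector.catalog.Table
import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation

trait SparkShim {
  // One factory per cross-version .copy(...) call site, so shared source
  // code never hard-codes a copy$default$N that only exists in one
  // Spark version.
  def copyDataSourceV2Relation(
      relation: DataSourceV2Relation,
      newTable: Table): DataSourceV2Relation
}

// Compiled against a specific Spark version: scalac resolves the named
// .copy(table = ...) argument and fills the remaining parameters with the
// copy$default$N methods that actually exist in that version's case class.
class Spark4Shim extends SparkShim {
  override def copyDataSourceV2Relation(
      relation: DataSourceV2Relation,
      newTable: Table): DataSourceV2Relation =
    relation.copy(table = newTable)
}
```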

Tests

CI

API and Format

No new public API. Three internal helper methods added to org.apache.spark.sql.paimon.shims.SparkShim:

  • copyDataSourceV2Relation(relation, newTable)
  • copyTableSpecLocation(spec, location)
  • copyTableSpecProperties(spec, properties)
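At a call site in the shared cross-version sources, the change then looks roughly like the following hypothetical snippet (the variable names and the way the shim instance is obtained are illustrative only):

```scala
// Before: direct copy; scalac pins copy$default$9 (or $8) at compile time,
// so the bytecode only runs against the Spark version it was built with.
// val newRelation = relation.copy(table = paimonTable)

// After: the copy is delegated to the per-version shim implementation.
val newRelation = shim.copyDataSourceV2Relation(relation, paimonTable)
```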

