Skip to content

Epic: Improved List support #7679

@a10y

Description

@a10y

Description

Vortex supports list types, but currently our write support is naive.

The current builtin writer will emit a single Flat layout for any list chunks, with a maximum size of 2MB (configurable, but fixed).

┌──────────────────────────────────────────────────────────────────────┐
│                                                                      │
│                                                                      │
│  ┌───────────────────────────────────────────List Array (memory)──┐  │
│  │           ┌───────┐                                            │  │
│  │offsets    │       │                                            │  │
│  │           └───────┘                                            │  │
│  │                                                                │  │
│  │           ┌──────────────────────────────────────┐             │  │
│  │elements   │                                      │             │  │
│  │           └──────────────────────────────────────┘             │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                                                                      │
│                                                                      │
│                                                                      │
│                                                                      │
│                 ┌─────── Flat Layout ──────┐                         │
│                 ├───────────────┬──────────┤                         │
│                 │    offsets    │ elements │     On-disk format      │
│                 ├───────────────┴──────────┤                         │
│                 │                          │                         │
│                 │                          │                         │
│                 │                          │                         │
│                 │         elements         │                         │
│                 │                          │                         │
│                 │                          │                         │
│                 │                          │                         │
│                 │                          │                         │
│                 └──────────────────────────┘                         │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

For simple lists of primitives or strings, this is actually fine. Bulk access is IO-efficient, since the elements and their corresponding offsets are fetched in a single read. Random access will be on-par with access to normal primitive/string arrays.

However, there are notable cons, especially when dealing with nested Lists:

  1. Chunks are smaller as the 2MB budget is shared across all nested fields + offsets for the chunk
  2. No ability to pushdown projections through list elements at IO time. E.g. if you have List<Struct{x,y,z}> there's no way to only read list.x, you need to read all elements
  3. No ability to short-circuit expression execution using just offsets or just validity, e.g. select len(list_col) should just need to read offset but instead it forces a read of the whole elements child as well.

We're considering two different options to improve IO efficiency here:

  1. Adding a new explicit ListLayout to our default writer, which shreds apart the list offsets, validity and elements children.
  2. Implementing sub-segment reads. This is a very pervasive change, but allows pushing down reads of portions of a segment at the IO layer for all Flat layouts, not just for lists.

There have been previous attempts to add (1), there are some notable challenges

  • Neither component (offset or elements) is row-aligned, which means reporting splits is tricky
  • To do this well, we probably need a Map or Lambda expression to pull out an expression to push down over the elements. Along with other list-specific expressions/aggregates

Metadata

Metadata

Assignees

No one assigned

    Labels

    epicPublic roadmap umbrella for a major initiative, with work tracked in sub-issues.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions