Benchmarks Website Version 3 by connortsui20 · Pull Request #7643 · vortex-data/vortex

connortsui20 · 2026-04-26T18:45:00Z

Summary

Rewrites the benchmarks website (again).

Design

Instead of a single data.json.gz file that we CAS from the benchmarks, this is a full server binary that manages a duckdb database and allows POST /api/ingest from each of the benchmarks via an emitter. The website itself is then SSR with hydration. I believe that this is the design that we actually want the website to be in, as it is much more maintainable and extensible than previous iterations.

Mostly llm-engineered but with a lot of manual direction:

Single Rust server binary with axum (HTTP) + maud (server-rendered HTML) + DuckDB (embedded analytical DB) + Chart.js. All static assets include_bytes! into the binary.
DuckDB database with one table per benchmark fact (5 fact tables total: compression time, query measurement, vector search, RAG, random access). Backup is just a copy of the file (maybe we can just use the WAL?).
Data ingestion is via POST /api/ingest which accepts versioned JSON envelopes, bearer-token gated. CI pushes results.
MIGRATION (finally): a one-shot migrator ports v2 history forward. I've verified that it (mostly) works, there are a few bugs but I will fix them. We should decide what data we should keep and what we should get rid of.
The migrator runs every v2 record through a classifier that either routes it into one of the 5 fact tables or explicitly skips it with a typed reason (legacy random-access shape, historical memory metric, etc.).
Charts and groups identified by <prefix>.<base64url(serde_json(ChartKey|GroupKey))>. Round-trips through the URL with no DB lookup.
Three HTML routes: / (landing), /chart/{slug} and /group/{slug} (permalinks). One JSON route: GET /api/chart/{slug}.
Deploy is one server binary, one DuckDB file, one INGEST_BEARER_TOKEN env var. SSR means no frontend build step; the only client-side JS is the single chart-init.js.

UI/UX (TBD, the new relational database backend gives us a lot more options now so this could be better):

The landing page is a list of collapsible <details> per group, ordered to match v2. The first group opens by default with its chart data inlined for fast first paint; the rest lazy-fetch via the JSON API the first time they're expanded.
Charts hydrate from inline <script id="chart-data-N"> JSON paired with <canvas data-chart-index="N">. An IntersectionObserver only constructs the Chart.js instance once the canvas scrolls into view.
Each chart should own its own toolbar (scope / Y-axis / absolute vs % of baseline). Scope is zoom, not refetch: each chart pulls a generous slice of commits once, and the buttons + slider just adjust the visible range via chart.update("none"). Mouse wheel pans through history.
URL state (?n=&y=&mode=&hidden=) is honored only on the permalink pages. The landing page always opens at defaults; if you want to share a specific view, share the chart permalink.

Still some work to do, will update this design list later.

Testing

Snapshot testing with insta and seeded by hitting the ingest endpoint that is in-process.

codspeed-hq · 2026-04-26T18:48:45Z

Merging this PR will degrade performance by 26.23%

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 2 improved benchmarks
❌ 3 regressed benchmarks
✅ 1125 untouched benchmarks
⏩ 33 skipped benchmarks¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.