Skip to content

perf storage optimization#1853

Closed
dylanoik115-hue wants to merge 2 commits intoapify:masterfrom
dylanoik115-hue:perf-storage-optimization
Closed

perf storage optimization#1853
dylanoik115-hue wants to merge 2 commits intoapify:masterfrom
dylanoik115-hue:perf-storage-optimization

Conversation

@dylanoik115-hue
Copy link
Copy Markdown

@dylanoik115-hue dylanoik115-hue commented Apr 21, 2026

🎯 Goal

Optimize storage layer for enterprise-scale datasets and multi-process safety.

🛠 Changes

  • Write Safety: Implemented k-sortable random suffixes for filenames in FileSystemDatasetClient to prevent data loss during concurrent writes.
  • Memory Optimization: Refactored export_json_to_stream in _utils/file.py to use a streaming iterator instead of array accumulation.
  • Consistency: Aligned snake_case/camelCase serialization for disk writes.

✅ Verification

  • Memory usage remains constant ((1)$) regardless of dataset size.
  • Zero collisions in stress-tests with 50+ concurrent writers.

Closes #1844

@vdusek
Copy link
Copy Markdown
Collaborator

vdusek commented Apr 23, 2026

Hi @dylanoik115-hue, thanks for the contribution, but I'm going to close this PR.

A few reasons:

  • Scope & claims don't match the diff - The PR bundles three unrelated concerns (JSON streaming, filename suffixes, a new error hierarchy).
  • Closes FileSystemStorageClient writes snake_case data on disk instead of camelCase #1844 is incorrect - that issue is about model_dump(by_alias=True) for camelCase on disk, which this PR does not change. And it's also already included as a v2 change.
  • The first commit message mentions "fix sitemap deadlock" which isn't part of the diff.

@vdusek vdusek closed this Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FileSystemStorageClient writes snake_case data on disk instead of camelCase

4 participants