Fix UB in ColumnStringBlock::AppendUnsafe when appending empty string_view#489
Open
iliaal wants to merge 1 commit intoClickHouse:masterfrom
Open
Fix UB in ColumnStringBlock::AppendUnsafe when appending empty string_view#489iliaal wants to merge 1 commit intoClickHouse:masterfrom
iliaal wants to merge 1 commit intoClickHouse:masterfrom
Conversation
memcpy's source pointer is declared nonnull regardless of the size
argument. When AppendUnsafe is called with an empty string_view that
was constructed from an empty std::string, str.data() can be null
(the standard allows it), so UBSan flags every empty-string append:
runtime error: null pointer passed as argument 2,
which is declared to never be null
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
clickhouse/columns/string.cpp:139:21
Guard the memcpy with a zero-size short-circuit. Behavior is
unchanged for non-empty inputs and the returned string_view still
points at pos with length 0 for empty inputs.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ColumnStringBlock::AppendUnsafecallsmemcpy(pos, str.data(), str.size())without short-circuiting onstr.size() == 0. When the inputstring_viewwas built from an emptystd::string,str.data()can benullptr, and glibc declaresmemcpy's source pointer with__attribute__((nonnull))regardless of the size argument. UBSan flags every empty-string append:The bug is benign in practice because libc no-ops
memcpy(_, NULL, 0), but the false positive fires on any workload that round-trips empty values throughStringorLowCardinality(String)columns. That makes-fsanitize=undefinedbuilds noisy and hides real findings.Patch
One-line guard around the
memcpy. Behavior is unchanged for non-empty inputs, and the return value matches the previous shape: astring_viewpointing atposwith length 0 for empty inputs.Repro
I hit this in a downstream PHP extension that wraps the library (iliaal/php_clickhouse). Inserting an empty value into a
LowCardinality(String)column under-fsanitize=address,undefinedreproduces it on every read path. Happy to add a unit test inut/if you'd prefer it gated bymake testrather than relying on existing coverage.