MinimalFilter: single-line Python docstring ("""...""") leaves in_docstring=true, suppressing subsequent comment stripping #1322

@shridipavansaber

Description

When a Python line both opens and closes a docstring on the same line (e.g. """Short docstring."""), MinimalFilter flips in_docstring to true and emits the line, but never accounts for the closing """ on that same line. Every following line is then treated as being inside a docstring until the next line that starts with """ appears, so # comments that should be stripped are preserved instead.

Reproduction

#[test]
fn test_minimal_python_docstring_toggle() {
    let filter = MinimalFilter;
    let input = r#"def foo():
    """Short docstring."""
    # this comment should be stripped
    x = 1
    # another comment to strip
    return x
"#;
    let result = filter.filter(input, &Language::Python);
    // Bug: in_docstring is left true after the single-line docstring,
    // so the # comments below it are preserved instead of being stripped.
    assert!(
        !result.contains("# this comment should be stripped"),
        "single-line docstring should not leave in_docstring=true"
    );
    assert!(
        !result.contains("# another comment to strip"),
        "comment after single-line docstring should be stripped"
    );
}

Running this test exposes the bug: both # this comment should be stripped and # another comment to strip appear in the output because in_docstring is true after the """Short docstring.""" line, so the # ... lines hit the if in_docstring { keep line } branch rather than the comment-stripping branch.

Root cause

In MinimalFilter::filter (src/core/filter.rs), the Python docstring branch is:

if *lang == Language::Python && trimmed.starts_with("\"\"\"") {
    in_docstring = !in_docstring;
    result.push_str(line);
    result.push('\n');
    continue;
}

This unconditionally toggles in_docstring whenever a line starts with """. For a single-line docstring like """Short docstring.""", the opening """ flips the flag to true, the line is emitted, and execution moves on. The closing """ on the same line is never seen because the code already called continue. Every subsequent line is then considered inside a docstring.

The fix should check whether the same line also contains a closing """ after the opening one (i.e. the docstring both opens and closes on one line) and, if so, keep in_docstring at false.
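
One possible shape for that check is sketched below. It is a minimal, standalone illustration rather than a patch against src/core/filter.rs: closes_on_same_line and strip_python_comments are hypothetical names, and the loop is a simplified stand-in for the Python path of MinimalFilter::filter that only keeps docstrings and drops # comment lines.

```rust
/// Hypothetical helper (not part of the codebase): returns true when a
/// trimmed line opens a `"""` docstring and also closes it on the same line.
fn closes_on_same_line(trimmed: &str) -> bool {
    const DELIM: &str = "\"\"\"";
    trimmed
        .strip_prefix(DELIM)
        .map_or(false, |rest| rest.contains(DELIM))
}

/// Simplified stand-in for the Python branch of the filter: docstring lines
/// are kept, and `#` comment lines outside a docstring are dropped.
fn strip_python_comments(input: &str) -> String {
    let mut result = String::new();
    let mut in_docstring = false;
    for line in input.lines() {
        let trimmed = line.trim();
        if trimmed.starts_with("\"\"\"") {
            // Toggle only if this line does not both open and close the
            // docstring. A line seen while already inside a docstring is
            // treated as the closing delimiter.
            if in_docstring || !closes_on_same_line(trimmed) {
                in_docstring = !in_docstring;
            }
            result.push_str(line);
            result.push('\n');
            continue;
        }
        if !in_docstring && trimmed.starts_with('#') {
            continue; // strip the comment line
        }
        result.push_str(line);
        result.push('\n');
    }
    result
}
```

With this guard, the """Short docstring.""" line from the reproduction leaves in_docstring at false, so the # lines after it reach the comment-stripping branch. The sketch keeps the original starts_with check, so a multi-line docstring whose closing """ sits at the end of a line of text is still not detected; that case is outside what this issue describes.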

Notes

Found while running tailtest on this codebase. The generated test suite documents the current (buggy) behavior, so its assertions will fail once the bug is fixed.
