gh-149244: Support iterator inputs in covariance, correlation, and linear_regression #149245

Closed
htjworld wants to merge 1 commit into python:main from htjworld:gh-149244-iterator-inputs-statistics

Conversation

@htjworld
Contributor

@htjworld htjworld commented May 1, 2026

Closes #149244.

statistics.covariance(), statistics.correlation(), and
statistics.linear_regression() previously called len() directly on their
inputs, raising TypeError for iterators and generators. This was inconsistent
with the rest of the module — mean(), variance(), and stdev() all accept
any iterable.

The fix adds x = list(x) and y = list(y) at the start of each function.
This is in line with the internal _ss() helper, whose docstring states
"Calculations are done in a single pass, allowing the input to be an iterator."
The list conversion also correctly handles repeated iteration:
covariance() iterates each input twice (fsum() then sumprod()), and
linear_regression() already notes in a comment that x must be a list
"because used three times below."

Tests for iter() and generator expression inputs are added to
TestCorrelationAndCovariance and TestLinearRegression. The documentation
for all three functions is updated to reflect that sequences or iterables are
accepted.

@python-cla-bot

python-cla-bot Bot commented May 1, 2026

All commit authors signed the Contributor License Agreement.

CLA signed

@read-the-docs-community

Documentation build overview

📚 cpython-previews | 🛠️ Build #32501469 | 📁 Comparing e48aa88 against main (9668d26)

3 files changed
± download.html
± library/statistics.html
± whatsnew/changelog.html

@rhettinger rhettinger self-assigned this May 1, 2026
@rhettinger rhettinger closed this May 1, 2026
@htjworld
Contributor Author

htjworld commented May 2, 2026

Hi @rhettinger, could you share your thinking on closing this? I'd genuinely like to understand — whether it's the approach, a design decision I missed, or something else entirely.

@rhettinger
Contributor

It is fine for these functions to just support sequences.

Also, running list(it) just slows down the common cases with an unnecessary copy.

Development

Successfully merging this pull request may close these issues.

statistics: covariance, correlation, and linear_regression do not accept iterator inputs
