fix: handle BaseException in trace spans to prevent span leaks on Key…#2068
fix: handle BaseException in trace spans to prevent span leaks on Key…#2068Di-Is wants to merge 3 commits intostrands-agents:mainfrom
Conversation
…boardInterrupt Trace spans were not properly closed when BaseException (e.g. KeyboardInterrupt, asyncio.CancelledError) was raised. Add explicit BaseException handlers to close spans and aclose() calls to ensure async generators are cleaned up.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
| ): | ||
| yield event | ||
| ) | ||
| try: |
There was a problem hiding this comment.
nit: would it be better to move try on top of streamed_events = stream_messages(...)?
| raise | ||
| except BaseException as e: | ||
| tracer.end_span_with_error(cycle_span, str(e), e) | ||
| raise |
There was a problem hiding this comment.
Issue: The except Exception and except BaseException handlers here have identical bodies — both call tracer.end_span_with_error and re-raise. Since Exception is a subclass of BaseException, the except BaseException handler alone would catch both.
Suggestion: Consolidate into a single handler:
except BaseException as e:
tracer.end_span_with_error(cycle_span, str(e), e)
raiseNote: The other two locations (lines 233-250 and 398-430) are correctly separated since the except Exception handlers do additional work (wrapping in EventLoopException, retry logic, etc.) that should not apply to BaseException subclasses.
| async for event in streamed_events: | ||
| yield event | ||
| finally: | ||
| await streamed_events.aclose() |
There was a problem hiding this comment.
Issue: streamed_events is initialized to None (line 336), but the finally block unconditionally calls await streamed_events.aclose(). If stream_messages() raises before returning the generator object (e.g., due to a future signature change or dynamic error), this would raise AttributeError: 'NoneType' object has no attribute 'aclose' — masking the original exception.
Currently this is safe because stream_messages is an async generator function (calling it just creates the object without executing any code), but the code is fragile.
Suggestion: Add a guard:
finally:
if streamed_events is not None:
await streamed_events.aclose()The same pattern at line 162 for model_events has the same concern, though it's slightly safer because model_events is assigned before the try block.
|
Assessment: Comment Solid bug fix that closes a real span-leak gap for Review Details
|
|
@Di-Is can you also merge from main? there are some conflicts |
|
The conflict has been resolved. |
Description
PR #2032 switched spans from
end_on_exit=Trueto explicitspan.end()calls, fixing span timing. However, the existingexcept Exceptionhandlers don't catchBaseExceptionsubclasses likeKeyboardInterruptandasyncio.CancelledError, so spans leak when these occur.This PR ensures spans are properly closed on all exit paths, including
BaseExceptioninterruptions and premature async generator teardown.Follow-up to #2032 (which resolved #1930), split from #1939 per reviewer request.
Public API Changes
No public API changes. Internal tracer method signatures are widened from
ExceptiontoBaseException(backward-compatible).Related Issues
Documentation PR
N/A
Type of Change
Bug fix
Testing
hatch run prepareChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.