feat(conversation_manager): add token-aware context management#2147
Open
srbhsrkr wants to merge 3 commits into strands-agents:main from
Conversation
Add token-budget awareness to `SlidingWindowConversationManager` and `SummarizingConversationManager` so context reduction can be driven by estimated token counts, not just message counts. Key changes:

- New `_token_utils.py` with `estimate_tokens` (chars/4 heuristic) and `TokenCounter` type alias, handling all `ContentBlock` types (text, toolResult, toolUse, image, document, video, reasoningContent, etc.)
- `SlidingWindowConversationManager`: new `max_context_tokens`, `token_counter`, and `compactable_after_messages` parameters; proactive token-budget enforcement via `BeforeModelCallEvent` hook; micro-compaction of stale tool results with `_last_compacted_index` tracking
- `SummarizingConversationManager`: new `max_context_tokens`, `proactive_threshold`, and `token_counter` parameters; proactive summarization via hook when the token threshold is exceeded
- Always uses the heuristic estimator (never the stale model-reported `latest_context_size`) to prevent over-reduction spirals
- 55 new tests covering token estimation, budget enforcement, micro-compaction, parameter validation, integration flows, and `_model_call_count` semantics regression

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
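The chars/4 heuristic described above can be sketched in a few lines. This is a minimal illustration, not the PR's actual `_token_utils.py`: the real module handles all `ContentBlock` types, and the `IMAGE_CHAR_ESTIMATE` value and message shapes below are assumptions.

```python
from typing import Any, Callable

# Type alias for pluggable token counting; the exact signature in the PR
# may differ — this is an assumption based on the description above.
TokenCounter = Callable[[list[dict[str, Any]]], int]

# Illustrative per-block character estimate for image content
# (not the PR's actual constant).
IMAGE_CHAR_ESTIMATE = 800

def estimate_tokens(messages: list[dict[str, Any]]) -> int:
    """Estimate token count with the chars/4 heuristic over content blocks."""
    chars = 0
    for message in messages:
        for block in message.get("content", []):
            if "text" in block:
                chars += len(block["text"])
            elif "toolUse" in block or "toolResult" in block:
                chars += len(str(block))  # serialize structured blocks
            elif "image" in block:
                chars += IMAGE_CHAR_ESTIMATE  # flat estimate for binary content
            else:
                chars += len(str(block))  # fallback for other block types
    return chars // 4

msgs = [{"role": "user", "content": [{"text": "a" * 400}]}]
print(estimate_tokens(msgs))  # 100
```

The point of the heuristic is that it is cheap, deterministic, and never stale, which is what makes it safe to call on every model invocation.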
Add `read_only`, `destructive`, and `requires_confirmation` boolean parameters to the `@tool` decorator and corresponding properties on `AgentTool`, `ToolSpec`, and `MCPAgentTool`. This enables hook-based permission policies to reason about tool safety without hardcoding tool-name mappings.

Closes strands-agents#2154

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
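The safety-annotation idea from this commit can be sketched with a toy decorator. This is not the real strands-agents `@tool` API (whose signature may differ); only the three parameter names come from the commit message above.

```python
from typing import Any, Callable

def tool(read_only: bool = False, destructive: bool = False,
         requires_confirmation: bool = False) -> Callable:
    """Toy stand-in for the @tool decorator: attach safety flags to a function."""
    def wrap(fn: Callable) -> Callable:
        fn.tool_spec = {  # hypothetical attribute name for illustration
            "name": fn.__name__,
            "read_only": read_only,
            "destructive": destructive,
            "requires_confirmation": requires_confirmation,
        }
        return fn
    return wrap

@tool(read_only=True)
def list_files(path: str) -> list:
    return []

@tool(destructive=True, requires_confirmation=True)
def delete_file(path: str) -> None:
    pass

def needs_approval(fn: Any) -> bool:
    """A permission policy can now reason about safety generically,
    with no hardcoded tool-name mapping."""
    spec = fn.tool_spec
    return spec["destructive"] or spec["requires_confirmation"]

print(needs_approval(list_files), needs_approval(delete_file))  # False True
```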
…mmarization

- Loop `reduce_context` in `apply_management` until the token budget is satisfied or no further progress can be made, fixing cases where a single `reduce_context` call was insufficient for large messages under `window_size`
- Prevent double `apply_management` calls in `_on_before_model_call` by unifying token-budget and `per_turn` triggers into a single dispatch
- Fix `_micro_compact` image reclaimed accounting to use `IMAGE_CHAR_ESTIMATE` instead of a hardcoded 200, and subtract stub length from `reclaimed_chars`
- Add a `_do_proactive_summarization` guard in `SummarizingConversationManager` to prevent the hook and `apply_management` from both triggering summarization in the same agent cycle
- Make `SummarizingConversationManager.apply_management` honor the token budget contract instead of being a silent no-op
- Rename `_IMAGE_CHAR_ESTIMATE` to `IMAGE_CHAR_ESTIMATE` (cross-module usage)
- Use `len(messages)` as the loop bound instead of a fixed constant

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
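The first and last bullets above describe the same fix: reduce in a loop, bounded by `len(messages)`. A self-contained toy of that loop structure, where `DummyManager`/`DummyAgent` and the simplified estimator are stand-ins for the PR's real classes:

```python
def estimate_tokens(messages):
    # simplified chars/4 heuristic over text blocks
    return sum(len(b.get("text", "")) for m in messages for b in m["content"]) // 4

class DummyManager:
    def reduce_context(self, agent):
        if agent.messages:
            agent.messages.pop(0)  # evict the oldest message

class DummyAgent:
    def __init__(self, messages):
        self.messages = messages

def apply_management(manager, agent, max_context_tokens):
    # len(messages) as the loop bound guarantees termination even when
    # each reduce_context pass removes only a single message.
    for _ in range(len(agent.messages)):
        if estimate_tokens(agent.messages) <= max_context_tokens:
            break  # token budget satisfied
        before = len(agent.messages)
        manager.reduce_context(agent)
        if len(agent.messages) >= before:
            break  # no further progress can be made

agent = DummyAgent([{"content": [{"text": "x" * 80}]} for _ in range(5)])
apply_management(DummyManager(), agent, max_context_tokens=40)
print(len(agent.messages))  # 2
```

The progress check matters: without it, a reduction pass that cannot shrink the conversation any further would spin until the bound is exhausted instead of exiting immediately.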
Summary
Closes #2146
Addresses #1294, #555, #298
Related to #1295, #1678, #1296, #2048
- New `_token_utils.py` with `estimate_tokens` (chars/4 heuristic) covering all `ContentBlock` types, and a `TokenCounter` type alias for pluggable token counting ([CONTEXT] [FEATURE] Token Estimation API #1294)
- Add `max_context_tokens`, `token_counter`, and `compactable_after_messages` to `SlidingWindowConversationManager` for token-budget enforcement and micro-compaction of stale tool results ([FEATURE] Proactive Context Compression #555, [FEATURE] In-envent-loop cycle context management #298)
- Add `max_context_tokens`, `proactive_threshold`, and `token_counter` to `SummarizingConversationManager` for proactive summarization when the token threshold is exceeded ([FEATURE] Proactive Context Compression #555)
- Always use the heuristic estimator (never the stale model-reported `latest_context_size`) to prevent over-reduction spirals
- The hook calls `apply_management()` (not `reduce_context()` directly) to ensure micro-compaction runs before trimming

How this relates to existing issues
- Token Estimation API (#1294): `estimate_tokens()` + `TokenCounter` type on conversation managers. Complementary to a future `Model.estimate_tokens()`: ours is the lightweight heuristic, theirs would be model-specific.
- Proactive Context Compression (#555): `max_context_tokens` + the `BeforeModelCallEvent` hook trigger reduction before `ContextWindowOverflowException` is raised.
- In-event-loop context management (#298): `per_turn` + `compactable_after_messages` + hook-based token budget checks enable within-cycle management.
- If `model.context_limit` ships, it could auto-configure `max_context_tokens`.
- `apply_management()` → `reduce_context()`, but this PR doesn't fire a dedicated event.

Design Notes
- `_model_call_count` only increments when `per_turn` is enabled (preserves existing per-turn semantics)
- `SummarizingConversationManager.apply_management` is an intentional no-op: proactive summarization runs exclusively via the hook to prevent double-summarization with the agent's finally block
- `_last_compacted_index` tracks compaction progress to avoid re-scanning already-processed messages
- Token-budget enforcement only activates when `max_context_tokens is not None`

Test plan
- 55 new tests in `test_token_aware_context_management.py` covering:
  - budget enforcement via `apply_management` and the `BeforeModelCallEvent` hook
  - parameter validation (`max_context_tokens`, `compactable_after_messages`)
  - `_model_call_count` semantics regression (not incremented when `per_turn=False`)
  - the `apply_management` → `reduce_context` full pipeline
  - `_last_compacted_index` adjustment after message trimming
- Lint clean (`ruff check`), type clean (`mypy`)
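As a rough end-to-end illustration, the proactive flow this PR describes (a budget check in a before-model-call hook, trimming oldest messages first) can be simulated in a few lines. All class and method names here are illustrative stand-ins, not the SDK's actual API:

```python
def estimate_tokens(messages):
    # simplified chars/4 heuristic over text blocks
    return sum(len(b.get("text", "")) for m in messages for b in m["content"]) // 4

class TokenBudgetHook:
    """Toy hook: enforce a token budget before every model call."""

    def __init__(self, max_context_tokens):
        self.max_context_tokens = max_context_tokens

    def on_before_model_call(self, messages):
        # Drop the oldest messages until the estimated size fits the
        # budget, always keeping at least the most recent message.
        while len(messages) > 1 and estimate_tokens(messages) > self.max_context_tokens:
            messages.pop(0)
        return messages

hook = TokenBudgetHook(max_context_tokens=50)
msgs = [{"content": [{"text": "x" * 100}]} for _ in range(5)]
trimmed = hook.on_before_model_call(msgs)
print(len(trimmed), estimate_tokens(trimmed))  # 2 50
```

Because the check runs before the call rather than after a failure, the model never sees a request that the heuristic predicts will overflow, which is the core difference from reactive `ContextWindowOverflowException` handling.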