static_instruction + instruction pattern for context caching producing a permanently unstable fingerprint #6216

Description

@Jonasb8

🔴 Required Information

Describe the Bug:

GeminiContextCacheManager contains a method, _find_count_of_contents_to_cache, specifically designed to exclude the dynamic instruction_provider content from the cache fingerprint, but it is never called. As a result, using the documented static_instruction + instruction pattern for context caching produces a permanently unstable fingerprint, so the cache never hits.

Root cause (code walkthrough):

static_instruction + instruction is the ADK-recommended pattern for context caching: the static part goes to system_instruction (stable, fingerprinted), while the instruction provider result is appended to llm_request.contents as a user-role Content (dynamic, should be excluded from the fingerprint).

The intended mechanism for excluding this dynamic content exists in gemini_context_cache_manager.py:

def _find_count_of_contents_to_cache(self, contents):
    """Find the number of contents to cache based on user content strategy.
    Strategy: Find the last continuous batch of user contents and cache
    all contents before them.
    """
    last_user_batch_start = len(contents)
    for i in range(len(contents) - 1, -1, -1):
        if contents[i].role == "user":
            last_user_batch_start = i
        else:
            break
    return last_user_batch_start

At turn 1, with instruction_provider appending a user-role block at the end, all contents are user-role → this function returns N=0 → fingerprint = hash(system_instruction + tools) only → stable across all turns.
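To make the returned counts concrete, here is a minimal standalone sketch. It copies the loop quoted above into a free function and runs it on stand-in objects; the `Content` dataclass and the `turn_1` / `turn_2` lists are hypothetical illustrations, not ADK types or real session data.

```python
from dataclasses import dataclass


@dataclass
class Content:
    """Hypothetical stand-in for a content object; only `role` matters here."""
    role: str
    text: str = ""


def find_count_of_contents_to_cache(contents):
    # Same logic as the method quoted above, lifted out of the class.
    last_user_batch_start = len(contents)
    for i in range(len(contents) - 1, -1, -1):
        if contents[i].role == "user":
            last_user_batch_start = i
        else:
            break
    return last_user_batch_start


# Turn 1: the user message plus the appended dynamic block, both user-role.
turn_1 = [Content("user", "user_msg_1"), Content("user", "dynamic_ctx_t1")]

# Turn 2: history now contains a model response before the trailing user batch.
turn_2 = [
    Content("user", "user_msg_1"),
    Content("model", "model_resp_1"),
    Content("user", "user_msg_2"),
    Content("user", "dynamic_ctx_t2"),
]

count_turn_1 = find_count_of_contents_to_cache(turn_1)  # whole list is the trailing user batch
count_turn_2 = find_count_of_contents_to_cache(turn_2)  # trailing user batch starts at index 2
print(count_turn_1, count_turn_2)
```

In both turns the trailing user batch, which includes the dynamic block, falls outside the counted prefix.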

However, in handle_context_caching, the actual fingerprint count is computed as:

# No existing cache metadata - return fingerprint-only metadata
total_contents_count = len(llm_request.contents)  # ← bug: should use _find_count_of_contents_to_cache
fingerprint = self._generate_cache_fingerprint(llm_request, total_contents_count)
return CacheMetadata(fingerprint=fingerprint, contents_count=total_contents_count)

_find_count_of_contents_to_cache is defined but never called anywhere in the codebase.

Why this breaks turn-by-turn:

  • Turn 1 contents (after instruction_provider appends): [user_msg_1, dynamic_ctx_t1] → N=2, fingerprint covers both
  • Turn 2 contents (first N=2): [user_msg_1, model_resp_1]; the model response now occupies the slot where dynamic_ctx_t1 was
  • Fingerprint mismatch → N reset to 4 (total contents) → same problem repeats every turn
  • Cache is never created
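The turn-by-turn mismatch can be reproduced in isolation with a toy model. The sketch below assumes a fingerprint that hashes the system instruction plus the first N contents; `toy_fingerprint` is a hypothetical stand-in and not ADK's `_generate_cache_fingerprint`, and contents are modeled as plain `(role, text)` tuples.

```python
import hashlib
import json


def toy_fingerprint(system_instruction, contents, count):
    """Hypothetical fingerprint: hash the system instruction and the first `count` contents."""
    payload = {"system_instruction": system_instruction, "contents": contents[:count]}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def trailing_user_batch_start(contents):
    """Index where the last continuous run of user-role contents begins."""
    start = len(contents)
    for i in range(len(contents) - 1, -1, -1):
        if contents[i][0] == "user":
            start = i
        else:
            break
    return start


SYSTEM = "static prompt"
turn_1 = [("user", "user_msg_1"), ("user", "dynamic_ctx_t1")]
turn_2 = [
    ("user", "user_msg_1"),
    ("model", "model_resp_1"),
    ("user", "user_msg_2"),
    ("user", "dynamic_ctx_t2"),
]

# Current behavior: N is the total number of contents at turn 1.
n_buggy = len(turn_1)
buggy_match = (
    toy_fingerprint(SYSTEM, turn_1, n_buggy) == toy_fingerprint(SYSTEM, turn_2, n_buggy)
)

# Proposed behavior: N excludes the trailing user batch.
n_fixed = trailing_user_batch_start(turn_1)
fixed_match = (
    toy_fingerprint(SYSTEM, turn_1, n_fixed) == toy_fingerprint(SYSTEM, turn_2, n_fixed)
)

print(buggy_match, fixed_match)  # False True
```

With N=2 the turn-2 prefix contains the model response instead of the dynamic block, so the hashes differ; with N=0 only the system instruction is hashed and the two turns agree.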

Steps to Reproduce:

  1. Create an LlmAgent with static_instruction (stable string) and instruction (dynamic provider returning session-dependent content)
  2. Enable ContextCacheConfig on the App
  3. Run a multi-turn conversation
  4. Enable GOOGLE_ADK_LOG_LEVEL=DEBUG and observe logs

Expected Behavior:

The instruction_provider content (user-role, appended at end of contents) is excluded from the cache fingerprint. The fingerprint covers only system_instruction + tools, which is stable across turns. The cache is created on turn 2 and reused on subsequent turns as long as system_instruction and tools do not change.

Observed Behavior:

The fingerprint includes the instruction_provider content (via len(llm_request.contents)). Since that content changes each turn (or is displaced by the model's response in the first-N window), the fingerprint changes on every turn. Debug logs show:

Cache content fingerprint mismatch
Fingerprints don't match, returning fingerprint-only metadata

The cache is never created. cache_hit_pct = 0%.

Proposed Fix:

In handle_context_caching, replace len(llm_request.contents) with the existing (but uncalled) _find_count_of_contents_to_cache:

# Before (buggy):
total_contents_count = len(llm_request.contents)

# After (fix):
total_contents_count = self._find_count_of_contents_to_cache(llm_request.contents)

This aligns the implementation with the documented static_instruction + instruction pattern and with the evident design intent of _find_count_of_contents_to_cache.

Environment Details:

  • ADK Library Version: google-adk==1.32.0
  • Desktop OS: macOS (Darwin 24.6.0)
  • Python Version: 3.13.11

Model Information:

  • LiteLLM: No
  • Model: gemini-2.0-flash-lite (Gemini API)

🟡 Optional Information

Minimal Reproduction Code:

from google.adk.agents import LlmAgent
from google.adk.apps.app import App
from google.adk.agents.context_cache_config import ContextCacheConfig
from google.adk.agents.readonly_context import ReadonlyContext
from google.adk.models import Gemini

_STATIC_PROMPT = "You are a helpful assistant. " * 300  # large enough to exceed 4096 tokens with tools

def dynamic_instruction(context: ReadonlyContext) -> str:
    # Simulates per-turn dynamic content (e.g. session state)
    return f"<session_state>turn_data={context.state.get('turn', 0)}</session_state>"

agent = LlmAgent(
    name="test_agent",
    model=Gemini(model="gemini-2.0-flash-lite"),
    static_instruction=_STATIC_PROMPT,
    instruction=dynamic_instruction,
)

app = App(
    name="test",
    root_agent=agent,
    context_cache_config=ContextCacheConfig(ttl_seconds=1800, min_tokens=4096),
)
# Run multi-turn: observe "fingerprint mismatch" in DEBUG logs on every turn

How often has this issue occurred?: Always (100%)

Additional Context:

The workaround is to inject dynamic content via a before_model_callback that calls llm_request.contents.insert(0, ...) instead of using instruction_provider. Because the dynamic block is then at position 0 on every turn, the first-N fingerprint window consistently starts with it, and the fingerprint is stable as long as the dynamic content itself doesn't change. This is semantically equivalent to the intended instruction_provider behavior but should not be necessary.
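A minimal sketch of the shape of that workaround follows. To keep it runnable without ADK installed, `Content` and `FakeRequest` are hypothetical stand-ins for the real content and request types, and `inject_dynamic_context` is an illustrative name; in a real agent the function would be passed as `before_model_callback` and would insert an actual content object rather than this dataclass.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace


@dataclass
class Content:
    """Hypothetical stand-in for the real content type."""
    role: str
    text: str


@dataclass
class FakeRequest:
    """Hypothetical stand-in exposing only the `contents` list used by the callback."""
    contents: list = field(default_factory=list)


def inject_dynamic_context(callback_context, llm_request):
    """Prepend the dynamic block so it always sits at index 0 of contents."""
    turn = callback_context.state.get("turn", 0)
    block = Content(
        role="user",
        text=f"<session_state>turn_data={turn}</session_state>",
    )
    llm_request.contents.insert(0, block)
    # Returning None means the callback supplies no replacement response.
    return None


# Demo with stand-in objects: the dynamic block lands before the existing history.
demo_context = SimpleNamespace(state={"turn": 3})
demo_request = FakeRequest(contents=[Content("user", "user_msg_1")])
inject_dynamic_context(demo_context, demo_request)
print([c.text for c in demo_request.contents])
```

Because the block is inserted at index 0 on every call, the first-N window always begins with it, which is what keeps the fingerprint stable while the dynamic content itself is unchanged.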
