# Caching
Cache LLM responses to reduce costs and improve speed.
```mermaid
graph LR
    subgraph "Caching Flow"
        A[Request] --> B{Cache?}
        B -->|Hit| C[Return]
        B -->|Miss| D[LLM]
        D --> E[Store]
        E --> C
    end

    classDef request fill:#6366F1,stroke:#7C90A0,color:#fff
    classDef cache fill:#F59E0B,stroke:#7C90A0,color:#fff
    classDef result fill:#10B981,stroke:#7C90A0,color:#fff

    class A request
    class B,E cache
    class C,D result
```
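The sketch below walks through the same cache-aside flow using the `TestAgentCache` and `CacheKey` classes documented later on this page. The `judge_with_llm` function is a purely hypothetical stand-in for the actual LLM evaluation call, and it is assumed here that `get` returns `None` on a cache miss.

```python
from testagent import TestAgentCache, CacheKey


def judge_with_llm(output: str, criteria: str) -> dict:
    """Hypothetical stand-in for the real LLM-based evaluation call."""
    return {"passed": True, "reasoning": "placeholder verdict"}


cache = TestAgentCache()
key = CacheKey(
    output="test output",
    criteria="is correct",
    model="gpt-4o-mini",
)

result = cache.get(key)        # cache lookup
if result is None:             # miss: evaluate with the LLM...
    result = judge_with_llm("test output", "is correct")
    cache.set(key, result)     # ...and store the verdict for next time
```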
## Enable Caching

Caching is enabled by default, so no extra configuration is needed; repeated evaluations with identical inputs are served from the cache automatically.
## Cache Location

Default: `.testagent_cache/`
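If you need the cache somewhere else (for example, a shared directory in CI), the constructor may accept a location override. The `cache_dir` parameter name below is an assumption for illustration, not confirmed API.

```python
from testagent import TestAgentCache

# Hypothetical: override the default .testagent_cache/ location.
cache = TestAgentCache(cache_dir=".ci_cache/")
```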
## CLI Commands

### Clear Cache
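The cache can also be cleared programmatically with the `clear()` method covered in the TestAgentCache Class section below:

```python
from testagent import TestAgentCache

# Drop every cached evaluation result.
TestAgentCache().clear()
```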
### View Statistics

Shows cache usage statistics, including the number of stored entries.
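These numbers are also available programmatically via `stats()`. `entries` is the only key documented on this page, so treat any other fields as implementation-dependent.

```python
from testagent import TestAgentCache

stats = TestAgentCache().stats()
print(f"Entries: {stats['entries']}")
```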
## TestAgentCache Class
Direct cache access:
```python
from testagent import TestAgentCache, CacheKey

cache = TestAgentCache()

# Create cache key
key = CacheKey(
    output="test output",
    criteria="is correct",
    model="gpt-4o-mini"
)

# Get/set
result = cache.get(key)
cache.set(key, result)

# Statistics
stats = cache.stats()
print(f"Entries: {stats['entries']}")

# Clear
cache.clear()
```
## CacheKey

| Field | Type | Description |
|---|---|---|
| `output` | `str` | Tested output |
| `criteria` | `str` | Evaluation criteria |
| `expected` | `str` | Expected output |
| `model` | `str` | LLM model used |
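Putting the fields together, a fully specified key looks like the sketch below; since these fields make up the key, a cached result should only be reused when all of them match.

```python
from testagent import CacheKey

key = CacheKey(
    output="Paris is the capital of France.",
    criteria="States the correct capital",
    expected="Paris",
    model="gpt-4o-mini",
)
```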
## Best Practices

**When to Cache**
- Repeated tests with same inputs
- CI/CD pipelines
- Development iterations
**When Not to Cache**
- Testing non-deterministic outputs
- When you need fresh evaluations