Core API Reference

The core functions and classes for testing outputs with an LLM.

Functions

test()

def test(
    output: str,
    expected: Optional[str] = None,
    criteria: Optional[str] = None,
    **kwargs: Any,
) -> TestResult

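Evaluates `output` with an LLM and returns a scored `TestResult`. Pass `expected` to compare against a reference output, or `criteria` to supply custom evaluation criteria.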

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `output` | `str` | required | The output to test |
| `expected` | `Optional[str]` | `None` | Expected output to compare against |
| `criteria` | `Optional[str]` | `None` | Custom evaluation criteria |

Returns: TestResult

Example:

from testagent import test

result = test("Paris", criteria="is a city name")
assert result.passed
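
A bare `assert result.passed` says nothing about why a test failed. The sketch below shows one way to attach the documented `score` and `reasoning` fields to the assertion message. It uses a hypothetical stand-in object rather than a real `test()` call so that it runs without a model or network access; `format_failure` is an illustrative helper, not part of the library.

```python
from types import SimpleNamespace


def format_failure(result) -> str:
    """Build an assertion message from the documented TestResult fields."""
    return f"score={result.score}: {result.reasoning}"


# Hypothetical stand-in for a TestResult; a real one would come from test().
result = SimpleNamespace(score=3.0, passed=False, reasoning="Not a city name")

try:
    assert result.passed, format_failure(result)
except AssertionError as exc:
    # The message now carries the score and the explanation.
    message = str(exc)
```

In a real test the same pattern would read `assert result.passed, format_failure(result)`.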

accuracy()

def accuracy(
    output: str,
    expected: str,
    **kwargs: Any,
) -> TestResult

Tests how accurately `output` matches the `expected` value.

Example:

from testagent import accuracy

result = accuracy("4", expected="4")

criteria()

def criteria(
    output: str,
    criteria: str,
    **kwargs: Any,
) -> TestResult

Test output against custom criteria.

Example:

from testagent import criteria

result = criteria("Hello!", criteria="is a greeting")
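
The three signatures suggest that `accuracy()` and `criteria()` are narrower entry points to the same evaluation as `test()`. This page does not say how they are implemented, so the sketch below only illustrates that assumed thin-wrapper relationship. All `stub_*` names are hypothetical, and no model is called.

```python
from typing import Any, Optional


def stub_test(output: str, expected: Optional[str] = None,
              criteria: Optional[str] = None, **kwargs: Any) -> dict:
    """Stand-in for test(): records which inputs were supplied."""
    if expected is not None and criteria is not None:
        mode = "expected+criteria"
    elif expected is not None:
        mode = "expected"
    elif criteria is not None:
        mode = "criteria"
    else:
        mode = "general"
    return {"output": output, "expected": expected,
            "criteria": criteria, "mode": mode}


def stub_accuracy(output: str, expected: str, **kwargs: Any) -> dict:
    # Assumption: accuracy() forwards to test() with `expected` required.
    return stub_test(output, expected=expected, **kwargs)


def stub_criteria(output: str, criteria: str, **kwargs: Any) -> dict:
    # Assumption: criteria() forwards to test() with `criteria` required.
    return stub_test(output, criteria=criteria, **kwargs)


acc = stub_accuracy("4", expected="4")
crit = stub_criteria("Hello!", criteria="is a greeting")
```

Under this assumption, the wrappers differ from `test()` only in which argument they make mandatory.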

Classes

TestAgent

Main class for running AI tests with a shared configuration. `run()` evaluates synchronously; `run_async()` is its awaitable counterpart.

class TestAgent:
    def __init__(self, config: Optional[TestConfig] = None): ...
    def run(self, output: str, expected: Optional[str] = None, criteria: Optional[str] = None) -> TestResult: ...
    async def run_async(self, output: str, expected: Optional[str] = None, criteria: Optional[str] = None) -> TestResult: ...

Example:

from testagent import TestAgent, TestConfig

tester = TestAgent(config=TestConfig(model="gpt-4"))
result = tester.run("output", criteria="is correct")
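
Because `run_async()` is a coroutine, several evaluations can be awaited concurrently. The sketch below shows the pattern with `asyncio.gather`. `StubAgent` and `run_batch` are hypothetical: the stub mirrors the documented method signatures but replaces the LLM call with an exact-match rule and returns a plain `bool` instead of a `TestResult`, so the example runs offline.

```python
import asyncio
from typing import List, Optional, Tuple


class StubAgent:
    """Hypothetical stand-in exposing the documented run/run_async signatures."""

    def run(self, output: str, expected: Optional[str] = None,
            criteria: Optional[str] = None) -> bool:
        # Toy rule instead of an LLM call: exact match against `expected`.
        return expected is None or output == expected

    async def run_async(self, output: str, expected: Optional[str] = None,
                        criteria: Optional[str] = None) -> bool:
        await asyncio.sleep(0)  # yield control, as a network call would
        return self.run(output, expected=expected, criteria=criteria)


async def run_batch(agent: StubAgent,
                    cases: List[Tuple[str, str]]) -> List[bool]:
    """Evaluate all (output, expected) pairs concurrently, preserving order."""
    return await asyncio.gather(
        *(agent.run_async(out, expected=exp) for out, exp in cases)
    )


results = asyncio.run(run_batch(StubAgent(), [("4", "4"), ("5", "4")]))
```

With a real `TestAgent`, the same `run_batch` shape would apply, with each element of `results` being a `TestResult`.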

TestResult

Result of an AI test.

| Field | Type | Description |
|---|---|---|
| `score` | `float` | Score from 0 to 10 |
| `passed` | `bool` | `True` if `score` ≥ `threshold` |
| `reasoning` | `str` | AI's explanation |
| `criteria` | `str` | Criteria used |
| `expected` | `str` | Expected output |
| `output` | `str` | Tested output |
| `duration` | `float` | Test duration |
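
The relationship between `score`, the threshold, and `passed` can be sketched with a small dataclass. `ResultSketch` is an illustrative mirror of the documented fields, not the library's class. Storing `threshold` on the result and the optional typing of `criteria` and `expected` are assumptions made so the example is self-contained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultSketch:
    """Illustrative mirror of the documented fields; not the library's class."""
    score: float
    reasoning: str = ""
    criteria: Optional[str] = None
    expected: Optional[str] = None
    output: str = ""
    duration: float = 0.0
    threshold: float = 7.0  # assumption: kept alongside the score

    @property
    def passed(self) -> bool:
        # Documented rule: the test passes when score >= threshold.
        return self.score >= self.threshold


borderline = ResultSketch(score=7.0, reasoning="Meets the criteria")
failing = ResultSketch(score=6.9, reasoning="Slightly off")
```

Note that a score exactly equal to the threshold passes, because the comparison is inclusive.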

TestConfig

Configuration for AI testing.

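| Option | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `"gpt-4o-mini"` | LLM model used for evaluation |
| `threshold` | `float` | `7.0` | Minimum score required to pass |
| `temperature` | `float` | `0.0` | LLM sampling temperature |
| `verbose` | `bool` | `False` | Enable verbose output |
| `cache_enabled` | `bool` | `True` | Enable caching |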
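
The defaults above can be pictured as a dataclass from which stricter variants are derived. `ConfigSketch` is a hypothetical stand-in, not the library's `TestConfig`. The frozen dataclass and the range check on `threshold` are assumptions, motivated only by the documented 0 to 10 score range.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ConfigSketch:
    """Hypothetical mirror of the documented options and their defaults."""
    model: str = "gpt-4o-mini"
    threshold: float = 7.0
    temperature: float = 0.0
    verbose: bool = False
    cache_enabled: bool = True

    def __post_init__(self) -> None:
        # Assumption: a threshold outside the 0-10 score range is a mistake.
        if not 0.0 <= self.threshold <= 10.0:
            raise ValueError("threshold must be between 0 and 10")


default = ConfigSketch()
# Derive a stricter configuration without mutating the default one.
strict = replace(default, threshold=9.0)
```

With the real library, the equivalent would presumably be `TestConfig(threshold=9.0)` passed to `TestAgent(config=...)`, as in the `TestAgent` example.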