Testing¶

Test Organization¶

Tests are organized by component:

tests/api/ — API server tests
tests/cli/ — CLI command tests
tests/core/ — Core module tests
tests/runner/ — Runner tests
tests/e2e/ — End-to-end tests (requires Minikube)
tests/smoke/ — Smoke tests against live environments

Running Tests¶

# Run all unit tests
pytest

# Run specific package tests (matches CI)
pytest tests/api -n auto -vv
pytest tests/cli -n auto -vv
pytest tests/core -n auto -vv
pytest tests/runner -n auto -vv

# Run E2E tests (requires running Minikube)
pytest --e2e -m e2e -vv

# Run smoke tests
./scripts/dev/smoke                          # current stack
./scripts/dev/smoke --stack dev-faber        # target a specific stack
./scripts/dev/smoke -k test_real_llm         # filter tests by name

Smoke Tests¶

Smoke tests validate a deployed environment by running real evals against real models.

cd hawk
hawk login
./scripts/dev/smoke                           # current stack, skip warehouse
./scripts/dev/smoke --stack staging            # target a specific stack
./scripts/dev/smoke --warehouse                # include warehouse checks

After updating dependencies:

pytest tests/smoke -m smoke --smoke -n 10 -vv

E2E Tests¶

E2E tests require a running Minikube cluster. The happy-path test runs a real eval against OpenAI:

# In your .env:
INSPECT_ACTION_API_RUNNER_SECRET_OPENAI_API_KEY=sk-...
INSPECT_ACTION_API_OPENAI_BASE_URL=https://api.openai.com/v1

Then run:

pytest --e2e -m e2e -vv

Testing Tools¶

Tool	Purpose
`pytest-xdist`	Parallel test execution (`-n auto`)
`pytest-asyncio`	Async test support (auto mode)
`pytest-mock`	General mocking
`pyfakefs`	Filesystem mocking
`moto`, `pytest-aioboto3`	AWS mocking
`testcontainers[postgres]`	PostgreSQL containers
`time-machine`	Time mocking

Code Quality Checks¶

Must pass before completion:

ruff check .                    # linting
ruff format . --check           # format check
basedpyright .                  # type checking