Contributing

Developer Setup

There are two ways to run Hawk locally:

Option A: Against staging (requires AWS access)

cp .env.staging .env
docker compose up --build

This connects to staging RDS, S3, and the staging Kubernetes cluster.

Option B: Fully local with Minikube (no AWS needed)

cp .env.local .env
./scripts/dev/start-minikube.sh

This uses local services only: MinIO for S3, local PostgreSQL, a local Docker registry, and Minikube for Kubernetes.

With either option running, submit an eval set:

hawk eval-set examples/simple.eval-set.yaml

Run k9s to monitor the Inspect pod.
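
Before starting either option, it can help to confirm that the tools this page invokes are on your PATH. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper, and the tool list is only an assumption based on the commands shown above.

```shell
# Hypothetical helper: print each named command that is not found on PATH,
# one per line. Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used directly in the commands above; adjust for the option you chose.
missing=$(missing_tools docker hawk k9s)
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```

The check only reports what is missing; it does not install anything or verify versions.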

Full Dev Stack (API + Viewer + Live Reload)

To develop with hot reload across the full stack, run the following in three separate terminals:

Terminal 1: Library Watch Mode

Inspect AI:

cd ~/inspect_ai/src/inspect_ai/_view/www
yarn install
yarn build:lib --watch

Inspect Scout:

cd ~/inspect_scout/src/inspect_scout/_view/www
pnpm install
pnpm build:lib --watch

Terminal 2: Viewer Dev Server

Update www/package.json so that it points to the local library you are building in Terminal 1, then:

cd www
yarn install
VITE_API_BASE_URL=http://localhost:8080 yarn dev

Terminal 3: API Server

cp .env.staging .env  # Or .env.local for fully-local setup
set -a && source .env && set +a
uv run fastapi run hawk/api/server.py --port=8080 --host=0.0.0.0 --reload
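
The `set -a` / `set +a` pair around `source .env` matters: without it, the variables in `.env` are set in the current shell but not exported, so a child process such as the API server would not see them. A small self-contained demonstration follows; `DEMO_SETTING` and the temporary file are hypothetical and exist only for illustration. It uses `.`, the POSIX spelling of `source`.

```shell
# Write a throwaway env file with one illustrative variable.
unset DEMO_SETTING
envfile=$(mktemp)
printf 'DEMO_SETTING=from-env-file\n' > "$envfile"

# Plain source: the variable is set in this shell but NOT exported,
# so a child process does not see it.
. "$envfile"
plain=$(sh -c 'printf "%s" "${DEMO_SETTING:-unset}"')

# With set -a (allexport): every variable assigned while it is active
# is exported, so the child process sees it.
unset DEMO_SETTING
set -a
. "$envfile"
set +a
exported=$(sh -c 'printf "%s" "${DEMO_SETTING:-unset}"')

rm -f "$envfile"
printf 'plain source: %s\nwith set -a:  %s\n' "$plain" "$exported"
# plain source: unset
# with set -a:  from-env-file
```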

Code Quality

ruff check       # lint
ruff format      # format
basedpyright     # type check
pytest           # unit tests

All code must pass basedpyright with zero errors and zero warnings.
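
To run all four checks in one go and stop at the first failure, a small wrapper along these lines may be convenient. This is a sketch rather than a script that ships with the repository: `run_in_order` is a hypothetical helper, and `ruff format --check` is used in the usage line so that the run reports formatting problems without rewriting files.

```shell
# Hypothetical helper: run each argument as a command, in order,
# and stop at the first one that fails.
run_in_order() {
    for cmd in "$@"; do
        printf '==> %s\n' "$cmd"
        # Unquoted on purpose so "ruff check" splits into command + argument.
        $cmd || { printf 'FAILED: %s\n' "$cmd" >&2; return 1; }
    done
    printf 'All checks passed\n'
}

# Usage, from the repository root:
#   run_in_order "ruff check" "ruff format --check" basedpyright pytest
```

Because each command string is split on whitespace, this approach only suits simple commands without quoted arguments.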

Testing Runner Changes

Build and push a custom runner image:

./scripts/dev/build-and-push-runner-image.sh my-tag
hawk eval-set examples/simple.eval-set.yaml --image-tag my-tag
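
When iterating on runner changes, a fresh tag per build makes it easier to tell images apart. The sketch below shows one way to generate such a tag; `make_runner_tag` is a hypothetical helper and the prefix-plus-timestamp scheme is an assumption, not a project convention.

```shell
# Hypothetical helper: build a tag from a prefix and a UTC timestamp.
# Characters outside A-Za-z0-9_.- in the prefix are replaced with "_".
make_runner_tag() {
    prefix=$(printf '%s' "${1:-dev}" | tr -c 'A-Za-z0-9_.-' '_')
    printf '%s-%s\n' "$prefix" "$(date -u +%Y%m%d-%H%M%S)"
}

tag=$(make_runner_tag "${USER:-dev}")
echo "$tag"

# The tag can then be passed to the two commands shown above:
#   ./scripts/dev/build-and-push-runner-image.sh "$tag"
#   hawk eval-set examples/simple.eval-set.yaml --image-tag "$tag"
```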

Local Minikube Setup

This setup runs Hawk entirely locally, using MinIO for S3, local PostgreSQL, a local Docker registry, and Minikube for Kubernetes. It is the same setup that the E2E tests use in CI.

Prerequisites

You must be inside the devcontainer, which includes minikube, Docker-in-Docker, cilium, kubectl, helm, and gvisor.

Quick Start

cp .env.local .env
./scripts/dev/start-minikube.sh

The script will:

1. Start Minikube with gvisor, containerd, and an insecure local registry
2. Create Kubernetes resources and install Cilium
3. Launch services (API server, MinIO, PostgreSQL, Docker registry)
4. Run a smoke test to verify the cluster works
5. Build and push a dummy runner image
6. Run a simple eval set to verify everything works
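
If you script anything on top of this setup, you may want to wait until the cluster responds before continuing. How start-minikube.sh itself waits is not shown here; the following is only a generic retry sketch, with `retry_until` as a hypothetical helper.

```shell
# Hypothetical helper: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Note: uses global variables (attempts, delay, i), so avoid passing
# a shell function that reuses those names.
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only when another attempt remains.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Example (assumes kubectl is configured for the Minikube cluster):
#   retry_until 30 2 kubectl get nodes
```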

Running Evals Locally

HAWK_API_URL=http://localhost:8080 hawk eval-set examples/simple.eval-set.yaml --image-tag=dummy
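
The `HAWK_API_URL=...` prefix in the command above applies to that single command only. To avoid repeating it, you can export the variable for the rest of the shell session. The sketch below illustrates the difference using a child shell that just echoes the variable; no Hawk command is run.

```shell
# Start from a clean state for the demonstration.
unset HAWK_API_URL

# Prefix form: the variable exists only for that one command.
one_off=$(HAWK_API_URL=http://localhost:8080 sh -c 'printf "%s" "${HAWK_API_URL:-unset}"')
after=$(sh -c 'printf "%s" "${HAWK_API_URL:-unset}"')

# Exported form: every later command in this session sees it.
export HAWK_API_URL=http://localhost:8080
persistent=$(sh -c 'printf "%s" "${HAWK_API_URL:-unset}"')

printf 'prefix: %s\nnext command: %s\nafter export: %s\n' "$one_off" "$after" "$persistent"
# prefix: http://localhost:8080
# next command: unset
# after export: http://localhost:8080
```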

To run real evals, build and push a real runner image:

RUNNER_IMAGE_NAME=localhost:5000/runner ./scripts/dev/build-and-push-runner-image.sh latest

Updating Dependencies (Inspect AI / Inspect Scout)

Use the prepare-release.py script:

# Update to a specific PyPI version
./scripts/ops/prepare-release.py --inspect-ai 0.3.50

# Update to a specific git commit SHA
./scripts/ops/prepare-release.py --inspect-ai abc123def456

# Update Scout
./scripts/ops/prepare-release.py --inspect-scout 0.2.10

The script updates pyproject.toml files, runs uv lock, creates a release branch (for PyPI versions), and publishes any npm packages if needed.
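
The same flag accepts either a PyPI version or a git commit SHA. How prepare-release.py tells them apart is not documented here; the sketch below shows one plausible way to classify the argument before calling the script. `classify_ref` and both patterns are assumptions, and the version pattern is deliberately simple (it does not recognize suffixes such as rc1 or .post1).

```shell
# Hypothetical helper: guess whether an argument looks like a plain
# dotted version (e.g. 0.3.50) or an abbreviated/full git SHA.
classify_ref() {
    if printf '%s' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)+$'; then
        echo pypi
    elif printf '%s' "$1" | grep -Eq '^[0-9a-f]{7,40}$'; then
        echo git
    else
        echo unknown
    fi
}

classify_ref 0.3.50        # prints: pypi
classify_ref abc123def456  # prints: git
classify_ref main          # prints: unknown
```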

Database Migrations

See Database for migration instructions.

Pull Requests

When creating PRs, use the template at .github/pull_request_template.md. The template includes:

- Overview and linked issue
- Approach and alternatives considered
- Testing & validation checklist
- Code quality checklist