Skip to content

Contributing

Developer Setup

There are two ways to run Hawk locally:

Option A: Against staging (requires AWS access)

cp .env.staging .env
docker compose up --build

This connects to staging RDS, S3, and the staging Kubernetes cluster.

Option B: Fully local with Minikube (no AWS needed)

cp .env.local .env
./scripts/dev/start-minikube.sh

This uses local services only: MinIO for S3, local PostgreSQL, a local Docker registry, and Minikube for Kubernetes.

Then submit evals:

hawk eval-set examples/simple.eval-set.yaml

Run k9s to monitor the Inspect pod.

Full Dev Stack (API + Viewer + Live Reload)

For developing with hot reload across the full stack:

Terminal 1: Library Watch Mode

cd ~/inspect_ai/src/inspect_ai/_view/www
yarn install
yarn build:lib --watch
cd ~/inspect_scout/src/inspect_scout/_view/www
pnpm install
pnpm build:lib --watch

Terminal 2: Viewer Dev Server

Update www/package.json to point to your local library, then:

cd www
yarn install
VITE_API_BASE_URL=http://localhost:8080 yarn dev

Terminal 3: API Server

cp .env.staging .env  # Or .env.local for fully-local setup
set -a && source .env && set +a
uv run fastapi run hawk/api/server.py --port=8080 --host=0.0.0.0 --reload

Code Quality

ruff check       # lint
ruff format      # format
basedpyright     # type check
pytest           # unit tests

All code must pass basedpyright with zero errors and zero warnings.

Testing Runner Changes

Build and push a custom runner image:

./scripts/dev/build-and-push-runner-image.sh my-tag
hawk eval-set examples/simple.eval-set.yaml --image-tag my-tag

Local Minikube Setup

This runs Hawk entirely locally. It uses MinIO for S3, local PostgreSQL, a local Docker registry, and Minikube for Kubernetes. This is the same setup used by E2E tests in CI.

Prerequisites

You must be inside the devcontainer, which includes minikube, Docker-in-Docker, cilium, kubectl, helm, and gvisor.

Quick Start

cp .env.local .env
./scripts/dev/start-minikube.sh

The script will:

  1. Start Minikube with gvisor, containerd, and an insecure local registry
  2. Create Kubernetes resources and install Cilium
  3. Launch services (API server, MinIO, PostgreSQL, Docker registry)
  4. Run a smoke test to verify the cluster works
  5. Build and push a dummy runner image
  6. Run a simple eval set to verify everything works

Running Evals Locally

HAWK_API_URL=http://localhost:8080 hawk eval-set examples/simple.eval-set.yaml --image-tag=dummy

To run real evals, build and push a real runner image:

RUNNER_IMAGE_NAME=localhost:5000/runner ./scripts/dev/build-and-push-runner-image.sh latest

Updating Dependencies (Inspect AI / Inspect Scout)

Use the prepare-release.py script:

# Update to a specific PyPI version
./scripts/ops/prepare-release.py --inspect-ai 0.3.50

# Update to a specific git commit SHA
./scripts/ops/prepare-release.py --inspect-ai abc123def456

# Update Scout
./scripts/ops/prepare-release.py --inspect-scout 0.2.10

The script updates pyproject.toml files, runs uv lock, creates a release branch (for PyPI versions), and publishes any npm packages if needed.

Database Migrations

See Database for migration instructions.

Pull Requests

When creating PRs, use the template at .github/pull_request_template.md. The template includes:

  • Overview and linked issue
  • Approach and alternatives considered
  • Testing & validation checklist
  • Code quality checklist