Quick Start

This guide takes you from zero to a working Hawk deployment on AWS. You'll need an AWS account and a domain name. For authentication, Hawk creates a Cognito user pool by default; you can use your existing OIDC identity provider instead.

1. Install prerequisites

brew install pulumi awscli uv python@3.13 jq

Or on Linux, install Pulumi, uv, the AWS CLI, Python 3.13+, and jq.
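
Before moving on, it can help to confirm that every tool is actually on your PATH. The following is a minimal sketch, not part of the Hawk repo: `missing_tools` is a hypothetical helper name, and it only checks that each binary exists, not its version (so it will not catch a Python older than 3.13).

```shell
# Print the name of every tool passed as an argument that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Binaries provided by the prerequisites listed above.
missing=$(missing_tools pulumi aws uv python3 jq)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found."
fi
```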

2. Clone the repo

git clone https://github.com/METR/hawk-preview.git
cd hawk-preview

3. Set up Pulumi state backend

aws configure  # or: aws sso login --profile your-profile

Create an S3 bucket and KMS key for Pulumi state:

aws s3 mb s3://my-pulumi-state  # S3 bucket names are globally unique; choose your own
aws kms create-alias --alias-name alias/pulumi-secrets \
  --target-key-id $(aws kms create-key --query KeyMetadata.KeyId --output text)

Log in to the S3 backend:

pulumi login s3://my-pulumi-state
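
The bucket name and KMS alias chosen here reappear in the next step, so keeping them in variables avoids typos. The sketch below only prints the commands it would build; `backend_url` and `secrets_provider` are hypothetical helpers, and the names and region are placeholders. The `?region=` suffix is an optional addition that pins the KMS region when your AWS CLI default region differs; confirm the exact URL form against the Pulumi documentation.

```shell
# Placeholder values; substitute your own bucket, alias, and region.
STATE_BUCKET="my-pulumi-state"
KMS_ALIAS="pulumi-secrets"
KMS_REGION="us-west-2"

# Build the S3 backend URL passed to `pulumi login`.
backend_url() { printf 's3://%s\n' "$1"; }

# Build the awskms secrets-provider URL passed to `pulumi stack init`.
secrets_provider() { printf 'awskms://alias/%s?region=%s\n' "$1" "$2"; }

# Print the commands rather than running them, so you can review them first.
echo "pulumi login $(backend_url "$STATE_BUCKET")"
echo "pulumi stack init my-org --secrets-provider=\"$(secrets_provider "$KMS_ALIAS" "$KMS_REGION")\""
```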

4. Create and configure your stack

cd infra
pulumi stack init my-org --secrets-provider="awskms://alias/pulumi-secrets"
cp ../Pulumi.example.yaml ../Pulumi.my-org.yaml

Edit Pulumi.my-org.yaml with your values. At minimum, you need:

config:
  aws:region: us-west-2
  hawk:domain: hawk.example.com
  hawk:publicDomain: example.com
  hawk:primarySubnetCidr: "10.0.0.0/16"
  hawk:createPublicZone: "true"

That's enough to get started. The environment name defaults to your stack name. Hawk will create a Cognito user pool for authentication automatically.
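
A quick way to catch a missing key before deploying is to grep the stack file for the five keys listed above. This is a rough sketch under the assumption that each key sits on its own line; `missing_config_keys` is a hypothetical helper, not something shipped with Hawk, and it does not validate the values.

```shell
# Print each required config key that does not appear in the given stack file.
missing_config_keys() {
  config_file=$1
  for key in aws:region hawk:domain hawk:publicDomain \
             hawk:primarySubnetCidr hawk:createPublicZone; do
    # Match the key at the start of a line, allowing leading indentation.
    grep -Eq "^[[:space:]]*${key}:" "$config_file" || printf '%s\n' "$key"
  done
}

# Usage (from the infra directory):
#   missing_config_keys ../Pulumi.my-org.yaml
```

Empty output means all five keys are present.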

If you already have an OIDC provider (Okta, Auth0, etc.), you can use it instead:

  # Optional (under config:, alongside the keys above): use your own OIDC provider instead of Cognito
  hawk:oidcClientId: "your-client-id"
  hawk:oidcAudience: "your-audience"
  hawk:oidcIssuer: "https://login.example.com/oauth2/default"

5. Deploy

pulumi up

This creates more than 200 AWS resources, including a VPC, EKS cluster, ALB, ECS services, Aurora PostgreSQL, S3 buckets, and Lambda functions. The first deploy takes about 15-20 minutes.

6. Set up LLM API keys

Hawk routes model API calls through its built-in LLM proxy (Middleman). You need to provide at least one provider's API key:

cd hawk  # from the repository root; use cd ../hawk if you are still in infra/
./scripts/dev/set-api-keys.sh <env> OPENAI_API_KEY=sk-...

This stores the key in Secrets Manager and restarts Middleman. You can set multiple keys at once:

./scripts/dev/set-api-keys.sh <env> OPENAI_API_KEY=sk-... ANTHROPIC_API_KEY=sk-ant-...

Replace <env> with your hawk:env value (e.g., production), which defaults to your stack name. Supported providers: OpenAI, Anthropic, Gemini, DeepInfra, DeepSeek, Fireworks, Mistral, OpenRouter, Together, xAI.
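
If your keys are already exported as environment variables, you can build the KEY=value arguments from them instead of typing secrets on the command line. This is a sketch only: `key_args` is a hypothetical helper, and it assumes set-api-keys.sh accepts any number of KEY=value arguments as shown above. It keeps key values out of your shell history, though they are still visible in the process list while the script runs.

```shell
# Print NAME=value for each named environment variable that is set and non-empty.
key_args() {
  for name in "$@"; do
    # printenv fails for unset variables; skip those.
    value=$(printenv "$name") || continue
    if [ -n "$value" ]; then
      printf '%s=%s\n' "$name" "$value"
    fi
  done
  return 0
}

# Usage (from the hawk directory; assumes the key values contain no whitespace):
#   ./scripts/dev/set-api-keys.sh <env> $(key_args OPENAI_API_KEY ANTHROPIC_API_KEY)
```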

7. Create a user (Cognito only)

If you're using the default Cognito authentication, create a user:

cd hawk  # skip if you are already in hawk/ from step 6
./scripts/dev/create-cognito-user.sh <stack> you@example.com

Replace <stack> with your Pulumi stack name (my-org in this guide). The script reads the Cognito user pool from your Pulumi stack outputs, creates the user, and prints the login credentials. Skip this step if you're using your own OIDC provider.

8. Install the Hawk CLI and run your first eval

uv pip install "hawk[cli] @ git+https://github.com/METR/hawk-preview#subdirectory=hawk"

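# Configure the CLI to point to your deployment (run from the hawk/ directory)
mkdir -p ~/.config/hawk-cli   # the redirect below fails if this directory is missing
uv run python scripts/dev/generate-env.py <stack> > ~/.config/hawk-cli/env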

hawk login
hawk eval-set hawk/examples/simple.eval-set.yaml   # path is relative to the repository root
hawk logs -f   # watch it run
hawk web       # open results in browser