Securing Ollama with API Key Authentication Using Caddy Reverse Proxy
Ollama has no built-in authentication. Here's how I secured my public-facing Ollama API with Caddy reverse proxy and X-API-Key header validation — complete with Windows setup, CORS handling, and Vercel integration.
If you're running Ollama with port forwarding (like I described in my RTX 3090 guide), anyone who discovers your IP can use your GPU for free. Ollama has zero built-in authentication. Here's how I fixed that.
The Problem
Default Ollama setup:
Internet → Router:11434 → Ollama:11434 (NO AUTH)
Anyone can:
curl http://your-ip:11434/api/generate -d '{"model":"qwen3:30b","prompt":"free GPU!"}'
This means:
- Free compute for strangers — They use your electricity and GPU
- Model abuse — Generate harmful content on your hardware
- DoS risk — Overload your GPU with requests
The Solution: Caddy as Auth Proxy
Caddy is a modern web server with automatic HTTPS. It's simpler than Nginx and perfect for this use case.
Architecture after setup:
Internet → Router:11434 → Caddy:11435 (checks X-API-Key) → Ollama:11434
↓ No key?
401 Unauthorized
Step-by-Step Setup (Windows)
1. Download Caddy
Get the Windows binary from caddyserver.com/download.
Place caddy.exe in a permanent location like C:\Tools\Caddy\.
2. Create the Caddyfile
Create C:\Tools\Caddy\Caddyfile:
:11435 {
    # Health check (no auth needed)
    handle /health {
        respond "OK" 200
    }

    # API key validation
    @valid_key {
        header X-API-Key "your-secret-key-here-make-it-long-and-random"
    }

    # Authenticated requests → proxy to Ollama
    handle @valid_key {
        reverse_proxy localhost:11434 {
            header_up Host {http.request.host}
            header_up X-Real-IP {http.request.remote.host}
        }
    }

    # Unauthorized requests
    handle {
        respond "Unauthorized" 401
    }

    # Logging
    log {
        output file C:\Tools\Caddy\access.log
        format json
    }
}
3. Generate a Strong API Key
# On Linux/Mac
openssl rand -hex 32
# Output: a1b2c3d4e5f6...64 characters
# Or just use a password generator — aim for 32+ characters
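If you'd rather generate the key from code (handy on Windows, where openssl isn't always available), Node's built-in crypto module produces the same shape of key. A minimal sketch, assuming a Node/TypeScript environment:

```typescript
import { randomBytes } from "node:crypto";

// 32 random bytes rendered as 64 hex characters -- the same output
// shape as `openssl rand -hex 32`.
export function generateApiKey(bytes = 32): string {
  return randomBytes(bytes).toString("hex");
}

console.log(generateApiKey());
```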
4. Update Router Port Forwarding
Change from:
External 11434 → Your PC:11434
To:
External 11434 → Your PC:11435 (Caddy's port)
The key insight: external port stays the same so your apps don't need changes. Only the internal destination changes to Caddy.
5. Start Caddy
cd C:\Tools\Caddy
.\caddy.exe run --config Caddyfile
6. Test
# No key → 401
curl http://your-public-ip:11434/api/tags
# Unauthorized
# Wrong key → 401
curl -H "X-API-Key: wrong" http://your-public-ip:11434/api/tags
# Unauthorized
# Correct key → 200
curl -H "X-API-Key: your-secret-key" http://your-public-ip:11434/api/tags
# {"models":[{"name":"qwen3:30b",...}]}
Handling CORS (For Browser Apps)
If your frontend calls Ollama directly from the browser, you need CORS headers. Update the Caddyfile:
:11435 {
    # CORS preflight
    @cors_preflight method OPTIONS
    handle @cors_preflight {
        header Access-Control-Allow-Origin "*"
        header Access-Control-Allow-Methods "GET, POST, OPTIONS"
        header Access-Control-Allow-Headers "Content-Type, X-API-Key"
        header Access-Control-Max-Age "86400"
        respond "" 204
    }

    @valid_key header X-API-Key "your-secret-key"
    handle @valid_key {
        header Access-Control-Allow-Origin "*"
        reverse_proxy localhost:11434
    }

    handle {
        respond "Unauthorized" 401
    }
}
⚠️ Security note: In production, replace * with your specific domain (https://sysofti.com).
Integration with Vercel (Next.js)
For server-side API routes (recommended — don't expose your key to browsers):
1. Set Environment Variable
vercel env add OLLAMA_API_KEY
# Enter your secret key
2. Server-Side API Route
// src/app/api/chat/route.ts
export async function POST(request: Request) {
  const { messages } = await request.json()

  const response = await fetch('http://your-ip:11434/api/chat', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-Key': process.env.OLLAMA_API_KEY!,
    },
    body: JSON.stringify({
      model: 'qwen3:30b',
      messages,
      stream: false,
    }),
  })

  const data = await response.json()
  return Response.json(data)
}
The API key stays server-side. Browsers call your Vercel route, which proxies to Ollama with the key.
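On the browser side, the request targets your own route and carries no secret at all. A sketch of what that call looks like — buildChatRequest is a hypothetical helper, not part of the Next.js API:

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the fetch options for calling the Vercel route. Note there is
// no X-API-Key here -- the server-side route adds it before forwarding
// to Ollama.
export function buildChatRequest(messages: ChatMessage[]): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  };
}

// In the browser:
// const res = await fetch("/api/chat", buildChatRequest([{ role: "user", content: "Hi" }]));
```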
Running Caddy as a Windows Service
Don't want to keep a terminal open? You can register Caddy as a service. One caveat: sc.exe expects a binary that speaks the Windows service protocol, so if the service fails to start (error 1053), wrap caddy.exe with a service manager such as NSSM or WinSW instead.
# Run PowerShell as Administrator
sc.exe create CaddyProxy `
binPath= "C:\Tools\Caddy\caddy.exe run --config C:\Tools\Caddy\Caddyfile" `
start= auto `
DisplayName= "Caddy Ollama Proxy"
sc.exe start CaddyProxy
Now Caddy starts automatically on boot.
Monitoring & Logs
Check who's trying to access your API:
# View recent access logs
Get-Content C:\Tools\Caddy\access.log -Tail 20 | ConvertFrom-Json | Format-Table
Look for repeated 401s from the same IP — that's someone probing your endpoint.
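Because the log is one JSON object per line, you can also scan it programmatically. A sketch in TypeScript — the field names status and request.remote_ip match recent Caddy releases; older versions log request.remote_addr instead:

```typescript
// Count 401 responses per client IP from Caddy's JSON access log.
export function count401sByIp(logLines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of logLines) {
    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip non-JSON lines
    }
    if (entry?.status !== 401) continue;
    const ip: string =
      entry.request?.remote_ip ?? entry.request?.remote_addr ?? "unknown";
    counts.set(ip, (counts.get(ip) ?? 0) + 1);
  }
  return counts;
}
```

Any IP with a high count is probing your endpoint and is a candidate for a firewall block.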
Advanced: Rate Limiting
The rate_limit directive isn't in stock Caddy — it comes from the caddy-ratelimit plugin, so you need a custom build (for example, xcaddy build --with github.com/mholt/caddy-ratelimit). With that module installed, add rate limiting to prevent abuse even with valid keys:
:11435 {
    @valid_key header X-API-Key "your-secret-key"
    handle @valid_key {
        rate_limit {
            zone dynamic_zone {
                key {http.request.remote.host}
                events 30
                window 1m
            }
        }
        reverse_proxy localhost:11434
    }

    handle {
        respond "Unauthorized" 401
    }
}
This limits to 30 requests per minute per IP.
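Clients that blow through the limit get HTTP 429, typically with a Retry-After header saying how long to wait. A small helper a client could use to honor it — a sketch, and retryAfterMs is my own name, not a library function:

```typescript
// Turn a Retry-After header value (seconds form) into a wait in
// milliseconds, falling back to a default when the header is absent
// or not a number.
export function retryAfterMs(header: string | null, fallbackMs = 2000): number {
  if (header === null || header.trim() === "") return fallbackMs;
  const seconds = Number(header);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : fallbackMs;
}

// Usage: on a 429 response,
// await new Promise(r => setTimeout(r, retryAfterMs(res.headers.get("Retry-After"))));
```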
Alternative: Nginx
If you prefer Nginx (common on Linux):
server {
    listen 11435;

    location / {
        if ($http_x_api_key != "your-secret-key") {
            return 401 "Unauthorized";
        }

        proxy_pass http://127.0.0.1:11434;
        proxy_set_header Host $host;
        proxy_read_timeout 600s;  # Long timeout for LLM generation
    }
}
I chose Caddy because it's a single binary with no dependencies — much simpler on Windows.
Verification Checklist
After setup, verify:
- curl http://your-ip:11434/api/tags returns 401
- curl -H "X-API-Key: YOUR_KEY" http://your-ip:11434/api/tags returns models
- Your web app works with the key in env vars
- Caddy starts on boot (Windows service)
- Logs are recording access attempts
Conclusion
Caddy + API key authentication is the simplest way to secure a public Ollama endpoint. The setup takes 10 minutes and eliminates the biggest risk of self-hosted LLMs: unauthorized access.
The combination of Ollama (free inference) + Caddy (free security) + port forwarding (free remote access) gives you a production-grade LLM API at zero ongoing cost.
This is part of my series on self-hosted AI infrastructure. See also: Running Qwen 3 on RTX 3090 and Ollama vs ChatGPT comparison.