Organized AI · operator's manual

organized-gateway

// the Cloudflare Worker · Hono + KV + D1 + Tailscale proxy · operator's manual for just the bridge

What it is

organized-gateway is a single Cloudflare Worker that proxies /v1/* requests (typically POST) through a per-user rate limiter and a D1 access log to an upstream service of your choice, reached via a Tailscale bridge URL. Built on Hono. ~120 lines of TypeScript. One KV namespace, one D1 database, two secrets, one deploy.

This guide is the focused operator's manual for the Worker as a deliverable. For the broader system context (HICAM workshop, BYOK → Codex → Stripe phases, post-training paths), see openclaw-gateway-guide.

Project shape

organized-gateway/
  apps/
    organized-gateway/
      src/
        index.ts          // Hono app, ~120 lines
      migrations/
        0001_init.sql     // requests table + user_summary view
      wrangler.toml       // CF config — name, main, bindings
  scripts/
    bootstrap.sh          // provision KV + D1 + run migrations
    deploy.sh             // wrangler deploy convenience wrapper
  package.json            // pnpm + wrangler + hono + zod
  CLAUDE.md
  README.md

wrangler.toml

name = "organized-gateway"
main = "src/index.ts"
compatibility_date = "2026-01-01"
account_id = "691fe25d377abac03627d6a88d3eeac9"

[[kv_namespaces]]
binding = "GATEWAY_KV"
id = ""           # wrangler kv namespace create GATEWAY_KV → fill

[[d1_databases]]
binding = "DB"
database_name = "organized-gateway-db"
database_id = ""  # wrangler d1 create organized-gateway-db → fill

[vars]
ENVIRONMENT = "production"

# Set via wrangler secret put:
#   OPENCLAW_URL — Tailscale bridge URL or Cloudflare Tunnel URL
#   AUTH_MODE    — "openai" (BYOK) | "codex" (OAuth)

Secrets

OPENCLAW_URL
  purpose:   Upstream URL the Worker proxies to. Tailscale bridge or Cloudflare Tunnel pointing at your origin.
  set with:  echo "https://..." | wrangler secret put OPENCLAW_URL

AUTH_MODE
  purpose:   "openai" = pass-through Authorization header. "codex" = resolve OAuth from KV by user id.
  set with:  echo "openai" | wrangler secret put AUTH_MODE
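
Because the Worker concatenates OPENCLAW_URL with the request path, a malformed value or a trailing slash only surfaces at request time. A minimal sketch of a pre-flight check, assuming a hypothetical validateSecrets helper that is not part of the Worker source:

```typescript
// Hypothetical pre-flight check; validateSecrets is not part of the Worker source.
type SecretValues = { OPENCLAW_URL?: string; AUTH_MODE?: string }

function validateSecrets(env: SecretValues): string[] {
  const problems: string[] = []
  const raw = env.OPENCLAW_URL ?? ''
  try {
    const url = new URL(raw) // throws on empty or malformed input
    if (url.protocol !== 'https:' && url.protocol !== 'http:') {
      problems.push('OPENCLAW_URL must be an http(s) URL')
    }
    // The Worker builds `${OPENCLAW_URL}${pathname}`, so a trailing slash yields "//v1/...".
    if (raw.endsWith('/')) problems.push('OPENCLAW_URL should not end with "/"')
  } catch {
    problems.push('OPENCLAW_URL is missing or not a valid URL')
  }
  if (env.AUTH_MODE !== 'openai' && env.AUTH_MODE !== 'codex') {
    problems.push('AUTH_MODE must be "openai" or "codex"')
  }
  return problems
}
```

An empty result means both values look usable. One option would be to call this at the top of the /v1/* handler and answer 500 with the problem list, though the Worker as written does no such check.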

Bindings

binding | type | purpose
--------|------|--------
GATEWAY_KV | KV | Per-user rate counters (rate:{user_id}:{minute}), Phase 2 OAuth tokens (oauth:{user_id}), Phase 3 tier flags (tier:{user_id}).
DB | D1 | Access log (requests) + summary view (user_summary). Source of truth for usage analytics.
OPENCLAW_URL | secret | Upstream proxy target.
AUTH_MODE | secret | Auth dispatch: BYOK vs OAuth.
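
The three KV key formats share one namespace, so building them in a single place keeps the prefixes from drifting apart. A small sketch, assuming a hypothetical kvKeys helper (only the key formats themselves come from the table above):

```typescript
// Hypothetical key helpers; only the key formats come from the bindings table.
const kvKeys = {
  // One counter per user per minute; the minute is epoch milliseconds / 60 000, floored.
  rate: (userId: string, nowMs: number): string =>
    `rate:${userId}:${Math.floor(nowMs / 60_000)}`,
  oauth: (userId: string): string => `oauth:${userId}`, // Phase 2 OAuth token
  tier: (userId: string): string => `tier:${userId}`, // Phase 3 tier flag
}

console.log(kvKeys.rate('u1', 125_000)) // rate:u1:2
console.log(kvKeys.oauth('u1')) // oauth:u1
```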

The Worker — full source

// apps/organized-gateway/src/index.ts
import { Hono } from 'hono'

type Env = {
  GATEWAY_KV: KVNamespace
  DB: D1Database
  OPENCLAW_URL: string
  AUTH_MODE: 'openai' | 'codex'
}

type Entry = {
  user_id: string
  endpoint: string
  status: number
  latency_ms: number
  tokens_est?: number
  ip_hash: string
}

const app = new Hono<{ Bindings: Env }>()

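// SHA-256 the client IP and keep only the first 4 bytes (8 hex characters),
// so the access log never stores a raw address.
async function hashIp(ip: string): Promise<string> {
  const buf = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(ip))
  return Array.from(new Uint8Array(buf))
    .slice(0, 4)
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('')
}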

async function logRequest(env: Env, e: Entry): Promise<void> {
  await env.DB.prepare(
    `INSERT INTO requests
       (user_id, endpoint, status, latency_ms, tokens_est, ip_hash, created_at)
     VALUES (?, ?, ?, ?, ?, ?, datetime('now'))`,
  ).bind(
    e.user_id, e.endpoint, e.status, e.latency_ms,
    e.tokens_est ?? 0, e.ip_hash,
  ).run()
}

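// Fixed-window limiter: one KV counter per user per minute, 50 requests each.
// KV has no atomic increment, so concurrent requests can race on the
// read-then-write below and slightly overshoot; treat the limit as approximate.
async function checkRateLimit(env: Env, userId: string): Promise<boolean> {
  const minute = Math.floor(Date.now() / 60_000)
  const key = `rate:${userId}:${minute}`
  const current = parseInt((await env.GATEWAY_KV.get(key)) ?? '0', 10)
  if (current >= 50) return false
  await env.GATEWAY_KV.put(key, String(current + 1), { expirationTtl: 120 })
  return true
}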

app.all('/v1/*', async (c) => {
  const start = Date.now()
  const userId = c.req.header('X-User-ID') ?? 'anonymous'
  const authHeader = c.req.header('Authorization') ?? ''
  const ipHash = await hashIp(c.req.header('CF-Connecting-IP') ?? '')
  const endpoint = new URL(c.req.url).pathname

  if (!(await checkRateLimit(c.env, userId))) {
    await logRequest(c.env, {
      user_id: userId, endpoint, status: 429, latency_ms: 0, ip_hash: ipHash,
    })
    return c.json({ error: 'Rate limit exceeded. 50 requests/minute.' }, 429)
  }

  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    'X-Forwarded-User': userId,
  }
  if (c.env.AUTH_MODE === 'codex') {
    const token = await c.env.GATEWAY_KV.get(`oauth:${userId}`)
    if (!token) return c.json({ error: 'No Codex session.' }, 401)
    headers['X-Codex-Token'] = token
  } else {
    headers['Authorization'] = authHeader
  }

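  // Forward the query string as well; GET and HEAD requests must not carry a body.
  const { search } = new URL(c.req.url)
  const hasBody = c.req.method !== 'GET' && c.req.method !== 'HEAD'
  const upstream = await fetch(`${c.env.OPENCLAW_URL}${endpoint}${search}`, {
    method: c.req.method,
    headers,
    body: hasBody ? await c.req.text() : undefined,
  })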

  const body = await upstream.text()
  let tokens = 0
  try { tokens = JSON.parse(body)?.usage?.total_tokens ?? 0 } catch {}

  await logRequest(c.env, {
    user_id: userId, endpoint, status: upstream.status,
    latency_ms: Date.now() - start, tokens_est: tokens, ip_hash: ipHash,
  })

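  // Pass the upstream Content-Type through; fall back to JSON if it is absent.
  return new Response(body, {
    status: upstream.status,
    headers: {
      'Content-Type': upstream.headers.get('Content-Type') ?? 'application/json',
    },
  })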
})

app.get('/health', (c) =>
  c.json({ status: 'ok', gateway: 'organized-gateway' }),
)

export default app
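
The fixed-window logic in checkRateLimit can be exercised without a Cloudflare account. The sketch below swaps KV for an in-memory Map and injects the clock; allowRequest, LIMIT_PER_MINUTE, and the Map store are hypothetical stand-ins, not part of the Worker.

```typescript
// In-memory stand-in for the KV counter so the window logic can be run locally.
// allowRequest and the Map store are hypothetical; the Worker uses GATEWAY_KV.
const LIMIT_PER_MINUTE = 50

function allowRequest(
  store: Map<string, number>,
  userId: string,
  nowMs: number,
  limit: number = LIMIT_PER_MINUTE,
): boolean {
  const key = `rate:${userId}:${Math.floor(nowMs / 60_000)}`
  const current = store.get(key) ?? 0
  if (current >= limit) return false // window exhausted; the caller responds 429
  store.set(key, current + 1)
  return true
}

// 51 calls inside one minute: the first 50 pass, the last is rejected.
const demoStore = new Map<string, number>()
const t0 = 1_700_000_000_000
const results = Array.from({ length: 51 }, () => allowRequest(demoStore, 'u', t0))
console.log(results.filter(Boolean).length) // 50
console.log(allowRequest(demoStore, 'u', t0 + 60_000)) // true: a new minute is a fresh window
```

Unlike KV with expirationTtl, the Map never evicts old windows, so this is only suitable for local experiments. A fixed window also lets a user send up to twice the limit across a minute boundary, which may or may not matter for your upstream.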

Request flow

   client                    organized-gateway                     OPENCLAW_URL
   ──────                    ─────────────────                     ────────────

   POST /v1/chat/...
   X-User-ID: u
   Authorization: Bearer sk-…
        │
        ▼
                              extract user_id, auth, ip
                                       │
                                       ▼
                              KV  rate:{u}:{m}  ≥ 50 ?  ──── 429 ─►  client
                                       │ no              (D1 INSERT requests, status 429)
                                       ▼
                              build upstream headers
                              (Authorization OR X-Codex-Token)
                                       │
                                       ▼
                              fetch  ──────────────────────────►   OPENCLAW
                                                                   :18789
                                       ◄──────────────────────────  response
                                       │
                                       ▼
                              D1  INSERT requests (status, latency, tokens)
                                       │
                                       ▼
                              return Response(body, status)
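
The "build upstream headers" step is the only place where AUTH_MODE changes behavior. A pure-function sketch of that branch, assuming a hypothetical buildUpstreamHeaders helper (the Worker does this inline and reads the token from KV):

```typescript
type AuthMode = 'openai' | 'codex'

// Hypothetical pure version of the header step; returns null where the Worker answers 401.
function buildUpstreamHeaders(
  mode: AuthMode,
  userId: string,
  authHeader: string,
  codexToken: string | null,
): Record<string, string> | null {
  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    'X-Forwarded-User': userId,
  }
  if (mode === 'codex') {
    if (!codexToken) return null // no oauth:{user_id} entry in KV
    headers['X-Codex-Token'] = codexToken
  } else {
    headers['Authorization'] = authHeader // BYOK: forwarded unchanged
  }
  return headers
}
```

Keeping this step free of I/O makes it easy to unit-test both modes. Note that in "codex" mode the client's Authorization header is dropped rather than forwarded.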

Endpoints

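method · path | headers in | behavior
--------------|------------|---------
* /v1/* | X-User-ID (req; the Worker falls back to "anonymous" if it is missing), Authorization (Phase 1), CF-Connecting-IP | Rate-limit check (over the limit: log, return 429) → proxy → log → return the upstream body with its status code.
GET /health | none | Liveness. Returns {"status":"ok","gateway":"organized-gateway"}.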
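
From a client's point of view the /v1/* contract is just three headers and a JSON body. A sketch of a request builder, assuming a hypothetical gatewayRequest helper and placeholder values for the base URL, user id, and API key:

```typescript
// Hypothetical client helper; baseUrl, userId, and apiKey are placeholders.
function gatewayRequest(baseUrl: string, userId: string, apiKey: string, payload: unknown) {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/v1/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-User-ID': userId, // rate-limit and logging key
        'Authorization': `Bearer ${apiKey}`, // forwarded upstream when AUTH_MODE is "openai"
      },
      body: JSON.stringify(payload),
    },
  }
}

const { url, init } = gatewayRequest('https://gateway.example', 'smoke-test', 'sk-your-key', {
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
})
// Send with: const res = await fetch(url, init)
```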

Deploy — three commands

1. Bootstrap (once)

CLOUDFLARE_ACCOUNT_ID=691fe25d377abac03627d6a88d3eeac9 \
  bash scripts/bootstrap.sh

# What it creates:
#   wrangler kv namespace create GATEWAY_KV    → write id into wrangler.toml
#   wrangler d1 create organized-gateway-db    → write database_id into wrangler.toml
#   wrangler d1 execute organized-gateway-db --file=migrations/0001_init.sql

2. Set secrets

echo "https://your-tailscale-bridge" | wrangler secret put OPENCLAW_URL
echo "openai" | wrangler secret put AUTH_MODE

3. Deploy

CLOUDFLARE_ACCOUNT_ID=691fe25d377abac03627d6a88d3eeac9 \
  wrangler deploy \
    --name organized-gateway \
    --config apps/organized-gateway/wrangler.toml

Smoke test

# 1. health
curl https://organized-gateway.<account-subdomain>.workers.dev/health
# → {"status":"ok","gateway":"organized-gateway"}

# 2. one real request through the full pipe
curl -X POST https://organized-gateway.<account-subdomain>.workers.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-User-ID: smoke-test" \
  -H "Authorization: Bearer sk-your-key" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'

# 3. confirm row landed in D1
wrangler d1 execute organized-gateway-db --remote \
  --command "SELECT * FROM requests ORDER BY id DESC LIMIT 1"

Live monitoring

Tail logs

wrangler tail organized-gateway --format=pretty
# filter by status: --status=error
# sample rate:    --sampling-rate=0.1

Per-user activity (last 5 minutes)

watch -n5 "wrangler d1 execute organized-gateway-db --remote \
  --command \"SELECT user_id, count(*) AS n, avg(latency_ms) AS ms
             FROM requests
             WHERE created_at > datetime('now','-5 minutes')
             GROUP BY user_id ORDER BY n DESC\""

Rate-limit hits

wrangler d1 execute organized-gateway-db --remote \
  --command "SELECT user_id, count(*) FROM requests
             WHERE status = 429 GROUP BY user_id ORDER BY 2 DESC"

Scaling beyond 50 concurrent

knob | change | effect
-----|--------|-------
per-user rate | 50 → 30 | Lowers the per-user ceiling by 40%, reducing peak load on the M4 Mini. Safe for 75–100 concurrent users.
KV TTL | 120s → 180s | Tolerates briefly higher KV write contention; minor staleness.
upstream | Hetzner CX41/AX41 | Swap OPENCLAW_URL to a bigger origin; no Worker change.
D1 archival | weekly to R2 | Avoid the 5M rows-read/day free-tier cap on long-running events.
multi-region D1 | read replicas | Latency improvement for global users; not needed for a single-event setup.
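
A rough way to sanity-check the per-user rate knob is to multiply concurrent users by the per-user limit. This is a worst-case bound that assumes every user saturates their limit in the same minute, which real traffic rarely does; worstCaseRequestsPerMinute is a hypothetical helper for illustration only.

```typescript
// Worst case: every user sends at their full per-minute limit at the same time.
function worstCaseRequestsPerMinute(users: number, perUserLimit: number): number {
  return users * perUserLimit
}

console.log(worstCaseRequestsPerMinute(50, 50)) // 2500
console.log(worstCaseRequestsPerMinute(75, 30)) // 2250
console.log(worstCaseRequestsPerMinute(100, 30)) // 3000
```

At 75 users the lowered limit stays under the original 50 × 50 ceiling. At 100 users the worst case exceeds it, so that end of the range relies on users not all peaking at once.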

TypeScript types

# Generate the Env interface from wrangler.toml bindings
wrangler types
# → writes worker-configuration.d.ts

# Re-run after editing wrangler.toml — types drift otherwise

Pitfalls