
Cloudflare AI Specification

Overview

Cloudflare AI is split into two Workers:

  • ai-runner: orchestration and context loading
  • ai-agent: direct model execution and prompt management

This split is intentional: the runner loads context once and fans out work, while the agent owns prompt-backed model calls and part of the admin prompt API surface.

Worker Inventory

cloudflare/ai-runner

Purpose:

  • orchestration entrypoint for the extraction, no-intent, yes-intent, and response paths
  • shared context loading from Supabase
  • proxy/admin route pass-through to ai-agent

Key routes found in code:

  • /extract
  • /no-intent
  • /yes-intent
  • /response
  • /response-health
  • /admin/prompts/meta
  • /admin/prompts/read
  • /admin/prompts/update

cloudflare/ai-agent

Purpose:

  • direct prompt/model execution against Cloudflare AI
  • prompt KV read/write APIs
  • worker endpoints for extracted data, intent, and response generation

Key routes found in code:

  • /summary
  • /vehicle
  • /lead
  • /response
  • /intent
  • /no-intent
  • /admin/prompts/meta
  • /admin/prompts/read
  • /admin/prompts/update
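The admin prompt routes appear on both workers because the runner passes them through to the agent. A minimal sketch of that pass-through, assuming a Workers service binding to ai-agent (the binding name AGENT is illustrative, not confirmed from code):

```typescript
// Sketch of the runner's admin pass-through. The AGENT service binding
// name is an assumption; the real binding name may differ.
interface ServiceBinding {
  fetch(request: Request): Promise<Response>;
}

function isAdminPromptRoute(path: string): boolean {
  // Matches the three admin routes listed above.
  return ["/admin/prompts/meta", "/admin/prompts/read", "/admin/prompts/update"].includes(path);
}

async function handle(request: Request, env: { AGENT: ServiceBinding }): Promise<Response> {
  const path = new URL(request.url).pathname;
  if (isAdminPromptRoute(path)) {
    // Forward the request unchanged to ai-agent.
    return env.AGENT.fetch(request);
  }
  return new Response("not found", { status: 404 });
}
```

Forwarding the original Request object keeps headers and body intact, which is what makes the two workers' admin route lists identical.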

Prompt System

  • Prompt text is loaded from the SYSTEM_PROMPTS KV binding
  • Canonical prompt keys include:
      • prompt-convo-responder
      • prompt-vehicle-data
      • prompt-lead-data
      • prompt-convo-intent
      • prompt-convo-no
      • prompt-convo-yes
      • prompt-lead-evaluator
  • Supabase admin-ai-settings stores prompt drafts, while Cloudflare KV stores the published runtime prompts
  • AdminAiSettings.tsx is the actual admin UI for draft/save/publish behavior
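A minimal sketch of the runtime prompt lookup, assuming the standard Workers KV `get()` API on the SYSTEM_PROMPTS binding; the in-memory map below stands in for KV so the sketch runs anywhere:

```typescript
// Stand-in for the Workers KV binding; only get() is needed here.
interface KVNamespace {
  get(key: string): Promise<string | null>;
}

// Illustrative published prompts keyed by the canonical names above.
const kvData = new Map<string, string>([
  ["prompt-convo-responder", "You are a conversational responder for vehicle leads."],
]);

const SYSTEM_PROMPTS: KVNamespace = {
  async get(key: string) {
    return kvData.get(key) ?? null;
  },
};

// Fail loudly when a prompt key was never published to KV.
async function loadPrompt(env: { SYSTEM_PROMPTS: KVNamespace }, key: string): Promise<string> {
  const text = await env.SYSTEM_PROMPTS.get(key);
  if (text === null) throw new Error(`prompt not published: ${key}`);
  return text;
}
```

Failing fast on a missing key surfaces a draft that was saved in admin-ai-settings but never published, which is the gap the draft/publish split can otherwise hide.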

Model

  • ai-agent currently targets @cf/meta/llama-4-scout-17b-16e-instruct
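A hedged sketch of how ai-agent might invoke that model through the Workers AI binding's `run(model, input)` call. The fake binding below echoes its input so the sketch runs outside the Workers runtime; the real binding is provided by the platform:

```typescript
const MODEL = "@cf/meta/llama-4-scout-17b-16e-instruct";

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Minimal shape of the Workers AI binding as used here.
interface AiBinding {
  run(model: string, input: { messages: ChatMessage[] }): Promise<{ response?: string }>;
}

async function generate(env: { AI: AiBinding }, systemPrompt: string, userText: string): Promise<string> {
  const result = await env.AI.run(MODEL, {
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userText },
    ],
  });
  return result.response ?? "";
}

// Fake binding for local runs: reports which model was called.
const fakeAI: AiBinding = {
  async run(model, input) {
    return { response: `[${model}] saw ${input.messages.length} messages` };
  },
};
```

Pinning the model id in one constant keeps a future model swap to a single-line change.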

Main Flows

Extraction flow

  1. Caller sends thread_id and lead_id to ai-runner /extract
  2. Runner fetches context from Supabase
  3. Runner calls ai-agent /vehicle and /lead in parallel
  4. Runner then calls ai-agent /summary
  5. Results are written back through Supabase callback/update endpoints
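Steps 3 and 4 above can be sketched as follows; callAgent is a hypothetical helper standing in for the runner's call (service binding or fetch) into ai-agent:

```typescript
// Hypothetical helper type: one call into an ai-agent route.
type AgentCall = (route: string, body: unknown) => Promise<unknown>;

// /vehicle and /lead run in parallel, then /summary runs with both
// results in hand.
async function runExtraction(
  callAgent: AgentCall,
  ctx: { thread_id: string; lead_id: string },
) {
  const [vehicle, lead] = await Promise.all([
    callAgent("/vehicle", ctx),
    callAgent("/lead", ctx),
  ]);
  const summary = await callAgent("/summary", { ...ctx, vehicle, lead });
  return { vehicle, lead, summary };
}
```

Feeding the vehicle and lead results into the /summary call is an assumption about the payload; the source only states the ordering.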

No-intent flow

  1. A Supabase function or internal caller posts to ai-runner /no-intent
  2. Runner builds the recent conversation window
  3. Runner calls ai-agent /no-intent
  4. Closeout behavior in Supabase is applied by the ai-no-intent-update-from-agent function
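Step 2's "recent conversation window" can be sketched as a last-N-messages slice; the actual windowing rule (size, filtering) is not specified here and is assumed:

```typescript
interface ThreadMessage {
  role: "lead" | "agent";
  body: string;
  sent_at: string; // ISO timestamp; input array assumed oldest-first
}

// Keep only the most recent `limit` messages (assumed policy).
function recentWindow(messages: ThreadMessage[], limit = 10): ThreadMessage[] {
  return messages.slice(-limit);
}
```

Bounding the window keeps the no-intent classification prompt small and its token cost predictable regardless of thread length.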

Yes-intent flow

  1. Message ingestion determines the yes-intent gate is open
  2. Runner evaluates the conversation through the yes-intent path
  3. The ai-yes-intent-update-from-agent function promotes lead/opportunity/thread state when detection is positive

Response flow

  1. Caller posts to ai-runner /response
  2. Runner assembles thread, lead, AI-convo, and system context
  3. Runner forwards to ai-agent /response
  4. The generated response is returned to the caller; the Worker does not autonomously send it outbound
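The return-to-caller behavior in step 4 can be sketched as below; callAgent is again a hypothetical helper for the runner-to-agent hop, and the `draft` field name is illustrative:

```typescript
// Hypothetical helper: forward assembled context to ai-agent /response.
type AgentCall = (route: string, body: unknown) => Promise<{ draft: string }>;

async function handleResponse(
  callAgent: AgentCall,
  context: { thread: unknown; lead: unknown; aiConvo: unknown; system: unknown },
): Promise<Response> {
  const { draft } = await callAgent("/response", context);
  // Return the draft in the HTTP response; no outbound send happens here.
  return new Response(JSON.stringify({ draft }), {
    headers: { "content-type": "application/json" },
  });
}
```

Keeping the Worker out of the outbound send means a human or downstream workflow decides whether the draft actually reaches the lead.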

Supabase Coupling

Cloudflare AI depends on Supabase for:

  • thread, lead, vehicle, address, and opportunity context
  • AI conversation thread persistence
  • downstream update callbacks
  • global settings gates used by message and admin workflows

UX Coupling

  • Admin UI exposes prompt and toggle management in /admin
  • Message Center consumes the outputs indirectly through thread state, summaries, extraction, and AI review workflows
  • The AI assistant modal/provider is a separate in-app assistant surface; the main production workflow automation currently centers on Message Center and admin-managed prompts

Assessment

Strengths

  • clear split between orchestration and model execution
  • prompt publishing path exists and is operator-accessible
  • AI updates are system-integrated instead of living only in frontend state

Constraints

  • behavior is distributed across Cloudflare Workers, Supabase callbacks, and message ingestion logic, so reasoning about end-to-end flow requires reading multiple runtimes
  • success/failure semantics vary by route: some are synchronous decision endpoints, others are async extraction/orchestration paths