# Cloudflare AI Specification
## Overview
Cloudflare AI is split into two Workers:
- ai-runner: orchestration and context loading
- ai-agent: direct model execution and prompt management
This split is intentional. Runner loads context once and fans out work; agent owns prompt-backed model calls and some admin prompt APIs.
## Worker Inventory
### cloudflare/ai-runner
Purpose:
- orchestration entrypoint for extraction, no-intent, yes-intent, and response paths
- shared context loading from Supabase
- proxy/admin route pass-through to ai-agent
Key routes found in code:
- /extract
- /no-intent
- /yes-intent
- /response
- /response-health
- /admin/prompts/meta
- /admin/prompts/read
- /admin/prompts/update
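The route inventory above can be sketched as a small dispatch helper: the orchestration routes are handled locally while `/admin/prompts/*` is passed through to `ai-agent`. This is an illustrative sketch, not the actual runner code; the `routeTarget` name and the `RouteTarget` type are assumptions.

```typescript
// Illustrative sketch of ai-runner's route split: orchestration routes are
// handled locally, /admin/prompts/* is proxied to ai-agent.
type RouteTarget = "runner" | "agent" | "not-found";

// Routes the runner handles directly (from the route list above).
const RUNNER_ROUTES = new Set([
  "/extract",
  "/no-intent",
  "/yes-intent",
  "/response",
  "/response-health",
]);

function routeTarget(pathname: string): RouteTarget {
  // Admin prompt routes are passed through to ai-agent unchanged.
  if (pathname.startsWith("/admin/prompts/")) return "agent";
  if (RUNNER_ROUTES.has(pathname)) return "runner";
  return "not-found";
}
```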
### cloudflare/ai-agent
Purpose:
- direct prompt/model execution against Cloudflare AI
- prompt KV read/write APIs
- worker endpoints for extracted data, intent, and response generation
Key routes found in code:
- /summary
- /vehicle
- /lead
- /response
- /intent
- /no-intent
- /admin/prompts/meta
- /admin/prompts/read
- /admin/prompts/update
## Prompt System
- Prompt text is loaded from the `SYSTEM_PROMPTS` KV binding
- Canonical prompt keys include: `prompt-convo-responder`, `prompt-vehicle-data`, `prompt-lead-data`, `prompt-convo-intent`, `prompt-convo-no`, `prompt-convo-yes`, `prompt-lead-evaluator`
- Supabase `admin-ai-settings` stores prompt drafts, while Cloudflare KV stores the published runtime prompts
- `AdminAiSettings.tsx` is the admin UI for draft/save/publish behavior
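A minimal sketch of the runtime read side, assuming the standard Workers KV `get` API on the `SYSTEM_PROMPTS` binding; the `loadPrompt` helper and its error behavior are illustrative, not taken from the code:

```typescript
// Sketch of loading a published prompt from the SYSTEM_PROMPTS KV binding.
// KVNamespace is simplified here to just the string-get method used below.
interface KVNamespace {
  get(key: string): Promise<string | null>;
}

async function loadPrompt(kv: KVNamespace, key: string): Promise<string> {
  // KV holds the *published* prompt; drafts live in Supabase admin-ai-settings.
  const text = await kv.get(key);
  if (text === null) {
    throw new Error(`prompt not published: ${key}`);
  }
  return text;
}
```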
## Model
`ai-agent` currently targets `@cf/meta/llama-4-scout-17b-16e-instruct`.
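A hedged sketch of what an `ai-agent` model call could look like against that model id; the `Ai` interface is a simplified stand-in for the Workers AI binding, and `runChat` is an assumed helper name, not the agent's actual code:

```typescript
// Sketch of a chat-style call to the model targeted by ai-agent.
const MODEL = "@cf/meta/llama-4-scout-17b-16e-instruct";

// Simplified stand-in for the Workers AI binding's run() method.
interface Ai {
  run(
    model: string,
    input: { messages: { role: string; content: string }[] },
  ): Promise<{ response?: string }>;
}

async function runChat(ai: Ai, system: string, user: string): Promise<string> {
  const out = await ai.run(MODEL, {
    messages: [
      { role: "system", content: system }, // KV-published prompt text
      { role: "user", content: user },
    ],
  });
  return out.response ?? "";
}
```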
## Main Flows
### Extraction flow
- Caller sends `thread_id` and `lead_id` to `ai-runner /extract`
- Runner fetches context from Supabase
- Runner calls `ai-agent /vehicle` and `/lead` in parallel
- Runner then calls `ai-agent /summary`
- Results are written back through Supabase callback/update endpoints
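The fan-out/then-summarize ordering above can be sketched with `Promise.all`; `callAgent`, `runExtraction`, and the payload shape are assumptions for illustration:

```typescript
// Sketch of the extraction fan-out: /vehicle and /lead run in parallel,
// /summary runs after both complete.
type AgentCall = (route: string, payload: unknown) => Promise<unknown>;

async function runExtraction(
  callAgent: AgentCall,
  ctx: { thread_id: string; lead_id: string },
) {
  // Independent extractions fan out in parallel.
  const [vehicle, lead] = await Promise.all([
    callAgent("/vehicle", ctx),
    callAgent("/lead", ctx),
  ]);
  // Summary is sequenced after both extractions.
  const summary = await callAgent("/summary", { ...ctx, vehicle, lead });
  return { vehicle, lead, summary };
}
```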
### No-intent flow
- A Supabase function or internal caller posts to `ai-runner /no-intent`
- Runner builds the recent conversation window
- Runner calls `ai-agent /no-intent`
- Supabase closeout behavior is applied by `ai-no-intent-update-from-agent`
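The "recent conversation window" step can be sketched as below; the `Msg` shape, the default window size, and the `recentWindow` helper are illustrative assumptions, not values taken from the Worker:

```typescript
// Sketch of building a recent conversation window for the no-intent check.
interface Msg {
  role: "lead" | "agent";
  body: string;
  sent_at: string; // ISO timestamp, sortable lexicographically
}

function recentWindow(messages: Msg[], limit = 10): Msg[] {
  // Keep only the newest `limit` messages, in chronological order.
  return [...messages]
    .sort((a, b) => a.sent_at.localeCompare(b.sent_at))
    .slice(-limit);
}
```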
### Yes-intent flow
- Message ingestion determines the yes-intent gate is open
- Runner evaluates the conversation through the yes-intent path
- `ai-yes-intent-update-from-agent` promotes lead/opportunity/thread state when detection is positive
### Response flow
- Caller posts to `ai-runner /response`
- Runner assembles thread, lead, AI-convo, and system context
- Runner forwards to `ai-agent /response`
- The generated response is returned to the caller; the Worker does not send it outbound on its own
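The response path above can be sketched as a forward-and-return helper; `handleResponse`, the `Fetcher` type, and the `sent: false` marker are illustrative assumptions that make the "no autonomous outbound send" property explicit:

```typescript
// Sketch of the response flow: forward assembled context to ai-agent
// /response and hand the draft back to the caller unsent.
type Fetcher = (route: string, body: unknown) => Promise<{ draft: string }>;

async function handleResponse(
  agent: Fetcher,
  context: { thread: unknown; lead: unknown; aiConvo: unknown; system: string },
): Promise<{ draft: string; sent: false }> {
  const { draft } = await agent("/response", context);
  // The draft is returned to the caller; sending is the caller's decision.
  return { draft, sent: false };
}
```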
## Supabase Coupling
Cloudflare AI depends on Supabase for:
- thread, lead, vehicle, address, and opportunity context
- AI conversation thread persistence
- downstream update callbacks
- global settings gates used by message and admin workflows
## UX Coupling
- The admin UI exposes prompt and toggle management in `/admin`
- Message Center consumes the outputs indirectly through thread state, summaries, extraction, and AI review workflows
- The AI assistant modal/provider is a separate in-app assistant surface, but the main production workflow automation currently centers on Message Center and admin-managed prompts
## Assessment
### Strengths
- clear split between orchestration and model execution
- prompt publishing path exists and is operator-accessible
- AI updates are system-integrated instead of living only in frontend state
### Constraints
- behavior is distributed across Cloudflare Workers, Supabase callbacks, and message ingestion logic, so reasoning about end-to-end flow requires reading multiple runtimes
- success/failure semantics vary by route: some are synchronous decision endpoints, others are async extraction/orchestration paths