00 TL;DR
There are two prompt systems. The Assistant (Vernis) gets a rich, XML-tagged, capability-derived prompt. Every other agent — including Text Reply — gets only persona + goals + skill bodies + a bare tool-name list.
Operating rules that should be runtime-owned (how to deliver, how to read events, how to ground) are today either absent or smuggled into the operator-editable persona — where they can drift or be deleted.
The data model already anticipates layering: the prompt-capture struct has empty platform_rules / org_rules / agent_rules / learning slots. The scaffold exists; it's just unwired.
The DELEGATED_SUBAGENT_BLOCK is already the exact pattern the send-mandate needs: a runtime-injected, conditional block that says "plain replies aren't delivered — to act you must call tool X."
Introduce a platform core block + a capability/delivery block + per-event source framing. These satisfy #1476's acceptance criteria and lay the foundation for the rest of the model.
"Mandatory system-wide skills" = a new applicability scope on skills (platform / org / capability-matched), injected regardless of per-agent links — rendered as an operating-policy block, not mixed with opt-in skills.
01 How prompts are built today
One generic agent turn assembles a single system-prompt string, then passes the user/contact input as a separate message role. Here is the literal shape produced by build_system_prompt().
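The original rendering of that shape is not reproduced here. As a rough stand-in, the flat assembly can be sketched from the component list in §02 (persona, goals, optional delegated block, skill bodies, invokable-agent roster, tool names). The struct, the function name, and the "Invokable agents:" / "Tools:" labels are hypothetical; only the ordering and the `## {title}` skill heading come from this doc.

```rust
/// Hypothetical inputs to the generic prompt builder; field names are illustrative.
pub struct GenericPromptInputs<'a> {
    pub persona: &'a str,
    pub goals: &'a str,
    /// Present only when the agent runs as a delegated sub-agent.
    pub delegated_block: Option<&'a str>,
    /// (title, body_markdown) pairs for the agent's opt-in skills.
    pub skills: &'a [(&'a str, &'a str)],
    pub invokable_agents: &'a [&'a str],
    pub tool_names: &'a [&'a str],
}

/// Joins every section into one flat system-prompt string.
/// Note what is absent: no runtime identity, no delivery contract, no grounding rule.
pub fn build_flat_prompt(input: &GenericPromptInputs) -> String {
    let mut sections: Vec<String> = vec![input.persona.to_string(), input.goals.to_string()];
    if let Some(block) = input.delegated_block {
        sections.push(block.to_string());
    }
    for (title, body) in input.skills {
        sections.push(format!("## {title}\n{body}"));
    }
    if !input.invokable_agents.is_empty() {
        // Label text is an assumption, not the real wording.
        sections.push(format!("Invokable agents: {}", input.invokable_agents.join(", ")));
    }
    if !input.tool_names.is_empty() {
        // Names only: descriptions and schemas travel separately.
        sections.push(format!("Tools: {}", input.tool_names.join(", ")));
    }
    sections.join("\n\n")
}
```

The sketch makes the gap visible: everything the model learns about acting comes from operator prose plus a list of names.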
Key properties of the current generic path:
- Persona + goals are interpolated verbatim and are operator-editable. That is correct for voice/brand content, but today it is also where delivery mechanics live (“Use only the tools available to you this turn…”) — runtime concerns trapped in editable prose.
- The tool block is just names. The model is told which tools exist by name; rig forwards the real {name, description, input_schema} separately. There is no narrative of how / when to use them, and crucially no statement that calling a tool is the only way to act.
- Untrusted input is correctly isolated from the system prompt (passed as Message::user), which is good injection hygiene, but the two user sources are flattened to identical bare text.
- The capture struct already has the slots for a layered prompt; they are just empty:
// ai_thread_log_entry_type.rs: placeholder slots are filled with String::new() today
pub struct AiThreadLogEffectivePromptCapture {
    platform_rules: String,          // ← Phase 3 placeholder (empty)
    org_rules: String,               // ← Phase 3 placeholder (empty)
    agent_rules: String,             // ← Phase 3 placeholder (empty)
    persona: String,                 // ← the only operator content wired
    goals: String,
    learning: String,                // ← Phase 8 placeholder (empty)
    pending_lessons: Vec<String>,
    skills: Vec<String>,
    invokable_agents: Vec<String>,
    memory_snapshot: Option<String>, // on-demand via tool
    delegated_context: String,       // ← the one wired runtime block
    thread_context: String,
}
02 The two-track problem
The Assistant already has the layered, capability-aware prompt we want every agent to have. The generic agents don't. The gap between them is the work.
Generic agent (Text Reply & all others)
Flat string, ~1 runtime block:
- persona (operator)
- goals (operator)
- delegated block (if sub-agent)
- skill bodies (opt-in)
- invokable-agent roster
- tool names only
No identity-of-runtime, no rules, no capability narrative, no delivery contract, no grounding directive, no response-format guidance.
Assistant / Vernis exemplar
XML-tagged, capability-derived sections:
- <identity>: static, cacheable
- <session>: user, role, date
- <member_context>: personalization
- <capabilities>: derived from actual tools (read/write)
- <rules>: boundaries + 2-step confirm
- <workflows>: per-tool multi-step
- <response_format> · <error_handling> · <page_context>
Conditional on the live tool set; static-first ordering for KV-cache; untrusted fields sanitized.
The lesson the Assistant already teaches us
The good prompt is composed from sections the runtime owns, conditioned on the agent's real capabilities, and keeps operator-authored content as one section among many. We don't need to invent the pattern — we need to generalize the Vernis pattern down to a lightweight core every agent gets, and stop relying on persona prose to carry runtime rules.
03 Prompting best practices we should follow
Drawn from Anthropic's Claude 4.x guidance and the project's own add-rig-tool standard. These are the principles the layered model is designed to satisfy.
| Principle | What it means for agent prompts |
|---|---|
| Role & identity first | Open with who the runtime is and the frame it operates in. Claude 4.x anchors strongly on a clear role. Today only the operator persona does this; the runtime adds nothing. |
| Be explicit & literal | 4.x follows instructions literally and under-acts when something is left implicit. "To reply, you must call send_sms" must be stated — the model will not infer that plain prose is undelivered. (This is gap #1.) |
| Structure with delimiters | XML/markdown sections (<rules>, ## …) make boundaries legible and let the model address a section by name. The Vernis prompt already does this; the generic path should too. |
| Lead with the trigger | "When X, do Y" outperforms "Y." From add-rig-tool: trigger-first descriptions give measurable lift on Sonnet/Opus 4.x. Applies to operating rules, not just tool docs. |
| Don't shout | CRITICAL: / YOU MUST / ALWAYS over-trigger and degrade instruction-following on 4.6+. Plain conditional guidance is correct. (The current persona is already well-behaved here — keep it that way.) |
| Separate trusted vs untrusted | Operator content is trusted and interpolated; the contact's message is untrusted and must never read as an instruction. Today's runtime isolates it into a user role (good) but doesn't label its source — so the model can't tell operator steering from a contact message. (Gap #2.) |
| Stable prefix for caching | Put static content first (identity, rules), volatile last (per-turn context). Deterministic tool order. Keeps the prompt-cache / KV-cache warm across a conversation — the Vernis prompt is explicitly built static-first. |
| Tools are the only way to act | An agent's effect on the world happens through tool calls. The prompt must make the action surface explicit and tell the model that narration ≠ action. The DELEGATED_SUBAGENT_BLOCK already says exactly this for sub-agents. |
| Ground before acting | Instruct the model to gather context via read tools before composing or acting, and to never fabricate. (Gap #3 — and the persona's "never invent facts" is the operator half of this.) |
| Capability-conditioned | Tell the model what it can do based on the tools actually present this turn — not a fixed list that may not match. Vernis derives <capabilities> from the live tool set; generic agents should derive their delivery instruction from send_mode + the real tool set. |
04 The layered instruction model
Five tiers (plus a Phase-3 rules layer). Each has exactly one owner and one injection point. The rule for any new instruction: find its tier, and that tells you who writes it and where it goes.
Tier 0: Runtime operating contract (new)
Non-negotiable framing every (domain) agent gets, regardless of persona: who the runtime is, "plain text is reasoning — to act you call a tool," the operator≠contact contract, ground-before-acting, injection framing, safety/escalation defaults.
Tier 0.5: Policy guardrails (Phase 3)
Org-wide operating policy (org_rules) and per-agent hard guardrails (agent_rules) that sit above persona but below platform. Slots already exist in the capture struct; wiring is deferred to Phase 3.
Tier 1: Capability-derived delivery block (new)
Derived from the actual tool set + send_mode at turn time. "You can send → reply by calling send_sms (curried to the contact's line)." vs "Draft-only → produce options via propose_sms_replies." Mirrors Vernis <capabilities>. This is where the send-mandate belongs — not in editable persona.
Tier 2: Operator-authored persona + goals (exists)
The agent's voice, brand, domain, tone, escalation preferences, no-fabrication stance. Operator-editable. Text Reply's "warm, human, SMS-length, never invent facts, escalate on judgement" correctly lives here. We remove the delivery-mechanic lines that belong in Tier 1.
Tier 3: Modular instruction packs (opt-in exists + mandatory new)
Reusable, curated instruction bodies. Two attach modes: (a) per-agent opt-in via ai_agent_skill_link — exists today; (b) mandatory / system-wide — auto-applied to a class of agents with no link. See §6.
Tier 4: Source-framed turn context (partly new)
The volatile tail: history, reply context, on-demand memory, and — new — each event wrapped in a source envelope: [Incoming SMS from {contact} <{e164}>] … vs [Operator instruction — not the contact] …. Sanitized. This is where the operator/contact distinction becomes actionable (gap #2). Passed as message roles, never the system prompt.
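A minimal sketch of the source envelope described for the volatile tail, assuming a simplified two-variant payload. `EventPayload` here is a cut-down stand-in for the real type, `as_framed_text()` is the name floated in C3, and `sanitize_untrusted` is a hypothetical placeholder for the existing sanitize_for_prompt / strip_prompt_newlines helpers, whose signatures this doc does not show.

```rust
/// Simplified stand-in for the thread event payload (the real type has more variants).
pub enum EventPayload {
    InboundSms { contact: String, e164: String, body: String },
    UserMessage { body: String },
}

/// Hypothetical sanitizer: collapses all whitespace (including newlines) to single
/// spaces and swaps square brackets for parentheses, so untrusted text cannot
/// forge a source envelope of its own.
fn sanitize_untrusted(raw: &str) -> String {
    raw.split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .replace('[', "(")
        .replace(']', ")")
}

impl EventPayload {
    /// Prefixes the event with its source so the model can tell who is speaking.
    pub fn as_framed_text(&self) -> String {
        match self {
            EventPayload::InboundSms { contact, e164, body } => format!(
                "[Incoming SMS from {} <{}>] {}",
                sanitize_untrusted(contact),
                sanitize_untrusted(e164),
                sanitize_untrusted(body)
            ),
            // Operator text is trusted, so it is only trimmed.
            EventPayload::UserMessage { body } => {
                format!("[Operator instruction — not the contact] {}", body.trim())
            }
        }
    }
}
```

Neutralizing brackets in the contact body is one possible design choice: it keeps the envelope syntax unforgeable without dropping any of the contact's words.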
Assembled order (proposed)
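The proposed-order diagram is not reproduced in this text. A minimal sketch, assuming system-prompt sections are simply emitted in tier order (Tier 0 through Tier 3, static content first) and that Tier 4 travels as message roles rather than in the system prompt. The `Tier` and `Section` types and the `## heading` rendering are illustrative assumptions.

```rust
/// System-prompt tiers in emission order. Tier 4 (turn context) is deliberately
/// absent because it is passed as message roles, not as system-prompt text.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    PlatformCore,       // Tier 0
    PolicyGuardrails,   // Tier 0.5
    CapabilityDelivery, // Tier 1
    PersonaGoals,       // Tier 2
    Skills,             // Tier 3
}

pub struct Section {
    pub tier: Tier,
    pub heading: &'static str,
    pub body: String,
}

/// Drops empty sections, orders the rest by tier, and joins them.
/// The sort is stable, so sections within one tier keep their input order,
/// which keeps the prefix deterministic for prompt caching.
pub fn assemble_system_prompt(mut sections: Vec<Section>) -> String {
    sections.retain(|s| !s.body.trim().is_empty());
    sections.sort_by_key(|s| s.tier);
    sections
        .iter()
        .map(|s| format!("## {}\n{}", s.heading, s.body.trim()))
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

With this shape, an unwired Phase-3 slot (an empty `PolicyGuardrails` body) simply disappears from the output instead of leaving a blank heading.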
05 Where each instruction lives
The decision table. For each kind of instruction: its tier, who owns it, how it's injected, and whether the operator can edit it. This is the direct answer to "what goes in the core vs the agent vs a skill."
| Instruction | Tier | Owner | Injected by | Editable |
|---|---|---|---|---|
| "You are an agent acting on the business's behalf" | 0 | Platform | core block | No |
| "Plain text is reasoning; to act, call a tool" | 0 | Platform | core block | No |
| "[Operator …] = steering, not the audience" | 0 | Platform | core block | No |
| "Ground via read tools before acting; never fabricate" | 0 | Platform | core block | No |
| Org policy ("never quote prices over SMS") | 0.5 | Org admin | org_rules (P3) | Admin |
| "To reply, call send_sms (curried to their line)" | 1 | Runtime | capability block · send_mode=autonomous | No |
| "Draft 2–3 options via propose_sms_replies" | 1 | Runtime | capability block · send_mode=suggest | No |
| "Warm, human, SMS-length, in the contact's language" | 2 | Operator | persona | Yes |
| "Escalate complaints / human requests / uncertainty" | 2 | Operator | persona/goals | Yes |
| Messaging compliance (opt-out, quiet hours) | 3·mand | Platform/Org | mandatory skill | No / admin |
| Objection-handling playbook | 3·opt | Operator | per-agent skill link | Yes |
| Per-event source envelope | 4 | Runtime | event framer | No |
The litmus test
If an instruction would be true for every agent of this kind no matter who configured it → it's a runtime concern (Tier 0/1/4), not persona. If it expresses this agent's voice, brand, or judgement preferences → persona (Tier 2). If it's a reusable pack of domain guidance → a skill (Tier 3). Today's bug is that Tier-0/1 rules are missing or hiding inside Tier-2 persona.
06 Mandatory & system-wide skills
Skills today are purely opt-in: a skill reaches an agent only through an enabled ai_agent_skill_link row. There is no way to say "this guidance applies to all messaging agents, no opt-in." That's the missing piece you flagged.
Current mechanics
- A skill = {title, short_description, body_markdown, organization_id, is_active}. Platform skills have organization_id = NULL.
- Attach = a row in ai_agent_skill_link(agent_id, skill_id, is_enabled). The loader selects links where is_enabled = true, then loads active skills with an org-scope guard, and concatenates their bodies under ## {title}.
- No notion of mandatory, applicability, or precedence. An unlinked skill is invisible to the agent.
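The current mechanics can be sketched in memory as follows. This is an illustration, not the real loader: the `u32` ids, the struct shapes, and the exact org-scope rule (platform skills visible to every org, org skills only to their own) are assumptions, and short_description is omitted because it does not reach the prompt body.

```rust
pub struct Skill {
    pub id: u32,
    pub title: String,
    pub body_markdown: String,
    /// None models organization_id = NULL, i.e. a platform skill.
    pub organization_id: Option<u32>,
    pub is_active: bool,
}

pub struct SkillLink {
    pub agent_id: u32,
    pub skill_id: u32,
    pub is_enabled: bool,
}

/// Renders the opt-in skill block for one agent: enabled links only,
/// active skills only, org-scope guard applied, bodies under "## {title}".
pub fn render_opt_in_skills(
    agent_id: u32,
    agent_org: u32,
    links: &[SkillLink],
    skills: &[Skill],
) -> String {
    links
        .iter()
        .filter(|l| l.agent_id == agent_id && l.is_enabled)
        .filter_map(|l| skills.iter().find(|s| s.id == l.skill_id))
        .filter(|s| s.is_active)
        // Assumed guard: platform skills (None) or skills owned by the agent's org.
        .filter(|s| s.organization_id.map_or(true, |org| org == agent_org))
        .map(|s| format!("## {}\n{}", s.title, s.body_markdown))
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

Every path into the output starts from a link row, which is exactly why an unlinked skill can never reach an agent today.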
Two ways to model "mandatory system-wide"
Option A: model them as rules
Non-negotiable operating instructions are Tier 0/0.5 rules (platform_rules / org_rules), not skills. Skills stay strictly opt-in capability packs.
+ Clean separation; precedence is obvious; reuses Phase-3 slots.
− No curated authoring UX; can't reuse the skill library/tagging.

Option B: mandatory applicability on skills (lean)
Add an applicability scope to a skill: OptIn (today) · PlatformMandatory · OrgMandatory · capability-matched (e.g. "any agent with send_sms"). The loader unions matching mandatory skills with the agent's enabled links.
+ Reuses the whole skill authoring/curation surface; one concept.
− Needs precedence + a distinct render block so policy ≠ optional guidance.
Recommendation — a hybrid that respects your framing
Author mandatory instructions as skills (reuse the library + curation UX), but give a skill an applicability scope so it's injected without a per-agent link, and render mandatory skills in a distinct "Operating policy" block placed in Tier 0.5 — above persona and above opt-in skills — so precedence is unambiguous. Applicability can be as simple as a scope enum now, extensible to a capability/tag predicate later. Org-scoping still applies (a platform-mandatory skill reaches every org; an org-mandatory skill only that org's agents).
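One way the applicability scope could look, as a hedged sketch. The variant names follow the scopes listed under Option B; `AgentContext`, `select_skills`, the `u32` ids, and the single-tool predicate for capability matching are hypothetical simplifications. The function returns two separate lists so mandatory skills can be rendered in their own operating-policy block.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Applicability {
    OptIn,
    PlatformMandatory,
    OrgMandatory,
    /// Applies to any agent whose live tool set contains this tool.
    CapabilityMatched { required_tool: String },
}

pub struct Skill {
    pub id: u32,
    pub title: String,
    pub organization_id: Option<u32>,
    pub applicability: Applicability,
}

pub struct AgentContext<'a> {
    pub org_id: u32,
    pub tool_names: &'a [&'a str],
    /// Skill ids with an enabled per-agent link.
    pub enabled_skill_ids: &'a [u32],
}

/// Returns (operating_policy_titles, opt_in_titles) for one agent.
pub fn select_skills<'s>(
    agent: &AgentContext<'_>,
    skills: &'s [Skill],
) -> (Vec<&'s str>, Vec<&'s str>) {
    let mut policy = Vec::new();
    let mut opt_in = Vec::new();
    for skill in skills {
        // Org scoping still applies to every scope.
        if !skill.organization_id.map_or(true, |org| org == agent.org_id) {
            continue;
        }
        let mandatory = match &skill.applicability {
            Applicability::OptIn => false,
            Applicability::PlatformMandatory => true,
            Applicability::OrgMandatory => skill.organization_id == Some(agent.org_id),
            Applicability::CapabilityMatched { required_tool } => agent
                .tool_names
                .iter()
                .any(|t| *t == required_tool.as_str()),
        };
        if mandatory {
            policy.push(skill.title.as_str());
        } else if agent.enabled_skill_ids.contains(&skill.id) {
            // A skill whose mandatory scope does not apply still works as an opt-in link.
            opt_in.push(skill.title.as_str());
        }
    }
    (policy, opt_in)
}
```

Keeping the two lists apart at selection time is what lets the renderer place policy above persona while opt-in skills stay in Tier 3.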
Out of scope for #1476
Mandatory-skill mechanics are a schema + loader change that the Text Reply fixes do not require. This doc names the design so the core block we add now doesn't have to be unwound later — but the build is a follow-up (epic #1422), not part of this issue.
07 The three #1476 gaps, mapped to the model
Each acceptance-criterion failure is a missing tier. The model tells us exactly where the fix goes.
- GAP 1: The agent never delivers a reply (Tier 1 + Tier 0)
  In autonomous mode the model writes prose it believes is sent; the runtime only records it. The only delivery path is an explicit send_sms call, and nothing tells the model that. Fix: a capability-derived delivery block (Tier 1) that states the send-mandate when send_mode=autonomous, reinforced by the Tier-0 "plain text ≠ delivery" contract. This is the exact shape of the existing DELEGATED_SUBAGENT_BLOCK, so reuse that pattern.
- GAP 2: Can't tell operator from contact (Tier 0 + Tier 4)
  as_text() returns the bare body for both InboundSms and UserMessage, so operator steering and contact messages are indistinguishable, and a contact message could read as an instruction (injection risk). Fix: a per-event source envelope (Tier 4), [Incoming SMS from {contact} <{e164}>] vs [Operator instruction — not the contact], with the contact body sanitized, plus a Tier-0 rule that explains how to read the two sources.
- GAP 3: Replies with no grounding (Tier 0)
  The agent answers without consulting the contact's profile, history, notes, memory, or knowledge. Fix: a Tier-0 grounding directive ("before composing a reply or taking an action, gather the contact's context with your available read tools"), pairing with the operator-authored "never invent facts" already in persona.
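For Gap 1, the capability-derived delivery block could be derived roughly as below. This is a sketch under assumptions: the `SendMode` enum shape, the function name, and the instruction wording are illustrative; only the two modes and the two tool names (send_sms, propose_sms_replies) come from this doc.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SendMode {
    Autonomous,
    Suggest,
}

/// Derives the Tier-1 delivery instruction from the resolved send mode and the
/// tools actually attached this turn. Returns None when the matching tool is
/// missing, so the prompt never promises a capability the agent lacks.
pub fn delivery_block(send_mode: SendMode, tool_names: &[&str]) -> Option<String> {
    let has = |name: &str| tool_names.iter().any(|t| *t == name);
    match send_mode {
        SendMode::Autonomous if has("send_sms") => Some(
            "Plain text you write is not delivered. To reply to the contact, call send_sms."
                .to_string(),
        ),
        SendMode::Suggest if has("propose_sms_replies") => Some(
            "You cannot send directly. Draft reply options by calling propose_sms_replies."
                .to_string(),
        ),
        // Ownerless or sessionless agents degrade to no domain tools and get no block.
        _ => None,
    }
}
```

Because the block is computed from the live tool set, it cannot drift from reality the way editable persona prose can.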
08 Concrete changes
The minimal-but-foundational build: everything #1476 needs, shaped so it instantiates the model rather than bolting on one-offs. Mapped to the issue's File Changes.
- C1: Platform core block (Tier 0) · build_system_prompt_service.rs
  Add a constant core block (sibling to DELEGATED_SUBAGENT_BLOCK) injected for domain agents: identity-of-runtime, "plain text ≠ delivery / act via tools", the operator≠contact contract, the grounding directive. Populate the platform_rules capture field instead of String::new(). Gate on attach_domain_tools so tool-less agents stay lean.
- C2: Capability / delivery block (Tier 1) · run_ai_agent_thread_service.rs / build_system_prompt
  Derive the delivery instruction from the resolved send_mode + the actual reply context: autonomous → "reply via send_sms (sends from the contact's line)"; suggest → "draft options via propose_sms_replies." Move this out of the Text Reply persona so the instruction always matches the live tool set. Decide: assemble in the executor (which has the tool set) vs pass a flag into build_system_prompt.
- C3: Per-event source framing (Tier 4) · ai_thread_event_payload_type.rs + executor
  Add a framed renderer (e.g. as_framed_text() or a builder-side wrapper) that prefixes each event with its source and sanitizes the contact body via the existing sanitize_for_prompt / strip_prompt_newlines helpers. Use it where the turn prompt is joined (:269). Keep as_text() for non-prompt uses or migrate call sites deliberately.
- C4: Trim the Text Reply persona (Tier 2) · new seed migration
  Remove the delivery-mechanic lines ("Use only the tools available to you this turn…", the send-vs-draft branch) now carried by C1/C2; keep voice/brand/escalation/no-fabrication. Migrations are immutable, so this is a new migration that updates the persona/goals of the base agent (and decides whether to re-sync existing unedited clones), not an edit to m20260611_130003.
- C5: Non-regression guards · tests
  Suggest mode still drafts via propose_sms_replies; ownerless/sessionless agents still degrade to no domain tools (and so get no Tier-1 send block); delegated sub-agents keep their block. Add prompt-assembly unit tests mirroring the existing ones in build_system_prompt_service.rs.
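A minimal sketch of C1 with the kind of assembly check C5 asks for. The block wording is stitched from the Tier-0 rows in §05 and is illustrative only; `PLATFORM_CORE_BLOCK`, `PromptCapture`, and `build_system_prompt_with_core` are hypothetical names, and the real builder takes many more inputs than a persona string.

```rust
/// Illustrative Tier-0 wording; the shipped text would be reviewed separately.
pub const PLATFORM_CORE_BLOCK: &str = "\
You are an agent acting on the business's behalf.\n\
Plain text is reasoning; to act, call a tool.\n\
Messages marked [Operator …] are steering, not the audience.\n\
Ground via read tools before acting; never fabricate.";

/// Cut-down stand-in for the prompt-capture struct (two of its fields).
#[derive(Debug, Default)]
pub struct PromptCapture {
    pub platform_rules: String,
    pub persona: String,
}

/// Prepends the core block only for agents that get domain tools, and records
/// it in the capture so the log shows what the model actually received.
pub fn build_system_prompt_with_core(
    persona: &str,
    attach_domain_tools: bool,
) -> (String, PromptCapture) {
    let platform_rules = if attach_domain_tools {
        PLATFORM_CORE_BLOCK.to_string()
    } else {
        String::new() // tool-less agents stay lean
    };
    let prompt = if platform_rules.is_empty() {
        persona.to_string()
    } else {
        format!("{platform_rules}\n\n{persona}")
    };
    let capture = PromptCapture {
        platform_rules,
        persona: persona.to_string(),
    };
    (prompt, capture)
}
```

Placing the constant first keeps the static prefix stable across turns, and asserting on both the prompt and the capture guards against the two drifting apart.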
Deferred to follow-ups (named, not built)
Tier 0.5 org_rules/agent_rules wiring (Phase 3) · mandatory/system-wide skills (§6) · Tier-3 render-ordering of policy-vs-opt-in skills · generalizing the Vernis section pattern (<rules>/<capabilities>) to all agents. The C1 core block should be written so these slot in above/around it without rework.
09 Open decisions
The choices I'd like to lock before writing code. My recommendation is pre-selected; push back on any of them.
- Minimal-but-foundational: C1–C5 only (core block + capability block + event framing + persona trim). Satisfies #1476; lays the foundation. Defer mandatory-skills & org/agent rules.
- Full layered stand-up: also wire Tier 0.5 rules and mandatory skills now (much larger; pulls in schema migrations + Phase 3).
- Bare-minimum patch: send-mandate + framing as plain additions to the Text Reply persona, no core block (fast, but re-creates the "rules trapped in persona" problem).

- Runtime capability block (Tier 1), derived from send_mode + tool set. Always matches reality; can't be edited away.
- Operator persona prose (Tier 2). Simplest diff, but it's exactly the fragility that caused gap #1.

- Hybrid (§6): author as skills with an applicability scope, render as an Operating-policy block in Tier 0.5. Reuses curation UX, keeps precedence clean. (Build later.)
- Pure rules: model them as platform_rules / org_rules only; skills stay opt-in. Cleaner, but no shared authoring surface.
- Decide later: ship #1476 with just the platform core block; revisit when the first real mandatory skill exists.

- Re-sync only is_unedited_clone = true rows to the new persona; leave operator-edited clones untouched.
- Leave all clones as-is; only the base agent changes (new members get the new persona; existing users keep the old prose until they reset).