# Sample AI Diagnose deliverable

A redacted sample of the written deliverable you receive from a Density Labs AI Diagnose: data-readiness, AI-fit, and integration-risk scoring plus a prioritized AI roadmap.


<style>
  /* Scoped to the sample-deliverable page. */
  .gs-doc { padding: 72px 0 96px; background: var(--paper); }
  .gs-doc .container { max-width: 820px; }
  .gs-redaction-note {
    display: flex; gap: 12px; align-items: flex-start;
    background: var(--paper-soft, #F2F0EC);
    border-left: 3px solid var(--red, #991B1B);
    padding: 16px 18px; border-radius: 4px;
    font-size: 14px; line-height: 1.6; color: var(--ink-soft);
    margin-bottom: 48px;
  }
  .gs-redaction-note strong { color: var(--ink); font-weight: 600; }
  .gs-doc section { margin-bottom: 48px; }
  .gs-doc h2 {
    font-family: var(--font-display); font-weight: 700;
    font-size: clamp(22px, 2.6vw, 28px); letter-spacing: -0.01em;
    color: var(--ink); margin: 0 0 18px;
    padding-bottom: 10px; border-bottom: 2px solid var(--line);
  }
  .gs-doc h2 .num { color: var(--red); font-family: var(--font-mono); font-size: 0.6em; margin-right: 10px; vertical-align: middle; }
  .gs-doc h3 { font-family: var(--font-display); font-weight: 600; font-size: 17px; color: var(--ink); margin: 24px 0 10px; }
  .gs-doc p, .gs-doc li { font-size: 15px; line-height: 1.65; color: var(--ink-soft); }
  .gs-doc strong { color: var(--ink); font-weight: 600; }
  .gs-doc ul { padding-left: 22px; }
  .gs-doc li { margin-bottom: 8px; }
  .gs-doc table { width: 100%; border-collapse: collapse; margin: 18px 0; font-size: 14px; }
  .gs-doc th, .gs-doc td { text-align: left; padding: 10px 12px; border-bottom: 1px solid var(--line); vertical-align: top; }
  .gs-doc th { font-family: var(--font-mono); font-size: 11px; letter-spacing: 0.08em; text-transform: uppercase; color: var(--ink-muted); }
  .gs-doc td.c, .gs-doc th.c { text-align: center; }
  .gs-scorecard {
    font-family: var(--font-mono); font-size: 14px; line-height: 2.1;
    background: var(--ink, #141414); color: var(--paper, #FAFAF8);
    padding: 22px 24px; border-radius: 8px; overflow-x: auto; white-space: pre;
  }
  .gs-bar { color: var(--red); }
  .gs-callout {
    background: var(--paper-soft, #F2F0EC); border-left: 3px solid var(--red);
    padding: 16px 18px; border-radius: 4px; font-size: 15px; line-height: 1.6;
    color: var(--ink); margin: 18px 0;
  }
  .gs-keep { font-style: italic; color: var(--ink-muted); font-size: 14px; margin-top: 6px; }
  .gs-doc-cta {
    margin-top: 56px; padding: 36px; background: var(--ink, #141414);
    border-radius: 12px; text-align: center; color: var(--paper);
  }
  .gs-doc-cta h2 { color: var(--paper); border: none; padding: 0; }
  .gs-doc-cta p { color: rgba(250,250,248,0.8); max-width: 560px; margin: 0 auto 22px; }
  .gs-doc-cta .btn {
    display: inline-block; background: var(--red); color: #fff; text-decoration: none;
    font-family: var(--font-display); font-weight: 600; font-size: 15px;
    padding: 15px 30px; border-radius: 8px; transition: background 0.15s, transform 0.1s;
  }
  .gs-doc-cta .btn:hover { background: var(--red-deep); transform: translateY(-1px); }
</style>

<!-- HERO -->
<section class="gs-hero">
  <div class="container">
    <div class="gs-eyebrow">Redacted sample // What you receive</div>
    <h1>A real diagnose, <em>redacted</em>.</h1>
    <p class="lede">This is the kind of written deliverable you keep at the end of a two-week AI Diagnose: three pillars scored against evidence, a prioritized roadmap, and an honest read on whether AI is even the right next move. The client below is <strong>fictional</strong> — names, systems, and figures generalized — but the structure and reasoning mirror a real engagement.</p>
    <div class="gs-meta">
      <span><strong>Workflow</strong>Support ticket triage</span>
      <span><strong>Window</strong>2 weeks</span>
      <span><strong>Verdict</strong>Go, with conditions</span>
    </div>
  </div>
</section>

<!-- DELIVERABLE BODY -->
<div class="gs-doc">
  <div class="container">

    <div class="gs-redaction-note">
      <span class="gs-confidential-icon">🔒</span>
      <span><strong>This is a redacted sample.</strong> Real Density Labs Diagnose deliverables are confidential, and the client keeps them. "Northwind Logistics" is a fictional, composite mid-market company used here to show the shape of the work without exposing any client.</span>
    </div>

    <section>
      <h2><span class="num">01</span>Executive summary</h2>
      <p><strong>The question we were hired to answer:</strong> Can we put an LLM-assisted step into our support workflow this quarter to cut first-response time, without creating a data mess or a reliability problem we can't own?</p>
      <p><strong>Our verdict: Go, with conditions.</strong> There is a real, well-shaped AI win here — but it lives in <em>draft-assist with a human in the loop</em>, not in full auto-resolution. Ship the assist; do not ship autonomy yet.</p>
      <h3>Readiness at a glance</h3>
      <table>
        <thead><tr><th>Pillar</th><th class="c">Score (1–5, 5 = most ready)</th><th>One-line read</th></tr></thead>
        <tbody>
          <tr><td>Data readiness</td><td class="c">3</td><td>6 years of resolved tickets exist, but macros and free-text are tangled and lightly tagged.</td></tr>
          <tr><td>AI-fit</td><td class="c">4</td><td>Drafting tier-1 replies from a known knowledge base is a strong, proven LLM use case.</td></tr>
          <tr><td>Integration risk</td><td class="c">3</td><td>One core system (the helpdesk) plus SSO; manageable, but fallback and ownership must be designed in.</td></tr>
        </tbody>
      </table>
      <h3>The three things that matter most</h3>
      <ul>
        <li><strong>The value is in cutting first-response <em>time</em>, not headcount.</strong> Agents spend most tier-1 handling time composing replies they've effectively written hundreds of times. That is where the money is.</li>
        <li><strong>Your historical tickets are a real asset, but need light cleanup.</strong> Resolution notes are inconsistent and only ~40% of tickets are tagged. Usable, not turnkey.</li>
        <li><strong>Autonomy is the trap.</strong> A fully automated responder would fail publicly on the long tail and burn trust. The safe first step keeps the agent in control and just makes them faster.</li>
      </ul>
      <div class="gs-callout"><strong>What we would build first, and why:</strong> A retrieval-assisted draft-reply panel inside the existing helpdesk for tier-1 ticket categories — it attacks the largest, best-understood slice of volume with a human approving every send.</div>
    </section>

    <section>
      <h2><span class="num">02</span>Scope &amp; method</h2>
      <p><strong>The one workflow we assessed:</strong> When a ticket arrives, an agent reads it, classifies it, finds the relevant policy or past resolution, and writes a reply. Tier-1 tickets (billing questions, shipment status, account changes) are high-volume and repetitive. Today every reply is composed by hand, sometimes from saved macros that are out of date.</p>
      <p><strong>Why this workflow:</strong> High-volume, repetitive, low-variance, and already measured (first-response time and CSAT). The cleanest place to prove value and the easiest to instrument.</p>
      <p><strong>What we explicitly did <em>not</em> assess:</strong> Tier-2/escalation handling, phone support, the billing system itself, or org-wide data-platform questions. One workflow.</p>
      <h3>How we ran the two weeks (the Density Method, applied)</h3>
      <ul>
        <li><strong>Days 1–3 — Context intake.</strong> Interviews with the VP of Customer Operations, a senior support agent, and the helpdesk system owner. Walkthrough of the helpdesk and a sample of resolved tickets.</li>
        <li><strong>Days 4–7 — Assessment.</strong> Data-readiness audit, AI-fit analysis, and integration-risk review of the workflow.</li>
        <li><strong>Days 8–10 — Roadmap &amp; synthesis.</strong> Prioritization, sequencing, and the recommended first build.</li>
      </ul>
      <p><strong>Evidence base:</strong> 3 interviews, 1 helpdesk walkthrough, ~6 months of anonymized resolved-ticket samples, and the current macro library.</p>
    </section>

    <section>
      <h2><span class="num">03</span>Pillar 1 — Data readiness</h2>
      <p><strong>Score: 3 / 5 — Workable with conditions</strong></p>
      <ul>
        <li><strong>Availability &amp; access:</strong> Six years of resolved tickets, exportable via the helpdesk API. Not a blocker.</li>
        <li><strong>Quality &amp; consistency:</strong> Resolution notes are free-text and inconsistent; the macro library is stale.</li>
        <li><strong>Coverage &amp; volume:</strong> More than enough volume on tier-1 categories to ground retrieval. The long tail is thin — out of scope for v1.</li>
        <li><strong>Governance, PII &amp; compliance:</strong> Tickets contain customer PII. Manageable, but any AI step needs a PII-handling and retention answer before launch.</li>
        <li><strong>Labeling / ground truth:</strong> Only ~40% of historical tickets are tagged, and the taxonomy has drifted. The single biggest data gap.</li>
      </ul>
      <p><strong>The gap that matters:</strong> Inconsistent tagging and stale macros mean retrieval would sometimes surface the wrong precedent. Not fatal — but it must be addressed, not assumed away.</p>
      <p><strong>What it takes to close it:</strong> A focused cleanup pass on the top ~15 tier-1 categories: refresh the macros and back-tag a representative sample. Days, not months — and reusable beyond this project.</p>
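      <p>One way to scope that cleanup pass is to rank categories by volume and check how much of each is already tagged. The sketch below is illustrative only: the <code>category</code> and <code>tags</code> field names and the <code>cleanup_candidates</code> helper are hypothetical, and it assumes the helpdesk export carries a coarse category field separate from the tags.</p>

```python
from collections import Counter

def cleanup_candidates(tickets, top_n=15):
    """Rank categories by ticket volume and report the share already tagged.

    `tickets` is an iterable of dicts with hypothetical keys:
    "category" (str) and "tags" (list of str, possibly empty).
    """
    volume = Counter()
    tagged = Counter()
    for ticket in tickets:
        volume[ticket["category"]] += 1
        if ticket["tags"]:
            tagged[ticket["category"]] += 1
    return [
        {"category": cat, "volume": n, "tagged_share": tagged[cat] / n}
        for cat, n in volume.most_common(top_n)
    ]

# Made-up sample rows, not client data.
sample = [
    {"category": "billing", "tags": ["refund"]},
    {"category": "billing", "tags": []},
    {"category": "shipment_status", "tags": []},
    {"category": "billing", "tags": []},
    {"category": "account_change", "tags": ["address"]},
]

report = cleanup_candidates(sample, top_n=2)
for row in report:
    print(row["category"].ljust(16), row["volume"], f'{row["tagged_share"]:.0%}')
```

      <p>High-volume categories with a low tagged share are the natural first targets for back-tagging.</p>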
    </section>

    <section>
      <h2><span class="num">04</span>Pillar 2 — AI-fit</h2>
      <p><strong>Score: 4 / 5 — Good fit</strong></p>
      <p><strong>Is this a real AI problem?</strong> Yes, for drafting and retrieval. Classification of tier-1 categories is partly a rules/lookup problem and shouldn't be over-modeled.</p>
      <p><strong>Best-fit approach:</strong> Retrieval-augmented generation — pull the relevant policy and closest past resolutions, then have an LLM draft a reply the agent edits and sends. Deterministic guardrails for anything touching money or account changes.</p>
      <p><strong>Why this over the alternatives:</strong> A fine-tuned model is unnecessary and expensive here; pure macros can't handle phrasing variation; full automation is too risky for the long tail. RAG-with-human-approval is the proven middle path and fastest to value.</p>
      <p><strong>Value if it works:</strong> A meaningful cut in first-response time on tier-1 volume by removing blank-page composition — a gain customer ops already knows how to measure.</p>
      <p><strong>Where it will struggle:</strong> Novel or emotionally charged tickets, judgment calls, and edge cases where the retrieved precedent is subtly wrong. The design must make these easy for the agent to catch — which is why the human stays in the loop in v1.</p>
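      <p>The draft-assist shape described in this section can be sketched in a few lines. This is a minimal illustration, not the build itself: the token-overlap (Jaccard) scoring stands in for whatever retrieval a real system would use, no model is called, and the names <code>GUARDED_TERMS</code>, <code>retrieve</code>, and <code>build_draft_request</code> and the <code>ticket</code>/<code>reply</code> fields are all hypothetical.</p>

```python
import re

# Hypothetical keyword list marking tickets that touch money or account changes.
GUARDED_TERMS = {"refund", "charge", "invoice", "payment", "password", "address"}

def tokens(text):
    """Lowercase word tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(ticket_text, past_resolutions, k=2):
    """Return the k past resolutions with the highest Jaccard token overlap."""
    query = tokens(ticket_text)

    def score(resolution):
        doc = tokens(resolution["ticket"])
        union = query.union(doc)
        return len(query.intersection(doc)) / len(union) if union else 0.0

    return sorted(past_resolutions, key=score, reverse=True)[:k]

def build_draft_request(ticket_text, past_resolutions):
    """Assemble the input for a drafting model; no model is called here."""
    precedents = retrieve(ticket_text, past_resolutions)
    lines = ["Draft a reply using only these precedents:"]
    lines += ["- " + p["reply"] for p in precedents]
    lines.append("Ticket: " + ticket_text)
    return {
        "prompt": "\n".join(lines),
        "needs_guardrail_review": not tokens(ticket_text).isdisjoint(GUARDED_TERMS),
        "auto_send": False,  # v1: an agent approves every send
    }

# Made-up precedents, not client data.
past = [
    {"ticket": "Where is my shipment", "reply": "Your shipment is in transit."},
    {"ticket": "I was charged twice on my invoice", "reply": "We have reversed the duplicate charge."},
    {"ticket": "How do I change my delivery address", "reply": "You can update it under Account settings."},
]

request = build_draft_request("I see a duplicate charge on my invoice", past)
print(request["prompt"])
print(request["needs_guardrail_review"])
```

      <p>The guardrail flag is deterministic by design: it depends on the ticket text alone, so a ticket that touches money or account changes is flagged regardless of what the model drafts.</p>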
    </section>

    <section>
      <h2><span class="num">05</span>Pillar 3 — Integration risk</h2>
      <p><strong>Score: 3 / 5 — Conditional</strong></p>
      <ul>
        <li><strong>Surface area:</strong> Primarily one system — the helpdesk — via its API, plus SSO. Modest and well-understood.</li>
        <li><strong>Reliability &amp; latency budget:</strong> Drafting isn't on the customer's critical path; a couple of seconds for a draft is fine.</li>
        <li><strong>Human-in-the-loop / fallback:</strong> Strong by design — agents approve every send, and if the panel is down they work as they do today.</li>
        <li><strong>Ownership &amp; on-call:</strong> No obvious owner today for an AI feature inside support. Must be assigned before launch.</li>
        <li><strong>Security &amp; vendor exposure:</strong> Customer PII would pass to an LLM provider. Resolvable with a no-retention configuration and field redaction, but needs an explicit decision.</li>
      </ul>
      <p><strong>The risk that matters:</strong> Not the wiring — that's routine. It's <strong>ownership and the PII path.</strong> Both are decisions, not engineering problems, and both must close before a production launch.</p>
      <p><strong>How we'd de-risk it:</strong> Name an owner in customer ops, choose a no-retention LLM configuration, redact sensitive fields before they leave the tenant, and pilot on one category with a defined rollback before expanding.</p>
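      <p>As a rough illustration of redacting sensitive fields before they leave the tenant, the sketch below masks email addresses and phone numbers with placeholders. The two patterns and the <code>redact</code> helper are assumptions made for the example; a real deployment would start from a reviewed inventory of the PII that actually appears in tickets.</p>

```python
import re

# Illustrative patterns only; they will miss many real-world PII formats.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text):
    """Replace each matched span with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane.doe@example.com or +1 (555) 010-2345."))
# prints: Reach me at [EMAIL] or [PHONE].
```

      <p>Short numeric strings such as order numbers pass through unchanged, which keeps the redacted ticket useful for retrieval.</p>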
    </section>

    <section>
      <h2><span class="num">06</span>Readiness scorecard</h2>
      <div class="gs-scorecard">Data readiness    <span class="gs-bar">▓▓▓░░</span>  3/5   6 yrs of tickets, but tagging &amp; macros need cleanup
AI-fit            <span class="gs-bar">▓▓▓▓░</span>  4/5   RAG draft-assist is a proven, well-shaped use case
Integration risk  <span class="gs-bar">▓▓▓░░</span>  3/5   One system + SSO; ownership &amp; PII path must be decided</div>
      <p style="margin-top:18px;"><strong>Overall readiness: Ready with conditions.</strong> No pillar is below 3, and the conditions (a focused data cleanup, a named owner, and a decision on the PII path) are nameable, fundable, and small relative to the value.</p>
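      <p>The scorecard bars and the "no pillar below 3" check are simple enough to reproduce. The sketch below is an assumption about how such a card could be generated, not the tooling used in an engagement; <code>bar</code> and <code>meets_floor</code> are hypothetical helpers.</p>

```python
def bar(score, scale=5):
    """Render a score as filled and empty blocks on a fixed scale."""
    return "▓" * score + "░" * (scale - score)

def meets_floor(scores, floor=3):
    """True when every pillar scores at or above the floor."""
    return min(scores.values()) >= floor

# Scores as reported in this sample deliverable.
scores = {"Data readiness": 3, "AI-fit": 4, "Integration risk": 3}

for pillar, score in scores.items():
    print(pillar.ljust(18) + bar(score) + f"  {score}/5")
print("Every pillar at 3 or above:", meets_floor(scores))
```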
    </section>

    <section>
      <h2><span class="num">07</span>Prioritized AI-initiative roadmap</h2>
      <table>
        <thead><tr><th>#</th><th>Initiative</th><th class="c">Impact</th><th class="c">Effort</th><th class="c">Sequence</th><th>Depends on</th></tr></thead>
        <tbody>
          <tr><td>1</td><td>RAG draft-reply panel for top tier-1 categories (human-approved)</td><td class="c">High</td><td class="c">Med</td><td class="c">Now</td><td>Macro/tag cleanup on top categories</td></tr>
          <tr><td>2</td><td>Auto-classification + smart routing of incoming tickets</td><td class="c">Med</td><td class="c">Med</td><td class="c">Next</td><td>#1 live; tagging baseline from #1</td></tr>
          <tr><td>3</td><td>Suggested knowledge-base updates from recurring tickets</td><td class="c">Med</td><td class="c">Low</td><td class="c">Later</td><td>#1 generating volume of edits</td></tr>
          <tr><td>4</td><td>Selective auto-send for narrow, low-risk categories</td><td class="c">High</td><td class="c">High</td><td class="c">Later</td><td>Proven accuracy + ownership from #1–#2</td></tr>
        </tbody>
      </table>
      <p><strong>Sequencing logic:</strong> #1 attacks the biggest, best-understood slice and produces the clean, tagged data that makes #2 cheap. #3 is a low-effort compounding win once agents are editing drafts. #4 (limited autonomy) is deliberately last — earn it with measured accuracy, don't bet on it up front.</p>
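      <p>The dependency column of the roadmap can be checked mechanically. As a small sketch, the standard-library <code>graphlib</code> module (Python 3.9+) orders the initiatives so that each one follows its prerequisites; the shortened labels are ours, and "cleanup" stands for the macro/tag cleanup on the top categories.</p>

```python
from graphlib import TopologicalSorter

# Each item maps to the items it depends on, per the roadmap table.
depends_on = {
    "cleanup": set(),
    "#1 draft-reply panel": {"cleanup"},
    "#2 classification and routing": {"#1 draft-reply panel"},
    "#3 knowledge-base suggestions": {"#1 draft-reply panel"},
    "#4 selective auto-send": {"#1 draft-reply panel", "#2 classification and routing"},
}

# static_order() yields every item after all of its prerequisites.
order = list(TopologicalSorter(depends_on).static_order())
for step in order:
    print(step)
```

      <p>The sort only enforces dependencies. Where #3 falls relative to #2 and #4 is not fixed by it, and holding #4 until last remains a judgment call about risk rather than a dependency.</p>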
    </section>

    <section>
      <h2><span class="num">08</span>Recommended first build (the on-ramp)</h2>
      <p><strong>Build this first:</strong> RAG draft-reply panel for the top tier-1 categories.</p>
      <p><strong>What it is:</strong> A panel inside the existing helpdesk that, for a tier-1 ticket, retrieves the relevant policy and closest past resolutions and drafts a reply the agent edits and sends. Scoped to the highest-volume categories only.</p>
      <p><strong>Why first:</strong> Largest, most repetitive slice of volume; the value (first-response time) is already measured; the human-approval design keeps risk low while accuracy is proven.</p>
      <p><strong>What "done" looks like:</strong> One team using the panel on live tier-1 tickets, with a measured reduction in first-response time on the piloted categories and CSAT held flat or better.</p>
      <h3>How Density would deliver it</h3>
      <ul>
        <li><strong>Shape:</strong> One embedded <strong>AI Engineer</strong> is the right size: a single senior engineer owning the build, the data cleanup, and the pilot. Scale to a small <strong>Squad</strong> only to pursue roadmap items #2 and #4 in parallel.</li>
        <li><strong>Indicative timeline:</strong> A focused build-and-pilot effort measured in weeks, not quarters.</li>
        <li><strong>What we need from you:</strong> Helpdesk API access, a named product owner, one pilot team, and a decision on the PII/vendor path.</li>
      </ul>
      <div class="gs-callout">Your AI Diagnose fee is credited toward a follow-on Engineer or Squad engagement started within 60 days.</div>
    </section>

    <section>
      <h2><span class="num">09</span>The honest read</h2>
      <p>The instinct in the room was to ask for a bot that answers tickets on its own. We'd advise against that as a first move. It isn't impossible, but the long tail of support is exactly where an autonomous responder fails publicly, and a mid-market brand rarely gets a second chance once it loses a customer's trust. The real, bankable win is quieter: take the blank page away from your agents on the tickets they've each answered a thousand times. That's a saving you can measure next month, with a risk profile your VP can sign off on. Earn autonomy later with data; don't buy it now with hope.</p>
      <p>If you do nothing else from this report: fund the small tagging-and-macro cleanup. It pays off whether or not you build the AI panel, and it's the cheapest insurance against a pilot that disappoints.</p>
    </section>

    <div class="gs-doc-cta">
      <h2>Want one of these for your workflow?</h2>
      <p>$2,500 fixed. Two weeks. A written AI roadmap you keep, regardless of what comes next.</p>
      <a class="btn" href="/get-started/diagnostic/#apply">Start your AI Diagnose →</a>
    </div>

  </div>
</div>
