Benedict Evans’ 2025 AI Deck — What It Actually Means for Enterprises

AI won’t replace your company. It will remove steps between the 400–500 apps you already use.

Why Listen to Benedict Evans?

Evans argues that AI’s value comes from removing steps across existing enterprise workflows, not adding new apps. Companies that invest now in narrow, measurable automations will build the capabilities and data advantage others will lack later. The full deck is listed under References.

New Platforms, Fewer Steps

Mainframes → PCs → SaaS exploded the number of apps. AI flips the question from “what new app do we build?” to “what step can we remove?”

For enterprises, that means:

  • Treat LLMs as cross-app operators
    Read → reason → validate → write → log, across tools.

  • Aim at swivel-chair work
    Any task that forces humans to copy/paste between CRM, ITSM, ERP, email, spreadsheets.

  • Own the interface, swap the brains
    Keep models interchangeable behind one orchestration layer you control.

  • Make knowledge a versioned asset
    Version your corpora, scope retrieval tightly, and require the system to either cite its sources or refuse.


    The “new platform” isn’t a chatbot. It’s a thin intelligence layer across your existing estate.
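One way to read "own the interface, swap the brains" in code is a small routing layer that depends only on an interface and never on a vendor SDK. The sketch below is illustrative, not a reference design: `Model`, `EchoModel`, `FlakyModel`, and `Orchestrator` are hypothetical names, and the two stub providers stand in for real model clients.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Model(Protocol):
    """The only surface the rest of the codebase is allowed to depend on."""
    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in for a real provider client (hypothetical)."""
    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class FlakyModel:
    """Stand-in provider that always fails, to exercise the fallback path."""
    name: str

    def complete(self, prompt: str) -> str:
        raise TimeoutError(f"{self.name} timed out")


@dataclass
class Orchestrator:
    """Tries providers in order and records which one answered."""
    models: list[Model]
    audit_log: list[dict] = field(default_factory=list)

    def run(self, prompt: str) -> str:
        for model in self.models:
            try:
                answer = model.complete(prompt)
            except Exception as exc:  # a real layer would catch narrower errors
                self.audit_log.append(
                    {"model": model.name, "ok": False, "error": str(exc)})
                continue
            self.audit_log.append({"model": model.name, "ok": True})
            return answer
        raise RuntimeError("all model providers failed")


orchestrator = Orchestrator([FlakyModel("provider-a"), EchoModel("provider-b")])
result = orchestrator.run("Summarize ticket 123")
print(result)
```

Because callers only see `Orchestrator.run`, replacing or reordering providers is a configuration change rather than a rewrite of every workflow.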

Invest Now (Even If You “Pre-Build”)

Evans’ implicit warning to large companies: waiting is also a strategy—just usually a bad one.

How to read this as a corporate policy:

  • Waiting taxes you twice
    You lose automation gains now and arrive late to the learning curve (guardrails, metrics, playbooks).

  • Early investment pre-builds capability
    Even if you expect another model wave in 12–24 months, you’re banking:

    • Real usage data and error patterns

    • Evaluation suites and safety rails

    • Internal talent that understands how to ship AI safely

  • Fund AI like strategy, not as a demo budget
    Treat AI as a multi-year capability bet (like CRM or cloud), not a stack of pilots chasing headlines.

Why Usage Still Lags

Evans points out a puzzle: consumer and enterprise AI both show massive top-of-funnel curiosity, but shallow habit.

Why corporate rollouts stall:

  • Chat-first UX
    Great for a conference demo; weak for repeatable daily work. People don’t wake up wanting “chat”—they want “this ticket closed faster.”

  • Metric theater
    Tokens, prompts, and “active users” don’t move the board. CFOs care about cycle time, error rates, and cost per resolved task.

  • Shadow knowledge
    Dumping “all our docs” into retrieval leads to leaks, hallucinations, and compliance headaches.

  • Vendor glue risk
    Over-coupling to a single stack turns every model upgrade into a breaking-change project.

If you don’t design for habit and governance from day one, AI remains a sideshow.

From Experiments to Daily Work

Principle: make the safe path the fast path.

1) Ship skinny, production-grade flows

  • Pick one narrow job.

  • Use deterministic retrieval and strict validators before write-back.

  • Log every query, response, decision, and override.

If confidence is low or the blast radius is high, the system should refuse and escalate with evidence.
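The validator and refusal rule above can be sketched as a single gate in front of write-back. Everything specific here is an assumption made for illustration: the `Draft` fields, the 0.8 confidence floor, the `ALLOWED_FIELDS` set that approximates "low blast radius", and the in-memory `AUDIT_LOG`.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.8                      # assumed threshold, tune per flow
ALLOWED_FIELDS = {"status", "next_step"}    # assumed low-blast-radius fields
AUDIT_LOG: list[dict] = []                  # stands in for durable logging


@dataclass
class Draft:
    record_id: str
    field_name: str
    value: str
    confidence: float
    sources: list[str]    # IDs of approved documents the draft relied on


def validate(draft: Draft) -> list[str]:
    """Return reasons to refuse; an empty list means safe to write."""
    problems = []
    if draft.confidence < CONFIDENCE_FLOOR:
        problems.append(
            f"confidence {draft.confidence:.2f} below {CONFIDENCE_FLOOR}")
    if draft.field_name not in ALLOWED_FIELDS:
        problems.append(
            f"field '{draft.field_name}' is outside the allowed write scope")
    if not draft.sources:
        problems.append("no supporting sources cited")
    return problems


def write_or_escalate(draft: Draft, store: dict) -> str:
    """Write only when every validator passes; otherwise escalate with evidence."""
    problems = validate(draft)
    decision = "escalated" if problems else "written"
    if not problems:
        store.setdefault(draft.record_id, {})[draft.field_name] = draft.value
    # Every decision is logged, including the evidence behind a refusal.
    AUDIT_LOG.append({"record": draft.record_id, "decision": decision,
                      "reasons": problems, "sources": draft.sources})
    return decision


crm: dict = {}
ok = write_or_escalate(Draft("T-1", "status", "resolved", 0.93, ["kb-42"]), crm)
risky = write_or_escalate(Draft("T-2", "credit_limit", "50000", 0.95, ["kb-7"]), crm)
print(ok, risky)
```

The second draft is refused despite high confidence because the field is outside the write scope, which is the point of checking blast radius separately from confidence.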

2) Instrument like a growth team

Track metrics that matter to executives:

  • Time-to-first-answer

  • % of answers grounded in approved sources

  • Human edit rate

  • Deflection rate + CSAT for support

  • SLA adherence

  • Cost per resolved task

Publish weekly deltas. Kill or rework flows that don’t pay their way.
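As a rough illustration of "publish weekly deltas", the sketch below computes three of the metrics listed above from per-task logs and diffs two weeks. The `TaskLog` fields and the sample numbers are invented for the example; how cost is attributed to a task is an assumption each team would need to settle for itself.

```python
from dataclasses import dataclass


@dataclass
class TaskLog:
    resolved: bool
    grounded: bool        # answer cited only approved sources
    human_edited: bool    # a person changed the draft before it shipped
    cost_usd: float       # model plus review cost attributed to this task


def weekly_metrics(logs: list[TaskLog]) -> dict[str, float]:
    """Aggregate one week of task logs into executive-facing metrics."""
    total = len(logs)
    resolved = sum(1 for log in logs if log.resolved)
    if total == 0 or resolved == 0:
        raise ValueError("need at least one resolved task to report metrics")
    return {
        "grounded_pct": 100 * sum(log.grounded for log in logs) / total,
        "human_edit_rate_pct": 100 * sum(log.human_edited for log in logs) / total,
        "cost_per_resolved_task": sum(log.cost_usd for log in logs) / resolved,
    }


def deltas(this_week: dict[str, float], last_week: dict[str, float]) -> dict[str, float]:
    """Week-over-week change for every metric present in both weeks."""
    return {key: round(this_week[key] - last_week[key], 2)
            for key in this_week if key in last_week}


last_week = weekly_metrics([
    TaskLog(True, True, True, 0.40),
    TaskLog(True, False, True, 0.60),
    TaskLog(False, False, True, 0.50),
    TaskLog(True, True, False, 0.50),
])
this_week = weekly_metrics([
    TaskLog(True, True, False, 0.30),
    TaskLog(True, True, True, 0.30),
    TaskLog(True, True, False, 0.20),
    TaskLog(True, False, False, 0.20),
])
change = deltas(this_week, last_week)
print(change)
```

Note that cost is divided by resolved tasks, not all tasks, so a flow that burns budget on unresolved work shows up as more expensive.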

3) Assemble, don’t over-platform

Your stack should separate concerns:

  • Orchestration & policy (your code): tools, routing, approvals, rate limits.

  • Pluggable cognition: at least two external model providers + one local/edge path.

  • Authoritative context: versioned corpora, scoped retrieval, per-field provenance.

  • Validation: schema checks, reference matching, policy enforcement.

You’re building a control plane for intelligence, not another monolith.
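The "authoritative context" layer can be pictured as retrieval that is pinned to one corpus version and one scope, and that returns provenance or an explicit refusal. This is a toy sketch under stated assumptions: `Passage`, `CORPUS`, `retrieve`, and `answer` are hypothetical names, the documents and version tags are invented sample data, and keyword overlap stands in for whatever retrieval method a real system would use. No model is called; the passage text is returned directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str
    corpus_version: str   # a release tag for the knowledge base
    scope: str            # which team or product the passage is approved for
    text: str


CORPUS = [
    Passage("refund-policy", "2025.03", "support",
            "Refunds are available within 30 days of purchase."),
    Passage("refund-policy", "2025.01", "support",
            "Refunds are available within 14 days of purchase."),
    Passage("pricing-internal", "2025.03", "finance",
            "Enterprise discount floor is 20 percent."),
]


def retrieve(query: str, scope: str, version: str) -> list[Passage]:
    """Keyword match restricted to one scope and one pinned corpus version."""
    terms = set(query.lower().split())
    return [p for p in CORPUS
            if p.scope == scope
            and p.corpus_version == version
            and terms & set(p.text.lower().rstrip(".").split())]


def answer(query: str, scope: str, version: str) -> dict:
    """Return passage text with citations, or an explicit refusal."""
    hits = retrieve(query, scope, version)
    if not hits:
        return {"status": "refused", "reason": "no approved source in scope"}
    return {"status": "answered",
            "text": hits[0].text,
            "citations": [f"{p.doc_id}@{p.corpus_version}" for p in hits]}


cited = answer("refunds window", scope="support", version="2025.03")
refused = answer("discount floor", scope="support", version="2025.03")
print(cited)
print(refused)
```

The second query is refused even though the corpus contains the answer, because the matching document is scoped to finance. That is the behavior that keeps "all our docs" from leaking across teams.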

4) Aim at high-leverage loops

Start where Evans’ “platform shift” logic bites hardest: repetitive, information-heavy workflows with clear ground truth.

Examples:

  • Support
    Classify → retrieve → draft → validate entitlements → escalate with cited evidence.

  • CRM & sales ops
    Summarize calls, extract next steps, update fields, draft follow-ups for approval.

  • Finance operations
    Parse invoices, run 3-way match, flag variances with plain-language explanations.

  • Engineering
    PR summaries, test scaffolds, incident timelines, runbook suggestions.

  • Policy & compliance
    Cited answers, exception flags, full audit trails for every AI-assisted action.
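To make one of these loops concrete, here is a minimal sketch of the finance example: a 3-way match between purchase order, goods receipt, and invoice that reports variances in plain language. It assumes invoice parsing has already happened upstream. The `Line` structure, the one-cent price tolerance, and the sample figures are illustrative assumptions, not a description of any particular ERP.

```python
from dataclasses import dataclass


@dataclass
class Line:
    sku: str
    quantity: int
    unit_price: float


def three_way_match(po: Line, receipt_qty: int, invoice: Line,
                    price_tolerance: float = 0.01) -> list[str]:
    """Compare PO, goods receipt, and invoice; return plain-language variances."""
    issues = []
    if invoice.sku != po.sku:
        issues.append(
            f"Invoice is for {invoice.sku}, but the PO ordered {po.sku}.")
    if invoice.quantity > receipt_qty:
        issues.append(
            f"Invoice bills {invoice.quantity} units, "
            f"but only {receipt_qty} were received.")
    if receipt_qty > po.quantity:
        issues.append(
            f"Received {receipt_qty} units against a PO for {po.quantity}.")
    if abs(invoice.unit_price - po.unit_price) > price_tolerance:
        issues.append(
            f"Invoice unit price {invoice.unit_price:.2f} differs from "
            f"PO price {po.unit_price:.2f}.")
    return issues


po = Line("WIDGET-9", 100, 4.50)
clean = three_way_match(po, receipt_qty=100, invoice=Line("WIDGET-9", 100, 4.50))
flagged = three_way_match(po, receipt_qty=90, invoice=Line("WIDGET-9", 100, 4.75))
for issue in flagged:
    print(issue)
```

The matching itself is deterministic. In a flow like this, the model's job is limited to parsing the invoice and phrasing the explanation, which keeps the "clear truth" in the systems of record.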

What to Build in 90 Days

You don’t need a “five-year AI vision deck.” You need one credible 90-day plan.

Weeks 1–2: Choose and baseline

  • Pick one revenue surface and one cost surface with:

    • Repeatable work

    • Clear definition of “done”

    • Strong ground truth (systems of record)

  • Baseline:

    • Minutes per task

    • Error cost

    • Backlog and SLA performance

Weeks 3–6: Ship the skinny version

  • Constrain sources aggressively; add validators before anything touches core systems.

  • Roll to 10–50 real users.

  • Capture human edits as training and evaluation signals.

Weeks 7–10: Stabilize and prove ROI

  • Cut variance; tighten retrieval scopes; cap context window size.

  • Cache frequent queries.

  • Share weekly cycle-time, accuracy, and error reductions with leadership.
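"Cache frequent queries" can be as simple as a small least-recently-used cache in front of the retrieval and model call. The sketch below is one possible shape, with assumptions called out: `AnswerCache` and `slow_answer` are hypothetical names, queries are normalized only by case and whitespace, and the corpus version is part of the key so that answers are not reused across knowledge-base releases.

```python
from collections import OrderedDict
from typing import Callable


class AnswerCache:
    """Small LRU cache keyed on normalized query text plus corpus version."""

    def __init__(self, max_entries: int = 256) -> None:
        self.max_entries = max_entries
        self._entries: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query: str, corpus_version: str) -> tuple[str, str]:
        # Collapse case and whitespace so trivial rephrasings share an entry.
        return (" ".join(query.lower().split()), corpus_version)

    def get_or_compute(self, query: str, corpus_version: str,
                       compute: Callable[[str], str]) -> str:
        key = self._key(query, corpus_version)
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)      # mark as most recently used
            return self._entries[key]
        self.misses += 1
        value = compute(query)
        self._entries[key] = value
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # evict least recently used
        return value


calls = []


def slow_answer(query: str) -> str:
    calls.append(query)    # stands in for retrieval plus a model call
    return f"answer to: {' '.join(query.lower().split())}"


cache = AnswerCache(max_entries=2)
first = cache.get_or_compute("How do I reset MFA?", "2025.03", slow_answer)
second = cache.get_or_compute("  how do i reset  mfa? ", "2025.03", slow_answer)
third = cache.get_or_compute("How do I reset MFA?", "2025.04", slow_answer)
print(cache.hits, cache.misses)
```

Keying on the corpus version trades some hit rate for safety: a new knowledge-base release starts with a cold cache instead of serving stale answers.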

Weeks 11–13: Productize

  • Replace chat with one-click actions inside existing tools (ticket sidebar, CRM panel, email plug-in).

  • Add admin controls for scopes, thresholds, approvals, and audits.

  • Run your first controlled model upgrade and document the impact end-to-end.

The Takeaway

Evans’ decks follow a pattern: platforms win by changing how work gets done, not by adding more screens.

Translate that to AI:

  • LLMs are workflow compressors, not another department.

  • Use them to remove steps across your SaaS estate—not to create a new island.

  • Measure behavior change and business outcomes, not tokens.

  • Quietly stack “boring” advantages: scoped knowledge, evals, guardrails, and model optionality.

The companies that start now won’t just have better tooling in three years. They’ll have better workflows, cleaner data, and an organisation that already knows how to ship AI safely—while everyone else is still rewriting their first pilot.

References

https://www.ben-evans.com/presentations
