McKinsey State of AI 2025: Why Agents Stall in Pilots

SemanticOS Team · 7 min read

TL;DR: The McKinsey state of AI 2025 survey shows agentic scaling is real but narrow: 23 percent of organizations are scaling an AI agent in at least one function, and agents lead in IT and knowledge management. Yet nearly two-thirds of companies still have not scaled AI enterprise-wide, and only 39 percent report any EBIT impact. The pattern points to a missing layer. Agents work in pilots where data is curated, then stall in production because they cannot reach connected institutional knowledge.

Agents demo well. In a scoped pilot, with a clean data set and a narrow task, an AI agent looks ready for the whole company. Then it meets the real organization: a dozen tools, stale documents, and answers that live in three people’s heads. That gap between the pilot and the floor is the story of the McKinsey state of AI 2025 report, and it is where agentic scaling keeps breaking down.

This post pulls the relevant numbers from McKinsey’s latest global survey, explains why IT and knowledge management got there first, and makes the case that the bottleneck is knowledge infrastructure, not model quality.

What does the McKinsey state of AI 2025 survey actually show?

McKinsey’s 2025 global survey collected responses from 1,993 participants across 105 nations between late June and late July 2025 (McKinsey, 2025). A few findings frame everything else.

Use is now nearly universal, but shallow. Eighty-eight percent of respondents report regular AI use in at least one business function, up from 78 percent a year earlier (McKinsey, 2025). Scaling is the exception, not the rule: roughly one-third of organizations have begun to scale AI across the enterprise, which means nearly two-thirds have not (McKinsey, 2025).

The financial picture is sober. Only 39 percent of respondents attribute any EBIT impact to AI, and most of those say AI accounts for less than 5 percent of EBIT (McKinsey, 2025). Use is broad. Bottom-line impact is narrow.

Agentic AI here means systems built on foundation models that can act in the real world, planning and executing multiple steps in a workflow rather than answering a single prompt. On that front, curiosity runs high: 62 percent of respondents say their organizations are at least experimenting with AI agents (McKinsey, 2025).

How far has agentic scaling really gone?

Far enough to matter, not far enough to be normal.

  • 23 percent of respondents say their organizations are scaling an agentic AI system somewhere in the enterprise (McKinsey, 2025).
  • A further 39 percent say they have begun experimenting with agents (McKinsey, 2025).
  • Most of those who are scaling agents do so in only one or two functions, and in any single function no more than 10 percent of respondents report scaling agents (McKinsey, 2025).

So the headline 23 percent is thinner than it sounds. A company that runs an agent in its IT service desk counts, even if every other function is untouched. Agentic scaling, in practice, means one beachhead, not a transformed enterprise.
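
The arithmetic behind those figures is easy to check. The short Python sketch below uses only the percentages quoted in this post; the variable names are illustrative, and treating everyone outside the two groups as "not yet started" is an assumption, since that remainder may also include respondents who did not know.

```python
# Shares of respondents, as quoted in this post (McKinsey, 2025).
scaling_agents = 23        # scaling an agentic AI system somewhere
experimenting_agents = 39  # experimenting, but not yet scaling

# The two groups together give the "at least experimenting" figure.
at_least_experimenting = scaling_agents + experimenting_agents

# Assumption: everyone else is neither scaling nor experimenting.
not_yet_started = 100 - at_least_experimenting

# Among organizations that have touched agents at all, the share that scale.
scaling_share_of_active = scaling_agents / at_least_experimenting

print(at_least_experimenting)             # 62
print(not_yet_started)                    # 38
print(round(scaling_share_of_active, 2))  # 0.37
```

Read that way, a little over a third of the organizations working with agents have moved past experimentation, and most of those in only one or two functions.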

Why did IT and knowledge management get there first?

McKinsey found agent use is most common in IT and knowledge management, with use cases such as service-desk management in IT and deep research in knowledge management developing quickly (McKinsey, 2025). That is not a coincidence. These two functions share a trait that makes agents work: their core job is already retrieving and acting on existing knowledge.

An IT service-desk agent resets a password, checks a ticket history, or follows a runbook. A knowledge management agent reads across documents and returns a synthesized answer. Both succeed when the underlying knowledge is findable and trustworthy. Where that knowledge is connected, the agent has something to stand on.

McKinsey’s eight years of research add a telling detail: knowledge management was, for the first time, among the functions with the most reported AI use, alongside the long-standing leaders in IT and marketing and sales (McKinsey, 2025). The functions whose entire purpose is finding and reusing information are the ones agents took to first. That tells you what agents are actually hungry for.

Why do agents stall when they leave the pilot?

A pilot is a friendly environment. Someone hand-picked the documents, cleaned the data, and scoped the task. The agent looks reliable because the knowledge it needs is sitting right there, curated.

Production is hostile by comparison. The same agent now has to find an answer that might be spread across a wiki, a ticketing system, a CRM, a contract repository, and a Slack thread from eight months ago. None of those tools shares context with the others. The agent either retrieves from one silo and misses the rest, or it stitches together fragments and gets the answer wrong.
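
The single-silo failure mode can be illustrated with a toy retrieval function. This is a minimal sketch, not a description of any real agent framework: the `Record` type, the tool names, and the sample policy text are all hypothetical and exist only to show how the answer changes with the set of sources an agent can see.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    source: str    # hypothetical tool the record lives in
    topic: str
    text: str
    updated: date  # last time the record was changed


# Invented sample data: the same topic, scattered across three tools.
RECORDS = [
    Record("wiki", "refund-policy", "Refunds within 30 days.", date(2024, 1, 10)),
    Record("ticketing", "refund-policy", "Exception: enterprise plans get 60 days.", date(2025, 3, 2)),
    Record("chat", "refund-policy", "Policy changed to 45 days for all plans.", date(2025, 6, 18)),
]


def retrieve(topic, sources):
    """Return records on a topic from the allowed sources, newest first."""
    hits = [r for r in RECORDS if r.topic == topic and r.source in sources]
    return sorted(hits, key=lambda r: r.updated, reverse=True)


single_silo = retrieve("refund-policy", {"wiki"})
all_tools = retrieve("refund-policy", {"wiki", "ticketing", "chat"})

print(single_silo[0].text)                # Refunds within 30 days.
print(all_tools[0].text)                  # Policy changed to 45 days for all plans.
print(len(all_tools) - len(single_silo))  # 2
```

The agent limited to the wiki answers confidently from the oldest record. Nothing about the model changed between the two calls, only the set of sources it could reach.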

McKinsey’s risk data shows the consequence. Fifty-one percent of organizations using AI have seen at least one negative consequence, and nearly one-third report consequences from AI inaccuracy, the single most common problem (McKinsey, 2025). Inaccuracy is the predictable result when an agent reasons over incomplete or stale knowledge. In those cases the model is rarely the weak link. The retrieval is.

This reframes the scaling problem. The reason nearly two-thirds of organizations have not scaled AI across the enterprise is not that the agents are too dumb. It is that the knowledge the agents need is fragmented across tools that were never built to talk to each other.

What separates the companies that scale from the ones that stall?

McKinsey is direct about this. The roughly 6 percent of respondents it labels AI high performers, those attributing 5 percent or more of EBIT to AI, share a few habits. They redesign workflows rather than bolt AI onto old ones, they scale faster, and they are nearly three times more likely than peers to report having fundamentally redesigned individual workflows, one of the strongest factors tied to real business impact (McKinsey, 2025).

Two other findings bear directly on knowledge. High performers are more likely to have defined processes for determining when model outputs need human validation to ensure accuracy (McKinsey, 2025). And McKinsey’s broader practice set, drawn from more than 200 at-scale AI transformations, counts technology and data among the six dimensions essential to capturing value (McKinsey, 2025). The implication is that companies that scale agents tend to have already done the unglamorous work of making knowledge accessible and verifiable.

A concrete example: Vantage Health’s service-desk agent

Vantage Health, a regional health insurer, ran a clean pilot of an IT service-desk agent. On a curated set of 200 help articles, the agent resolved routine access requests well, and the pilot got a green light to expand.

In production, the cracks showed. The agent could read the IT knowledge base, but provisioning rules for the claims system lived in a separate operations wiki, security exceptions sat in a ticketing tool, and the most current onboarding steps existed only in a manager’s saved email drafts. Asked to grant a new claims analyst the right access, the agent pulled an outdated rule from the one source it could see and provisioned the wrong permissions. The team paused the rollout. The agent had not gotten worse. It had simply run out of connected knowledge.

Vantage Health’s fix was not a better model. It was a connective layer. A knowledge graph links entities such as people, systems, documents, and policies so that a single query can traverse relationships across tools instead of reading one silo at a time. Paired with AI search, that layer is the kind of unified semantic infrastructure SemanticOS provides: an operational brain that lets both people and AI agents find and reason over current institutional knowledge wherever it lives. With provisioning rules, security exceptions, and onboarding steps connected in one graph, the service-desk agent could resolve a request against accurate, current knowledge, and the rollout moved past the pilot.
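
To make the idea of traversing relationships across tools concrete, here is a minimal sketch of a knowledge graph as a typed adjacency list with a breadth-first traversal. It is an illustration under stated assumptions, not SemanticOS's or Vantage Health's implementation: the node names, relation labels, and the `reachable_documents` and `source_tool` helpers are hypothetical.

```python
from collections import deque

# Hypothetical graph: node -> list of (relation, neighbor) pairs.
# Prefixes ("role:", "system:", "doc:", "tool:") mark the entity type.
EDGES = {
    "role:claims_analyst": [
        ("needs_access_to", "system:claims"),
        ("onboarded_via", "doc:onboarding_steps"),
    ],
    "system:claims": [
        ("governed_by", "doc:provisioning_rules"),
        ("has_exception", "doc:security_exception"),
    ],
    "doc:provisioning_rules": [("stored_in", "tool:operations_wiki")],
    "doc:security_exception": [("stored_in", "tool:ticketing")],
    "doc:onboarding_steps": [("stored_in", "tool:email")],
}


def reachable_documents(start, max_hops=3):
    """Breadth-first traversal; returns document nodes within max_hops of start."""
    seen = {start}
    queue = deque([(start, 0)])
    docs = []
    while queue:
        node, hops = queue.popleft()
        if node.startswith("doc:"):
            docs.append(node)
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, neighbor in EDGES.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return docs


def source_tool(doc):
    """Return the tool a document is stored in, or None if unknown."""
    for relation, neighbor in EDGES.get(doc, []):
        if relation == "stored_in":
            return neighbor
    return None


docs = reachable_documents("role:claims_analyst")
for doc in docs:
    print(doc, "->", source_tool(doc))
```

Starting from the role, one traversal surfaces the provisioning rules, the security exception, and the onboarding steps, even though each lives in a different tool. A production system would also need freshness, permissions, and entity resolution, which this sketch leaves out.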

The lesson generalizes. The agent was never the bottleneck. The infrastructure underneath it was.

Key takeaways

  • The McKinsey state of AI 2025 survey shows agentic scaling is real but narrow: 23 percent scale an agent in at least one function, usually just one or two, while nearly two-thirds have not begun scaling AI across the enterprise (McKinsey, 2025).
  • Agents lead in IT and knowledge management because those functions already run on retrieving and acting on existing knowledge.
  • Pilots succeed on curated data; production stalls when knowledge is fragmented across disconnected tools, which shows up as the inaccuracy reported by nearly one-third of organizations using AI.
  • Only 39 percent of respondents report any EBIT impact from AI, and high performers stand out for redesigning workflows and investing in data and technology infrastructure.
  • The missing piece in pilot-to-production is connective knowledge infrastructure, a knowledge graph plus AI search, that gives agents accurate, current institutional knowledge to reason over.

Frequently asked questions

What does the McKinsey state of AI 2025 survey say about agentic scaling?

The McKinsey state of AI 2025 survey found that 23 percent of organizations are scaling an agentic AI system in at least one business function, while a further 39 percent are experimenting. Most that scale agents do so in only one or two functions, so adoption is real but not yet widespread.

Which business functions use AI agents the most?

McKinsey reports that AI agent use is most common in IT and knowledge management, with use cases such as service-desk management in IT and deep research in knowledge management developing quickly.

Why do most enterprise AI pilots fail to scale?

Nearly two-thirds of organizations in the McKinsey survey have not begun scaling AI across the enterprise, and only 39 percent report any EBIT impact. A common blocker is that agents lack reliable access to connected institutional knowledge once they leave a controlled pilot.

What is knowledge infrastructure for AI agents?

Knowledge infrastructure is the connective layer, often a knowledge graph plus AI search, that links entities across an organization's tools so AI agents can find and reason over accurate, current institutional knowledge instead of querying isolated systems.

How many organizations report enterprise-level financial impact from AI?

In the McKinsey state of AI 2025 survey, only 39 percent of respondents attribute any EBIT impact to AI, and most of those say less than 5 percent of EBIT is attributable to AI use.
