The Pyramid Bundles Two Things That Are Not the Same
Consulting expertise and consulting labor are distinct economic goods. The pyramid has bundled them together for so long that most firms have stopped noticing the difference. AI is forcing the unbundling.
Part of the Phase II — Understanding series
By Michael E. Ruiz
The consulting pyramid treats expertise and production as if they were a single service delivered at different seniority levels. They are not. They are two distinct economic activities with different cost structures, different value drivers, and different relationships to the client outcome. The pyramid binds them together because the historical delivery model required it. Understanding what happens when that binding loosens is the key to understanding what is actually at stake.
The Bundling
A managing director structures a problem, interprets findings, and advises the client on what to do. An analyst researches the market, builds the model, and drafts the deliverable. In the traditional engagement model, both activities appear on the same statement of work, are priced in hours, and are delivered by the same team. The client experiences them as a single service.
But the economic characteristics of these two activities are fundamentally different. The managing director's work is high-variance, context-dependent, and tied to the specific client situation. It cannot be standardized or made repeatable. Its value is proportional to the consequence of the decision it supports — which means it should, in principle, be priced against the value at stake rather than the time consumed.
The analyst's work is lower-variance, more procedural, and increasingly substitutable. Market research follows recognizable patterns. Financial models are built from templates and assumptions that can be structured in advance. Synthesis documents follow a logic that can be specified. This work has always been essential to the engagement, but its economic character is closer to production than to judgment.
The pyramid bundles these together and prices them on a single axis: hours multiplied by rate. The result is a structural pricing distortion. Senior judgment is systematically underpriced relative to the value it creates, because the rate card does not reflect the consequence of the decision it supports. Junior production is systematically overpriced relative to its standalone value, because the hours carry the fixed costs of the labor pyramid that employs the analyst.
What Clients Are Actually Paying For
Clients engage consulting firms for outcomes: a strategy they can execute, a transformation roadmap they can operationalize, a risk assessment they can act on. The willingness to pay is anchored to the consequence of the decision — the size of the market being entered, the cost of the infrastructure being built, the magnitude of the risk being managed. A $200,000 strategy engagement supporting a $500 million capital decision is not expensive. It is appropriately proportioned to what is at stake.
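The proportion in that example can be checked directly. The short sketch below uses only the two figures from the paragraph above; the variable names are illustrative.

```python
# Figures from the example above.
engagement_fee = 200_000        # strategy engagement fee, USD
decision_value = 500_000_000    # capital decision the engagement supports, USD

# Fee expressed as a fraction of the value at stake.
fee_share = engagement_fee / decision_value

print(f"Fee as a share of the decision: {fee_share:.2%}")
```

The fee works out to four hundredths of one percent of the capital at stake, which is the sense in which the engagement is proportioned to the decision rather than expensive.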
But time-and-materials (T&M) billing does not connect the fee to the outcome. It connects the fee to the effort. The client pays for how many hours it took to produce the analysis, not for the quality or consequence of the recommendation. This is why clients push back on team size and utilization rather than on the value of the advice: the billing model directs their attention to the input, not the output.
Fixed-price engagements come closer to aligning with outcomes because the client pays for a deliverable rather than an effort level. Historically, this has been constrained to situations where production effort is predictable — assessments, audits, standard methodologies. The constraint was never conceptual. Firms understood that outcome-aligned pricing was preferable. The constraint was operational: production variability made it too risky to absorb delivery uncertainty at scale.
What Changes When Production Becomes Predictable
AI does not simply reduce the cost of production work. It reduces the variability.
When an AI system handles market research, the effort required does not fluctuate based on how many analysts are available, how steep the learning curve is on a new sector, or how many iterations the draft goes through before it reaches acceptable quality. The output is more consistent. The timeline is more predictable. The delivery risk that made T&M necessary for complex engagements begins to compress.
This is the mechanism that matters. The argument is not that AI makes consulting cheaper, although in many cases it does. The argument is that AI makes the production layer predictable enough for the firm to absorb delivery risk itself rather than passing it through to the client. That predictability is what makes fixed-price and outcome-aligned delivery models viable for a broader range of work than ever before.
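One way to see why variability matters more than average cost is to hold the average constant and change only the spread. The sketch below is purely illustrative: the fixed fee, the hourly delivery cost, and both sets of effort outcomes are hypothetical numbers chosen for the example, not figures from any engagement.

```python
from statistics import mean, pstdev

# All numbers below are HYPOTHETICAL, chosen only to illustrate the mechanism.
FIXED_FEE = 150_000      # fixed price quoted to the client, USD
COST_PER_HOUR = 200      # firm's internal delivery cost per hour, USD

# Two sets of possible production effort (hours) with the same average.
variable_effort = [300, 450, 600, 750, 900]      # wide spread
predictable_effort = [560, 580, 600, 620, 640]   # narrow spread


def margins(effort_hours):
    """Margin on the fixed fee for each possible effort outcome."""
    return [FIXED_FEE - hours * COST_PER_HOUR for hours in effort_hours]


for label, effort in [("variable", variable_effort),
                      ("predictable", predictable_effort)]:
    m = margins(effort)
    print(f"{label:>11}: mean effort {mean(effort)} h, "
          f"spread {pstdev(effort):.0f} h, "
          f"expected margin ${mean(m):,}, worst case ${min(m):,}")
```

Both scenarios have the same expected margin, but only the wide-spread scenario contains an outcome in which the firm loses money on the fixed fee. Under these assumptions, narrowing the spread is what makes the fixed price safe to quote, even with no change in average cost.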
The Economic Shift
When production effort compresses and becomes more predictable, the financial dynamics of each engagement change in ways that are not immediately obvious.
Revenue per project is likely to decline. If the production layer requires fewer hours, the T&M invoice shrinks. A project that previously generated $400,000 in fees across a team of eight over twelve weeks might generate $180,000 across a team of three over four weeks. On a revenue-per-engagement basis, this looks like contraction.
But margin per project increases as a share of the fee. The expensive resource, senior expertise, is still applied. The cheaper resource, production labor, is substantially reduced. The ratio of value-creating work to production work shifts materially in favor of value. The firm keeps a larger share of a smaller total fee as margin.
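The revenue and margin claims can be put side by side in a rough sketch. The fees, team sizes, and durations come from the example above; everything else is an assumption made for illustration: the split of each team into one senior plus production staff, the weekly cost figures, and the AI tooling cost.

```python
from dataclasses import dataclass

# ASSUMPTIONS (not from the article): internal weekly cost per person, USD.
SENIOR_COST_PER_WEEK = 6_000
PRODUCER_COST_PER_WEEK = 2_500


@dataclass
class Engagement:
    fee: int               # total fee, USD (from the example above)
    weeks: int             # duration in weeks (from the example above)
    seniors: int           # ASSUMPTION: senior staff on the team
    producers: int         # ASSUMPTION: production staff on the team
    tooling_cost: int = 0  # ASSUMPTION: AI tooling cost for the engagement

    @property
    def cost(self) -> int:
        weekly = (self.seniors * SENIOR_COST_PER_WEEK
                  + self.producers * PRODUCER_COST_PER_WEEK)
        return self.weeks * weekly + self.tooling_cost

    @property
    def margin(self) -> int:
        return self.fee - self.cost

    @property
    def margin_pct(self) -> float:
        return self.margin / self.fee


# Team of eight over twelve weeks vs. team of three over four weeks.
traditional = Engagement(fee=400_000, weeks=12, seniors=1, producers=7)
compressed = Engagement(fee=180_000, weeks=4, seniors=1, producers=2,
                        tooling_cost=10_000)

for name, e in [("traditional", traditional), ("compressed", compressed)]:
    print(f"{name:>11}: fee ${e.fee:,}, cost ${e.cost:,}, "
          f"margin ${e.margin:,} ({e.margin_pct:.1%})")
```

Under these assumed costs, the fee falls by more than half while the margin percentage more than doubles and the absolute margin stays roughly level. Different cost assumptions would move the absolute figure in either direction, so the sketch illustrates the direction of the shift rather than its size.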
The implication is that the scaling model changes. The pyramid scales through headcount: more clients require more people, more people generate more revenue. When production compresses, scaling shifts from headcount to throughput — the number of engagements a given team of experts can support simultaneously. A managing director who previously ran two engagements in parallel can run four, because the production burden per engagement is lower. The firm grows through velocity, not mass.
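The shift from headcount to throughput can be sketched with the same numbers. The parallel engagement counts come from the paragraph above, and the fees and durations reuse the earlier per-project example. Combining them is an illustration, not a forecast, and it rests on two assumptions: a 48-week working year, and enough client demand to keep every slot filled.

```python
# ASSUMPTION: working weeks available per year.
WORKING_WEEKS = 48


def annual_throughput(parallel, weeks_per_engagement, fee):
    """Engagements and revenue one managing director can support in a year.

    Assumes back-to-back engagements with no idle time between them.
    """
    cycles = WORKING_WEEKS // weeks_per_engagement   # sequential rounds per year
    engagements = parallel * cycles
    return engagements, engagements * fee


# Two 12-week engagements in parallel at $400,000 each.
trad_count, trad_revenue = annual_throughput(2, 12, 400_000)

# Four 4-week engagements in parallel at $180,000 each.
comp_count, comp_revenue = annual_throughput(4, 4, 180_000)

print(f"traditional: {trad_count} engagements, ${trad_revenue:,} per MD per year")
print(f" compressed: {comp_count} engagements, ${comp_revenue:,} per MD per year")
```

With full utilization assumed, the smaller fee per engagement is more than offset by the number of engagements each expert can carry. The sketch ignores sales capacity and client demand, which in practice would limit how much of that throughput a firm can fill.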
Where Value Is Created
The traditional consulting firm creates competitive advantage through talent. The ability to recruit, develop, and retain high-quality people has been the primary moat for decades. The pyramid serves double duty: it is both a delivery mechanism and a talent development pipeline. Junior consultants enter at the base, learn through production work, and advance into advisory roles as they accumulate experience and judgment.
If the production layer compresses, that pipeline narrows. The entry point that has historically trained the next generation of partners — immersion in research, modeling, and client delivery at scale — becomes less available. The question of where future senior talent comes from is not academic. It is an operating challenge that firms will need to address directly.
At the same time, the sources of competitive advantage begin to diversify. Talent remains important, but it is joined by other factors: the quality of the firm's AI workflows, the sophistication of its delivery orchestration, the depth of its domain-specific knowledge bases, and its ability to price around outcomes rather than effort. The firms that create value in the next decade will not be the ones with the largest pyramids. They will be the ones that assemble the right combination of expertise, infrastructure, and delivery design.
The Talent Shift
The analyst layer in the traditional pyramid serves two functions simultaneously: it produces the work, and it trains the next generation. When production compresses, the training function does not automatically survive.
The development model shifts. Fewer people enter the firm at the base. Those who do enter need different capabilities — not research and slide production, but the ability to direct AI systems, validate outputs, and operate at the intersection of technology and domain expertise. The career path narrows at the bottom and widens in the middle: fewer generalists rotating through projects, more specialists building depth in specific domains over time.
This is not a loss. It is a structural change in how expertise is developed. But it requires firms to rethink how they recruit, what they train for, and what a career in consulting looks like when the apprenticeship model built on production work is no longer the default entry point.
The Inevitability
None of this requires predictions about AI capability timelines or speculation about which firms will adapt fastest. The logic is structural.
If production becomes more predictable and more separable from judgment, the delivery model built around bundled labor — priced by the hour, scaled through headcount, structured as a pyramid — cannot hold its current form indefinitely. The economic pressure is directional and compounding. The firms that recognize the unbundling early will have time to redesign. The ones that treat AI as a productivity tool layered onto the existing model will find themselves competing against firms that have already moved to a different structure entirely.