A compositional fabric, not a stack.

The word every legal-tech vendor uses is stack. The architecture is allegedly stacked — the foundation model at the bottom, the application logic on top, a thin user interface above that. The stack metaphor is comforting because it implies a building. Buildings are stable. Buildings are inspectable. Buildings have known load-bearing elements that the engineer can point at.

The metaphor is wrong, and the wrongness is consequential. The architecture underneath a serious legal-AI product is not a stack. It is a fabric — a set of cooperating reasoning models that compose dynamically per request, with no single load-bearing element, no fixed dependency chain, and no orientation that survives the next provider change.

What a fabric is, structurally

The shape we ship inside Eve-Legal F5/reasoner is five cooperating reasoning models. A Microsoft Phi-3 classifier routes inbound requests. A Microsoft Phi-4-derived Small Reasoning Model — fine-tuned on the Eve-Genesis (Law Edition) synthetic corpus — carries the domain-specific reasoning load at near-Phi-4 cost and latency. One to three commercially available frontier models from major laboratories compose dynamically into the loop when the matter justifies the spend. A Meta Llama 4 Scout occupies the long-context slot for 10M-token single-context reasoning across an entire case file.

Five is a representative configuration, not a fixed schema. The exact composition varies per request: a routine intake conversation routes through the classifier and the legal reasoner alone; a Daubert pre-read against a 40,000-page record set activates the long-context slot and the frontier reasoning slots cross-validate. The number of active models is whatever the matter needs.

The properties that follow from "fabric"

The fabric framing earns its keep in four places.

Provider diversity. When a client or jurisdiction prohibits a specific provider — for procurement reasons, for data-residency reasons, for political reasons — that provider is swapped without rebuilding the agent. In a stacked architecture, "swap the foundation model" is a rebuild. In a fabric, it is a configuration change.

Cost calibration. The cost-per-request shape is fundamentally different. The legal reasoner runs at SRM cost and latency for the majority of inbound work. Frontier slots are activated only when the matter justifies the expense. The architecture lets a firm’s economics scale with matter complexity rather than with matter count.

Failure isolation. When a single provider has an outage — and providers have outages; that is not a hypothetical — the fabric routes around it. The classifier sees the failure, the supervisor reroutes the workload to the remaining composable slots, and the matter continues. In a stacked architecture, the provider’s outage is your outage.

Frontier delta capture. When the next-best frontier model ships, the fabric can incorporate it as a new composable slot with no rewriting of the agent. The improvement in raw model capability flows into the product without a migration.

The vendor’s incentive to call it a stack

The reason most legal-tech vendors talk about "the stack" is straightforward: their architecture is a stack. A foundation model at the bottom — usually one specific provider, often locked in for two years of cost commitments — with application logic plumbing the model into a chat interface. The stack is the entire architecture. The vendor’s commercial relationship with the provider is the bottom of the stack. The vendor’s entire product depends on the provider.

This is fine when the provider is the right one and the model is the right one and the cost is the right one and the legal-residency posture is the right one. In any other condition, the customer is locked to the vendor’s provider choice. The architectural decision the vendor made two years ago is the customer’s architectural decision today.

A fabric does not have this problem because it does not have a bottom.

The diligence question

The procurement-grade version of "is your architecture a stack or a fabric" translates to four direct questions a buyer can ask any legal-AI vendor on the next call:

Which foundation model is your product running on today? Which models could it run on if the answer to the first question stopped being viable? What is the rebuild cost — in engineering time, in customer downtime, in re-attested compliance evidence — to swap providers? What is the per-request cost shape, and does it scale with matter complexity or with matter count?

The vendor whose answers are "we run on provider X, we could not realistically run on anything else, swapping would be a multi-quarter rebuild, and our cost shape is flat per matter" is selling a stack. That is a defensible commercial position. It is not the architecture that serves a litigation practice for the next ten years.

The vendor whose answers are "we compose Phi-3, Phi-4, two frontier slots, and a long-context model; the composition changes per request; providers can be removed from the loop with a configuration change; cost scales with complexity, not count" is selling a fabric. That is the architecture that lasts.