In 2024 every consulting firm became an AI consultancy. By 2026, the market has filtered enough buyers that the question is no longer "do you do AI" — it is "have you ever shipped a production AI workload that your customer is still running, and can you name them". This piece is a working read on what an honest AI-led enterprise development engagement looks like across the three markets where most of our cross-border work happens — the UK, the EU, and the USA — and where buyers should push back hard on the standard consultancy playbook.
Why "AI consultancy" is the wrong frame
AI on its own does not deliver value. AI inside a working enterprise workflow does. The engagements that ship are not "AI projects" — they are enterprise software projects where AI is the implementation choice for one or more specific decisions inside a larger automated workflow. The teams that succeed are enterprise software teams that happen to use AI. The teams that fail are AI teams that have never shipped enterprise software.
This sounds like a semantic distinction. It is not. It is the dividing line between a 16-week engagement that goes live and a 16-month engagement that becomes a slide deck.
The shape of an honest 2026 engagement
A working AI-led enterprise development engagement runs in five fixed-fee phases, regardless of region. Discovery (3–5 weeks, fixed-fee) covers workshops with the operational, technical, finance, and compliance leads. The output is an SRS, a model-and-deployment architecture, an integration scope, an evaluation harness, and a fixed-fee Build estimate. Either side can walk at the end of Discovery with no further commitment. Build (8–14 weeks, milestone-fixed) ships the workflow with the AI components wired in. Integrate (3–5 weeks) connects to the CRM, ERP, HRIS, payment, identity, and any vertical systems. Pilot (3–4 weeks) runs the workflow live with a constrained user set. Operate is either an ongoing Care Plan or the operator runs it themselves after a 90-day exit window.
The engagement shape is the same in London, Berlin, Boston, and across our wider delivery footprint. What changes is the compliance overlay and the cost band.
The cost bands, plainly
For a real enterprise development engagement that ships AI in production in 2026, the rough cost bands are:
- UK / EU mid-market (50–500 person operator): Discovery £15k–£40k. Build £80k–£250k. Integrate £20k–£60k. Pilot £15k–£40k. First year of Operate (Care+ tier) £40k–£120k.
- UK / EU enterprise (500–5,000 person operator): Discovery £40k–£90k. Build £200k–£600k. Integrate £50k–£150k. Pilot £30k–£80k. First year of Operate (Enterprise Care) £150k–£400k.
- USA mid-market (50–500 person operator): Discovery $25k–$60k. Build $120k–$400k. Integrate $35k–$100k. Pilot $25k–$65k. First year of Operate $60k–$180k.
- USA enterprise (500+ person operator): Discovery $60k–$150k. Build $300k–$1M+. Integrate $80k–$250k. Pilot $50k–$130k. First year of Operate $250k–$650k.
These are real bands from real engagements. If you are getting AI-shop pitches well outside these — substantially cheaper, with the same scope, from a team you cannot reference — that is a red flag.
What the staffing model actually looks like
The team shape that ships is small and senior. A typical engagement carries one engineering lead (senior staff or principal level), two to four full-stack engineers, one AI/ML engineer, one integration engineer, and a part-time product manager. No army of consultants. No "AI strategist" outside of the Discovery phase. The total team is usually 5–7 people from kickoff to go-live, with the same team end-to-end so context does not get dropped between phases.
What does not work: the Big-4 "100 people on the bench" model where you talk to the partner and senior managers in pitch, then a rotating bench of junior developers actually delivers the work. This model is widely available in 2026 and widely buys nothing. AI-shaped workloads in particular are unforgiving of weak engineers; the model will silently hallucinate, the prompts will silently degrade, the evaluation harness will silently break — and a junior engineer will not catch it because they do not know what they are looking at.
On-premises AI as the deployment default
The other procurement question that has moved in 2026 is where the model actually runs. The standard 2023 answer was "OpenAI API, billed per token, the model runs in someone else's data centre". That answer is increasingly hard to defend for any operator with a sovereign-data clause, a sector regulator that cares about data residency, or a board that has noticed how much per-token billing scales when the workflow goes live.
The 2026 default for serious enterprise AI engagements is on-premises inference on open-weight models — Llama 3.x, Mistral, Mixtral, Qwen, DeepSeek — served via vLLM, Ollama, or TGI on the operator's own hardware. RAG against the operator's own knowledge base, with embeddings sitting in a vector store inside the operator's perimeter. The model never sees the public internet; the operator's data never leaves. Per-token cost flattens; the cap-ex on inference hardware pays back inside a year for any workload at meaningful volume.
This is not the right answer for every workload. Some workloads genuinely benefit from frontier-model capability that on-prem cannot match yet. But the right default has shifted, and any 2026 engagement that does not at least consider the on-prem option has skipped a real procurement question.
The regional compliance overlay
UK engagements run under UK GDPR and the FCA / PRA frameworks where relevant for financial services. EU engagements add GDPR, NIS2 (in force across member states in 2026), and the EU AI Act (in transitional enforcement — relevant for any model touching credit, hiring, education, biometrics, or public services). USA engagements run under the relevant federal frameworks (HIPAA for health, FFIEC for banking, FedRAMP for federal civilian, GLBA for finance) plus the state-level patchwork (California CCPA / CPRA, Colorado CPA, the emerging state AI laws).
For cross-border engagements — a UK-headquartered operator shipping to EU and US customers, a typical Zeour configuration — the architecture has to satisfy the strictest of the three. We usually start from a GDPR + EU AI Act baseline and add the FFIEC / HIPAA controls where they apply. Designing for the loosest market first and trying to retrofit is a common buyer mistake that costs three months and two re-architectures.
What buyers should refuse to sign
Three contract terms that show up in standard AI-consultancy paper that buyers should push back on. First, IP ownership clauses that leave training-data IP with the consultancy — non-starter; the operator owns the data, the embeddings, the fine-tunes, and the deployment. Second, "AI Centre of Excellence" retainers that bill for unspecified future capacity at consultancy rates — instead, negotiate a fixed-fee Care Plan tier with named engineers, billed quarterly, cancellable. Third, exclusivity clauses that prevent the operator from running other AI vendors — also a non-starter; AI is a multi-vendor world and a healthy operator will run model providers from at least two stables.
What we have shipped in 2026
In Q1 and Q2 2026 we have shipped AI-led enterprise development engagements across UK healthcare and retail, EU manufacturing and logistics, USA mid-market finance, and across GCC and MENA healthcare and government. Same engineering team across all of them, same fixed-fee phased model, same staffing shape. The compliance overlay shifts. The architecture does not.
If you are running a 2026 procurement for an AI-led enterprise development engagement and want a no-pitch scoping conversation with the engineers who would actually deliver it — that is what the first call is for. Most replies go out within one business day.
