Pilot failure

Introduction

Government AI pilots often look impressive in controlled demonstrations and then quietly stall when they meet the realities of public administration. A chatbot can summarise documents in a test environment, an algorithm can detect fraud in a limited dataset, or a generative AI assistant can help civil servants draft reports. But turning a successful demo into dependable public capability is far harder than proving that a model can perform one task.

The problem is not simply technical immaturity. Public-sector AI projects frequently collide with fragmented data systems, rigid procurement rules, legal accountability requirements, weak internal technical capacity, and workflows designed long before machine learning existed. The result is what many officials call “pilot purgatory”: endless experimentation without durable institutional change. The National Audit Office, parliamentary committees, OECD reviews, and public-sector researchers all describe versions of the same pattern: governments are learning how to run pilots faster than they are learning how to absorb AI into real services. [OECD] [National Audit Office (NAO)]

This matters well beyond administrative efficiency. If advanced AI is ever to support scientific acceleration, better healthcare systems, climate coordination, resilient infrastructure, or broader human flourishing, governments will need the capacity to use intelligence reliably at scale. Societies cannot reach an “AI bloom” future through isolated prototypes alone.

The difference between a demo and a public capability

A successful pilot proves that an AI system can work somewhere, under some conditions, for some users. A public capability means something much larger: reliability, accountability, integration, continuity, funding, legal compliance, maintenance, staff training, and public trust over years rather than weeks.

That distinction explains why demonstrations routinely mislead decision-makers.

In a pilot, teams often use carefully prepared data, enthusiastic staff, temporary workarounds, and narrow objectives. Edge cases are excluded. Human oversight is unusually intense. Political attention is high. Vendors may provide direct support that would be impossible at national scale. Once deployment expands, those protections disappear.

The hardest part of public-sector AI is usually not the model itself but the surrounding institutional environment. The UK National Audit Office found that outdated technology systems, fragmented data, and shortages of skilled personnel were major barriers to government AI adoption. [National Audit Office (NAO)] The UK government’s own review of digital capability similarly reported that fragmented and poorly coordinated data infrastructure was holding back AI and advanced analytics across departments. [GOV.UK]

This is why many public-sector AI systems remain trapped in low-risk administrative assistance roles. Summarising documents or helping staff search internal records is easier than integrating AI into benefits systems, healthcare coordination, tax administration, or planning enforcement, where mistakes can affect rights, money, and legal outcomes.

The gap between prototype and institution also reflects different standards of failure. Consumer software companies can tolerate occasional errors or rapid iteration. Governments often cannot. A mistaken recommendation in a music app is inconvenient; a mistaken denial of welfare support or immigration status can be catastrophic. Public systems therefore require audit trails, appeals processes, transparency obligations, and legal defensibility that many AI tools were not originally designed to provide.

Legacy systems turn AI into an infrastructure problem

Many governments still run critical services on decades-old infrastructure. AI systems are therefore being layered onto institutions whose data structures were never designed for interoperability or machine learning.

This creates several interconnected problems.

Fragmented and low-quality data

AI systems depend on consistent, traceable, well-labelled data. Governments often possess enormous quantities of information but in incompatible formats spread across separate departments, contractors, and databases.

The UK government’s digital capability review found that only a minority of officials believed existing infrastructure provided a unified operational view. [GOV.UK] OECD analyses similarly warn that weak data governance keeps many AI initiatives trapped at the pilot stage because systems cannot reliably share or validate information across agencies. [OECD]

Public records are especially difficult because they are shaped by decades of legal and bureaucratic evolution rather than technical consistency. Different agencies may use incompatible identifiers, inconsistent terminology, or incomplete historical records. AI models can expose these weaknesses rapidly.

This is one reason pilots sometimes look better than production deployments. Trial systems are often tested on unusually clean datasets assembled specifically for the experiment. Real government environments contain missing entries, contradictory records, scanning errors, legacy formats, and sensitive information governed by strict privacy rules.
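The gap between a curated trial dataset and live records can be made concrete with a small audit. The following is a minimal sketch in Python, not a method drawn from any of the cited reviews: the function name `audit_records`, the field names, and the sample records are all hypothetical, chosen only to illustrate counting missing entries and spotting contradictory records that share an identifier.

```python
from collections import defaultdict


def audit_records(records, id_field, required_fields):
    """Count missing values per field and list identifiers whose
    records disagree with each other on a required field."""
    missing = {field: 0 for field in required_fields}
    values_by_key = defaultdict(set)

    for record in records:
        record_id = record.get(id_field)
        for field in required_fields:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
            elif record_id is not None:
                # Normalise case and whitespace so "Open " and "open"
                # are not reported as a contradiction.
                normalised = str(value).strip().lower()
                values_by_key[(record_id, field)].add(normalised)

    conflicting_ids = sorted(
        {rid for (rid, _field), values in values_by_key.items() if len(values) > 1}
    )
    return missing, conflicting_ids


# Hypothetical case records of the kind a live system might hold.
records = [
    {"case_id": "A1", "postcode": "AB1 2CD", "status": "open"},
    {"case_id": "A1", "postcode": "AB1 2CD", "status": "closed"},
    {"case_id": "B7", "postcode": "", "status": "open"},
    {"case_id": "C3", "postcode": None, "status": "Open "},
]

missing, conflicting_ids = audit_records(records, "case_id", ["postcode", "status"])
```

Running a check like this on both the pilot dataset and a sample of production records is one way to see, before deployment, how much cleaner the trial data was than the environment the system will actually face.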

Old infrastructure creates hidden integration costs

A pilot may operate as a standalone tool. Real deployment requires integration into existing workflows, databases, procurement systems, identity systems, cybersecurity standards, and archival rules.

That integration work is expensive and politically invisible. Ministers and senior officials are often rewarded for announcing innovation rather than rebuilding underlying infrastructure. Yet modernising data architecture may matter more than the AI layer itself.

Researchers studying public-sector AI procurement in US cities found that decades-old procurement norms and technical structures heavily shape which AI systems governments can realistically adopt and govern. [arXiv] In practice, institutions frequently discover that deploying AI safely requires redesigning surrounding systems first.

This is why apparently modest capabilities can take years to operationalise. The obstacle is rarely that the model cannot generate useful outputs. The obstacle is that government systems cannot absorb those outputs reliably.

Procurement rules were built for roads and paperwork, not adaptive AI

Public procurement systems were largely designed for predictable infrastructure and long-term contracts, not rapidly evolving software models.

That mismatch creates a structural problem for AI adoption.

Traditional procurement favours detailed specifications, fixed deliverables, long approval cycles, and low tolerance for uncertainty. AI systems evolve quickly, depend on changing data environments, and often require iterative deployment. By the time a procurement process finishes, the original technical assumptions may already be outdated.

The result is a paradox. Governments are expected to move cautiously because they spend public money and must ensure fairness. But excessive rigidity can lock institutions into obsolete tools or prevent learning entirely.

The parliamentary Public Accounts Committee warned in 2025 that UK government departments were running many disconnected AI pilots without strong mechanisms for sharing lessons or scaling successful approaches. [UK Parliament] This fragmentation increases duplication, procurement complexity, and dependence on vendors.

Vendor dependence is itself a major concern. Many governments lack enough in-house technical expertise to evaluate claims made by AI providers. That creates risks of “black box” procurement in which officials buy systems they cannot fully audit, maintain, or adapt.

Academic studies on AI procurement governance have highlighted recurring tensions between innovation pressure and meaningful accountability. Checklist-style governance processes may appear rigorous while failing to ensure that officials truly understand model limitations, bias risks, or operational failure modes. [arXiv]

Governments also face procurement incentives that favour launching pilots over sustaining systems. Short-term innovation funds are easier to secure politically than long-term maintenance budgets. This encourages experimentation without institutional commitment.

Staffing problems are often more important than model quality

Many governments do not have enough experienced AI engineers, data architects, cybersecurity specialists, procurement experts, or technically literate managers to support large-scale deployment.

The National Audit Office reported that difficulties recruiting and retaining AI-skilled staff were among the most common barriers to adoption in UK government. [National Audit Office (NAO)] Public-sector pay structures frequently struggle to compete with private technology firms, especially for experienced machine-learning specialists.

But the problem goes beyond hiring elite researchers.

Public institutions often lack enough mid-level operational expertise: people who understand both policy processes and technical systems. Successful deployment depends heavily on translators who can connect legal obligations, frontline workflows, and engineering realities.

Without these bridging capabilities, organisations frequently deploy tools that optimise the wrong metrics or fail to fit real administrative work. RAND researchers identified this mismatch between technical optimisation and organisational workflow as one of the leading causes of AI project failure. [RAND Corporation]

Training gaps also create cultural resistance. Frontline workers may distrust systems imposed from above, especially if earlier digital reforms increased workload or reduced autonomy. Officials asked to use AI tools without adequate training often create informal workarounds that undermine standardisation and accountability.

This matters because public capability is social as well as technical. A functioning government service depends on routines, incentives, tacit knowledge, and institutional trust. AI cannot simply be “plugged into” those systems from outside.

Maintenance and workflow integration decide what survives

The systems that survive are usually not the most futuristic. They are the ones that fit everyday institutional routines.

This is why narrow augmentation tools often outperform grand automation promises.

An AI system that helps tax investigators prioritise suspicious cases may succeed because it fits existing human workflows. A fully automated enforcement system may fail because legal appeals, edge cases, and accountability requirements become unmanageable.

Maintenance is similarly underestimated.

AI systems degrade over time. Data distributions change. Policies evolve. Laws are updated. User behaviour shifts. Models require monitoring, retraining, auditing, cybersecurity reviews, and operational support. A flashy pilot may hide these long-term obligations.
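To illustrate what monitoring for a changing data distribution can involve, here is a minimal Python sketch using the population stability index (PSI), a common drift measure. It is an assumption-laden example rather than a practice described in the sources: the function name, the channel data, and the 0.2 threshold (a widely quoted rule of thumb, not a standard) are all hypothetical.

```python
import math
from collections import Counter


def population_stability_index(baseline, current, epsilon=1e-6):
    """PSI between two samples of a categorical field.
    Zero means identical category shares; larger values mean more drift."""
    categories = set(baseline) | set(current)
    base_counts = Counter(baseline)
    curr_counts = Counter(current)

    psi = 0.0
    for category in categories:
        # Share of records in this category; epsilon avoids log(0)
        # when a category is absent from one of the samples.
        p = max(base_counts[category] / len(baseline), epsilon)
        q = max(curr_counts[category] / len(current), epsilon)
        psi += (q - p) * math.log(q / p)
    return psi


# Hypothetical data: how claims arrived during the pilot versus a year later.
pilot_claims = ["paper"] * 20 + ["online"] * 80
live_claims = ["paper"] * 50 + ["online"] * 45 + ["phone"] * 5

DRIFT_THRESHOLD = 0.2  # assumed rule-of-thumb cut-off, to be tuned per service
score = population_stability_index(pilot_claims, live_claims)
needs_review = score > DRIFT_THRESHOLD
```

A check like this is cheap to run, but someone has to own it, review the alerts, and decide when to retrain, which is precisely the kind of ongoing obligation that pilot budgets tend to omit.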

Many institutions fund experimentation but not stewardship. Once pilot funding expires, systems are left without permanent teams or operational budgets. The AI tool then quietly disappears even if the underlying technology worked reasonably well.

The Ada Lovelace Institute has argued that governments still lack robust structures for evaluating what actually works in context and for systematically learning from failure. [adalovelaceinstitute.org] That weakness matters because deployment success depends heavily on organisational adaptation rather than model capability alone.

Workflow integration also determines legitimacy. Public services are not only technical systems; they are democratic systems. Citizens need ways to challenge errors, understand decisions, and access human review when automated processes fail. AI that increases administrative speed while weakening procedural fairness can trigger backlash strong enough to halt deployment entirely.

Why this matters for the larger AI future

The repeated failure of government AI pilots may seem mundane compared with speculation about superintelligence or post-scarcity economies. But these institutional bottlenecks are central to whether advanced AI ultimately expands human flourishing broadly or merely creates isolated pockets of productivity.

The optimistic vision of AI abundance assumes societies capable of coordinating large-scale transitions: modernising healthcare, accelerating scientific research, upgrading energy systems, adapting infrastructure, distributing gains widely, and maintaining democratic legitimacy under rapid technological change.

Weak public institutions could become one of the largest constraints on that future.

If governments cannot integrate relatively narrow administrative AI systems today, then managing far more powerful future systems will be harder still. Advanced AI may increase the importance of state capability rather than making institutions irrelevant. Scientific breakthroughs, automation, and intelligence abundance still need reliable governance, procurement, regulation, dispute resolution, education systems, and infrastructure planning.

The lesson from failed pilots is therefore not that AI lacks potential. It is that technological capability alone does not produce institutional transformation.

A civilisation capable of benefiting from advanced AI at scale will probably need stronger administrative foundations than many governments currently possess: interoperable data systems, technically skilled public workforces, adaptive procurement rules, transparent oversight, resilient digital infrastructure, and organisations able to learn continuously from failure rather than repeatedly restarting from isolated demos.

Endnotes

Source: publications.parliament.uk
Title: UK Parliament Use of AI in Government
Link: https://publications.parliament.uk/pa/cm5901/cmselect/cmpubacc/356/report.html
Source snippet
failed. To grasp the opportunities of AI, government must also learn from pilots and scale up the most promising examples. However, there...
Source: oecd.org
Link: https://www.oecd.org/en/publications/2025/06/governing-with-artificial-intelligence_398fa287/full-report/implementation-challenges-that-hinder-the-strategic-use-of-ai-in-government_05cfe2bb.html
Source snippet
OECD: Implementation challenges that hinder the strategic use of... (18 Sept 2025) Without strong data governance in place, governments ris...
Source: GOV.UK
Title: State of digital government review
Link: https://www.gov.uk/government/publications/state-of-digital-government-review/state-of-digital-government-review
Source snippet
This holds back AI, machine learning, and advanced analytics. Only 27% of survey respondents believe their...
Source: arxiv.org
Link: https://arxiv.org/abs/2411.04994
Source snippet
arXiv: Legacy Procurement Practices Shape How U.S. Cities Govern AI: Understanding Government Employees' Practices, Challenges, and Needs. No...
Source: arxiv.org
Link: https://arxiv.org/abs/2404.14660
Published: April 23, 2024
Source snippet
arXiv: AI Procurement Checklists: Revisiting Implementation in the Age of AI Governance...
Source: rand.org
Link: https://www.rand.org/pubs/research_reports/RRA2680-1.html
Published: August 13, 2024
Source snippet
RAND Corporation: The Root Causes of Failure for Artificial Intelligence... Second, many AI projects fail b...
Source: adalovelaceinstitute.org
Title: public sector ai
Link: https://www.adalovelaceinstitute.org/policy-briefing/public-sector-ai/
Source snippet
Learn fast and build things (14 Mar 2025) The public sector needs structures to learn from achievements and failures... AI systems are fa...
Source: oecd.org
Link: https://www.oecd.org/en/publications/2025/06/governing-with-artificial-intelligence_398fa287/full-report/how-artificial-intelligence-is-accelerating-the-digital-government-journey_d9552dc7.html
Source snippet
It situates government as a developer...
Source: oecd.org
Link: https://www.oecd.org/content/dam/oecd/en/publications/reports/2019/09/state-of-the-art-in-the-use-of-emerging-technologies-in-the-public-sector_2b6dacca/932780bc-en.pdf
Source snippet
The current wave of digital transformation in delivering policy and designing services.
Source: GOV.UK
Title: Artificial Intelligence Playbook for the UK Government (HTML)
Link: https://www.gov.uk/government/publications/ai-playbook-for-the-uk-government/artificial-intelligence-playbook-for-the-uk-government-html
Source snippet
system errors or failures. Seek legal advice to determine whether your AI project aligns with existing legislative frameworks or requires...
Source: nao.org.uk
Title: Use of artificial intelligence in government (ePub)
Link: https://www.nao.org.uk/wp-content/uploads/2024/03/ePub-Use-of-artificial-intelligence-in-government.epub
Source snippet
failure and drive continuous improvement. In piloting AI solutions for the public sector, government should expect failures and mechanism...
Source: nao.org.uk
Title: Use of artificial intelligence in government
Link: https://www.nao.org.uk/wp-content/uploads/2024/03/use-of-artificial-intelligence-in-government.pdf
Source snippet
National Audit Office (NAO): Use of artificial intelligence in government (15 Mar 2024) Our survey found that difficulties recruiting or ret...
