Every government technology leader knows a story about a project that accidentally wasted a billion dollars. The details vary, but the arc is familiar.

A critical system needs modernization. To reflect the urgency, Congress and OMB allocate a large budget and assign it—along with a modernization mandate—to an agency. The agency appoints a team, often composed of well-meaning but non-technical program managers, contracting specialists, and analysts, who are often focused on many simultaneous projects. 

The process begins with procurement: most of the budget is awarded to a vendor through a large contract. Agencies often bundle all the work into a single contract and fast-track the process by attaching it to an existing contract, limiting bids, or streamlining evaluations. Once selected, the vendor spends six months or more “gathering requirements,” typically without a clear understanding of how the legacy system works, who uses it, what systems rely on it, or what “modernization” even means. Documentation is sparse, there are no automated tests, and there are no clear outcomes or measurable goals.

The team hasn’t audited current functionality to determine what’s still needed, what’s outdated, or what policy logic has become obsolete. Nor have they identified key assumptions and planned to thoroughly test these assumptions with users. (These assumptions get turned into business logic that is often hard-coded and rarely questioned—which turns into what we call “policy cruft.”) They also typically haven’t refined the underlying business processes. The tech landscape may have evolved, but instead of redesigning based on user needs or new possibilities created by tech ecosystem innovations, teams typically aim to replicate the old system, feature for feature, in new frameworks. There’s little incentive for government agencies to simplify or cut scope because the Impoundment Control Act prevents them from easily returning or reallocating unused funds — and instead of being rewarded for saving money or streamlining, agencies are sometimes penalized.

Lacking clear direction, the vendor does what any rational team would do: they hedge. They try to predict future needs and produce a massive, all-encompassing requirements document, often hundreds of pages long, designed to minimize change orders later. They expect priorities to shift with new leadership, so they try to plan for everything. The agency signs off on this requirements document, and the vendor embarks on some subset of the work they identified. The vendor continues work for months or years, often independently of the agency. The agency contract specialist performs “contract oversight,” which typically means checking off a list of items that the vendor proposed and the agency wrote into the contract. Technical staff, if they exist at all, are rarely involved. Agencies often lack engineers and product managers who can help align goals and control scope, and most contracting models don’t prioritize technical oversight.

Years pass, the vendor bills the agency for hundreds of millions of dollars, and the agency continues to pay—because Congress and OMB already decided this problem was worth hundreds of millions. But as time passes and milestones accumulate, it becomes harder to say what “done” looks like.

Eventually, often under pressure from OMB or Congress, the agency asks to see progress. The vendor scrambles to demo what they’ve built—usually a small subset of the original requirements. If the agency prioritized functionality, it might be the right subset; if not, it’s whatever the vendor chose. Often, this is the first time real users see the new system. Real user testing—years in—exposes flawed or outdated assumptions and reveals what actually is needed and what actually works. 

The feedback is often negative. The vendor adjusts course, which can mean re-architecting systems already built around outdated assumptions. Change requests balloon. So do costs. Scope expands with each new insight and each new stakeholder. Projects drag on for years, inching past the billion-dollar mark.

Eventually, the agency hits a budget cliff. They launch an imperfect system under pressure. More users interact with it. More feedback rolls in. The list of bugs and missing features grows. And the budget grows with it.

This billion-dollar modernization failure story plays out repeatedly in governments and large organizations. The combination of a large budget, urgency, lack of prioritization, and limited technical expertise not only drives regular project failures, it also inflates costs, turning work that should cost millions into billion-dollar efforts. In aggregate, I believe this pattern has cost the U.S. federal government trillions and will continue to do so until the root causes are addressed.

The solution isn’t more pressure, more budget, firing people, or indiscriminate contract cancellations. The solution is to get the public-private implementation relationship right. This means streamlining and improving how government and vendors work together, reforming budget and procurement processes, and embedding tech industry best practices like agile and human-centered design into planning, solicitation, and contract management.

Case study: The Social Security Administration’s call center

In 2017, the Social Security Administration (SSA) faced mounting problems with its aging, complex telephone system and long wait times. To address this, SSA launched the Next Generation Telephony Project (NGTP) – a large modernization effort to unify all phones under one system. A team of program managers and contracting specialists drafted a solicitation and technical requirements, incorporating staff input but not user feedback. The competitive bid went to Verizon, whose win was contested — the first of many delays.

Lacking in-house telecom expertise, SSA’s team didn’t realize the solution Verizon proposed, reinforced by SSA’s own contract requirements, was based on architectural components that were a generation behind leading on-premises systems. Nobody challenged the assumption that a new solution should be on-premises, or built upon custom hardware and software. NGTP’s 10-year planning horizon meant any solution would be obsolete before full deployment.

By 2020, with the project still in early development, the COVID-19 pandemic forced SSA call center agents to work remotely — a capability the existing legacy system lacked. Verizon scrambled to assemble a custom stopgap solution, but this was plagued with issues. From May 2021 to December 2022, over 40 service disruptions caused dropped calls, long wait times, and outages. At times, up to 80% of calls went unanswered as the team capped incoming calls to maintain system stability. 

Meanwhile, NGTP suffered further delays and technical hurdles. SSA executives were frustrated but assumed they were contractually stuck. The system finally launched in December 2023 for the 800-number only, delivering just part of the promised functionality. The cost: over $400 million for ~5,000 agents — several times higher than comparable private-sector implementations. Ongoing costs included $50+ million annually for connectivity, support, and maintenance, with operational changes requiring a cumbersome Verizon change-order process.

While SSA and Verizon spent 5 years struggling with the NGTP project’s bugs, scaling issues, and outdated, fragile architecture, the private sector was steadily shifting to cloud-based Contact Center as a Service (CCaaS) platforms, which were cheaper, scalable, and more resilient. By late 2023, CCaaS platforms were a $5B market, offering cloud-based, feature-rich, and flexible solutions. CCaaS enterprise deployments typically cost private sector companies $250K–$2M to set up and $5M–$25M annually for 5,000-agent operations. These off-the-shelf solutions came with battle-tested reliability, CRM integrations, and pricing based on usage. Had SSA or Verizon prioritized technical oversight and challenged assumptions embedded in the NGTP project planning process, SSA might have spent orders of magnitude less for a more effective solution.  
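The gap can be made concrete with some simple arithmetic on the figures quoted above. The sketch below is illustrative only. It treats “over $400 million” and “$50+ million” as lower bounds, uses the round figure of 5,000 agents, and assumes the NGTP build cost can be set against a CCaaS setup cost, which may not be a strict like-for-like comparison. The variable and function names are invented for this example.

```python
# Figures quoted in the case study (USD). "Over" and "+" amounts are
# treated as lower bounds, so the ratios below are conservative.
NGTP_BUILD_COST = 400_000_000
NGTP_ANNUAL_COST = 50_000_000
AGENTS = 5_000

# Quoted private-sector CCaaS ranges for a 5,000-agent operation: (low, high).
CCAAS_SETUP = (250_000, 2_000_000)
CCAAS_ANNUAL = (5_000_000, 25_000_000)


def per_agent(cost, agents=AGENTS):
    """Cost divided evenly across agents."""
    return cost / agents


def ratio_range(cost, low, high):
    """Return (smallest, largest) multiple that `cost` is of the low-high range."""
    return cost / high, cost / low


build_ratio = ratio_range(NGTP_BUILD_COST, *CCAAS_SETUP)
annual_ratio = ratio_range(NGTP_ANNUAL_COST, *CCAAS_ANNUAL)

print(f"NGTP build cost per agent: ${per_agent(NGTP_BUILD_COST):,.0f}")
print(
    "CCaaS setup cost per agent: "
    f"${per_agent(CCAAS_SETUP[0]):,.0f} to ${per_agent(CCAAS_SETUP[1]):,.0f}"
)
print(f"Build vs. setup: {build_ratio[0]:,.0f}x to {build_ratio[1]:,.0f}x")
print(f"Annual vs. annual: {annual_ratio[0]:,.0f}x to {annual_ratio[1]:,.0f}x")
```

Under these assumptions, NGTP works out to roughly $80,000 per agent to build, which is 200 to 1,600 times the quoted CCaaS setup range, while its annual costs run 2 to 10 times the quoted CCaaS annual range.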

This case highlights the risks of large modernization projects undertaken without sufficient technical expertise. When significant budgets meet urgency and limited in-house capability, costs and risks balloon. The takeaway isn’t to fault dedicated civil servants, but to recognize that different challenges require different skills — and that better alignment of expertise with key decisions is crucial to delivering scalable, effective public services.

Even in tight budget environments, placing technical experts where critical implementation choices are made can avoid nine-figure costs for seven-figure solutions.

More case studies

The U.S. Air Force terminated its Expeditionary Combat Support System (ECSS) in 2012 after investing $1.03 billion since 2005. The ERP project aimed to replace over 200 legacy systems but failed to deliver significant military capabilities. The Air Force canceled the program after concluding that completing it would require an additional $1.1 billion for only a quarter of the original scope, with deployment not expected until 2020.

Launched in 2001, the FBI’s Virtual Case File (VCF) was an attempt to modernize its ancient, paper-based case management system and give agents a digital, secure way to open, manage, and share case files across the Bureau. By 2005, after more than $170 million spent and four years in development, only a fraction of the system worked, and what did work was riddled with security vulnerabilities and functional gaps. Sentinel, begun in 2005, was the FBI’s effort to replace the failed VCF system and modernize how it managed case files. When the project formally launched in 2006, it was estimated to cost around $425 million and be delivered in four years. It ended up costing over $451 million and took six years to fully deploy. A 2010 pivot to agile methods and bringing work in-house allowed the FBI to finally deliver a scaled-down version in 2012.

Deloitte and other vendors built Medicaid eligibility and enrollment systems for at least 25 U.S. states, with total spending exceeding $6 billion. While intended to streamline program administration, many states reported persistent problems, including eligibility errors, system inflexibility, and costly delays. In Georgia, 35 system change requests in 2023 were projected to take over 104,000 hours. In Texas, a Federal Trade Commission complaint alleged Deloitte’s software wrongly cut hundreds of thousands from Medicaid. A federal lawsuit alleges that Florida’s Deloitte-run computer system incorrectly cut off Medicaid coverage for new mothers, despite their eligibility for continuous 12-month coverage, due to technical issues. Alaska faced issues with eligibility errors, wrongful benefit terminations, and slow, expensive fixes. In September 2023, federal officials announced nearly 500,000 people — many of them children — would regain Medicaid or CHIP coverage after 30 states were found to have improperly vetted household eligibility post-pandemic, often disenrolling children when parents failed to return forms. These problems have fueled growing scrutiny and legal challenges across multiple states.

The “Better FAFSA” project is the U.S. Department of Education’s initiative to simplify and modernize the Free Application for Federal Student Aid (FAFSA) process. Mandated by the FAFSA Simplification Act of 2020 and the FUTURE Act of 2019, the project aimed to make the application more accessible and user-friendly, reducing the number of questions, streamlining income data transfer from the IRS, and expanding Pell Grant eligibility. Despite more than $300 million in spending, the rollout faced numerous challenges. The project was slow to move from the supposedly simpler policy into implementation, and implementation was then rushed, extremely technically complex, and poorly coordinated. Shifting requirements, tight deadlines, and limited user testing led to delays, glitches, and data errors that disrupted financial aid processing for millions of students. To address the issues, the Department extended application deadlines, implemented system fixes, and manually corrected errors. Despite these measures, the disruption led to a drop in FAFSA submissions, especially among low-income applicants.

The U.S. does not have a monopoly on technology procurement anti-patterns. The UK spent over £10 billion on a national health IT transformation that was described by parliamentarians as ‘one of the worst fiascos ever.’ Canada spent at least $4 billion on Phoenix, a public sector payroll system refresh where, 8 years after implementation, about a third of employees are still reporting payroll errors. Another payroll system refresh in Queensland, Australia, started with a $6 million IBM contract and ended up with a $1.25 billion bill before being described as ‘the worst failure of public administration in this nation.’

What do all of these examples have in common? 

These projects involved critical, highly scaled systems (many of them benefits systems) built many years ago with large upfront budgets but minimal maintenance funding. Over time, the systems became increasingly fragile and harder to manage, until so many users and stakeholders complained that government leaders declared an emergency to “modernize” the system. This is a common government funding pattern: high initial investment, chronic underfunding of maintenance, and eventual emergency-level spending on “modernization.” It’s a hallmark of project-based, rather than product-based, funding, as described by Jen Pahlka in the article “Project vs Product Funding.”

Each project suffered from overly ambitious scope, shifting requirements, and a “big bang” delivery approach reinforced by congressional budgeting. Procurement involved complex technical decisions but lacked agency expertise to manage costs, integrate systems, and control complexity. With minimal input from users — citizens or staff — the systems often failed to meet real needs, scaled poorly, and saw low adoption. Agency leaders struggled to grasp the technical challenges and support the work effectively, while Congress held unrealistic expectations that risks could be eliminated through large, upfront investments.

Most projects lacked product managers. In the private sector, product management is a well-defined career track, typically with one product manager per engineering team. Product managers align business strategy, user needs, and technical delivery, setting priorities and focusing efforts on real outcomes. But U.S. government agencies employ very few product managers, so this work is often left to non-technical staff such as contract officers, handed to governance committees, or neglected entirely. Contract oversight emphasizes compliance over outcomes, and committees often struggle to make concrete decisions within constraints, leading to bureaucratic gridlock and a lack of ownership over delivery.

Strategic decisions and prioritization were too heavily outsourced. While public-private partnerships are essential, agencies must retain the ability to set scope, prioritize, align work to goals, and manage complexity. Vendors operate under private-sector incentives and cannot fully internalize public-sector priorities. When agencies deal with unclear project goals by deferring all key decisions to vendors, costs often balloon to match arbitrary budget ceilings set by Congress or OMB.

Over time, vendor lock-in becomes inevitable. Agencies award large, sometimes sole-source contracts to major IT vendors, leading to custom, monolithic systems that are hard to update or replace—driving costs ever higher.

Solutions

The U.S. federal government is not alone in struggling with large implementation and modernization projects. Large organizations everywhere, public and private, face similar challenges, even in environments with easier access to technical talent, more flexible budgets, and fewer regulatory hurdles. Across sectors, organizations struggle to get fair prices for building products, migrating systems, and supporting IT infrastructure.

However, proven best practices exist to reduce risk, control costs, and improve outcomes. Adopting them could save the government billions on modernization efforts.

1. Leaders must treat technology as central to strategy.

Technology underpins virtually all government services. Yet it’s still often treated as secondary to policy or too in the weeds for senior leaders to engage with. Bad technology strategy choices can lead projects not just to fail, but to fail and cost billions. Government must include technical strategy in planning and budgeting processes, and executive branch and legislative branch leaders must be accountable for implementation. Tech leaders should be at the table for policy decisions, and senior officials must insist on clear, plain-language explanations from agency tech teams and vendors to make confident, informed choices.

2. Fund technology as ongoing products, not one-time projects.

Congress and OMB should treat technology investments like products with evolving needs, not fixed, one-time purchases. Software constantly changes with user needs, policy shifts, and tech advances. The outdated federal model of underfunding maintenance and overfunding emergency “modernizations” drives up costs and delivers poor results. In a rapidly advancing tech market, $100M buys far more today than it did a decade ago. Budgeting and appropriations processes should acknowledge that buying or building technology is not like purchasing helicopters or office supplies, and should fund technology capacity under a product model rather than a project model so that government gets the most for its money.

3. Agencies should build in-house technical capacity and reform IT procurement.

Modern tech projects require cross-functional, skilled teams managing flexible contracts with sustained funding. Agencies need technical experts involved in both procurement and implementation. Traditional compliance-driven contract oversight processes work for buying goods — not for building complex IT systems. Agencies should adopt proven private-sector practices: agile development, iterative budgeting, iterative delivery, human-centered design, and empowered product management. 

4. Implementation leads should relentlessly manage and minimize scope.

Government overspends on technology because scope can balloon at many points during procurement and implementation management. Rigorously controlling scope is how costs and risk are kept in check.

To better control scope:

  • Instead of long, upfront “requirements gathering” processes after procurement decisions, set clear, outcomes-based goals before procurement. Contracts should define the problem to solve (e.g., “reduce claim processing time by 30%”) — not just list technology features. Contract management should include regular feedback loops to track whether goals are being met, not just whether tasks are completed.
  • Empower and incentivize agency tech staff to research solution spaces and identify where savings and improvements are possible.
  • Conduct early, frequent user research to test assumptions before they harden into expensive implementation plans.
  • Assess system migration costs before rebuilding, mapping data flows, dependencies, and business processes to reduce surprises.
  • Explore architectural improvements early — avoid “lift and shift” projects in favor of modularization, APIs, off-the-shelf tools, and microservices where appropriate.
  • Deliver in small, working increments, using agile or iterative approaches with pilot launches before national rollouts.
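The first item in the list above, an outcomes-based goal checked through regular feedback loops, can be sketched in a few lines. This is a minimal illustration rather than a prescribed tool: the `OutcomeGoal` class, the 40-day baseline, and the checkpoint measurements are all hypothetical values chosen for the example. Only the “reduce claim processing time by 30%” goal comes from the text.

```python
from dataclasses import dataclass


@dataclass
class OutcomeGoal:
    """A contract outcome expressed as a measurable reduction from a baseline."""

    name: str
    baseline: float          # measured before work starts (hypothetical: days per claim)
    target_reduction: float  # fraction, e.g. 0.30 for "reduce by 30%"

    @property
    def target_value(self) -> float:
        """The measurement at or below which the goal counts as met."""
        return self.baseline * (1 - self.target_reduction)

    def reduction_achieved(self, measured: float) -> float:
        """Fractional reduction from the baseline for a given measurement."""
        return (self.baseline - measured) / self.baseline

    def is_met(self, measured: float) -> bool:
        return measured <= self.target_value


# Hypothetical goal and hypothetical measurements taken after each increment.
goal = OutcomeGoal("claim processing time (days)", baseline=40.0, target_reduction=0.30)
measurements = {"pilot": 38.0, "increment 2": 33.0, "increment 3": 27.0}

for checkpoint, measured in measurements.items():
    pct = goal.reduction_achieved(measured) * 100
    status = "met" if goal.is_met(measured) else "not yet met"
    print(f"{checkpoint}: {measured:.0f} days ({pct:.1f}% reduction), goal {status}")
```

The point of the design is that each checkpoint reports progress against the outcome itself, so a review can see whether the goal is being approached even when every contracted task has been marked complete.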

Government must invest in internal technical leadership. The agencies best equipped to deliver modern, affordable, and effective services are those with in-house teams that understand both the technology and the mission — and can lead, not just manage, vendor work. A few million spent on technical capacity can save billions in implementation costs.

Calls for reform

Many best practices for effective technology implementation conflict with how the U.S. federal government is structured. Congress and OMB often signal a project’s priority by assigning it a larger budget. But more money doesn’t always mean better outcomes. For example, if SSA’s call center could be modernized for $2M or $20M, what value does a $200M appropriation add? Still, would any agency say no to $200M? Budgeting processes should reward agencies for finding modern, efficient solutions and saving money rather than rushing to spend by year’s end.

Congress and OMB should reform budgeting to incentivize outcomes, not just spending. Funding ongoing technology products and capabilities, not one-time projects, would better support effective delivery. In the meantime, executive branch leaders can still choose to prioritize value and outcomes over budget maximization, even within the current constraints.

Success is possible

Despite these challenges, sometimes governments get modernization right. 

VA.gov was able to drastically improve the veteran experience via successful modernization work. Before modernization, veterans faced a fragmented, confusing digital experience when accessing VA services online. Beginning in 2013, VA built a Digital Service team to redesign its digital services around a human-centered, service-based strategy. They consolidated scattered websites into a single, user-friendly VA.gov platform, using agile methods, real veteran feedback, and incremental system replacements. The project was delivered at a fraction of the cost of typical federal IT efforts, relying on small, agile government teams and a few carefully chosen vendors — avoiding the usual large-system integrator bloat. As a result, 92% of veterans now report a positive experience with VA.gov, and online benefit applications have surged.

Universal Credit (UC) was the UK’s largest welfare reform since 1948, combining six benefits into one program. Initially, it struggled: after three years, £425 million had been spent, five program leaders had come and gone, and not a single claimant had been successfully served. The program was successfully turned around by abandoning a rigid, top-down delivery model and replacing it with a small, cross-disciplinary team that took a test-and-learn, outcomes-first approach. By the time COVID-19 hit, UC was serving over five million UK households. The system absorbed a massive surge in demand without breaking, thanks to its flexible, user-centered design and adaptable policy model, a testament to the power of building public services through continuous iteration and direct user feedback. Read more about this case study in Jen Pahlka and Andrew Greenway’s State Capacity paper.

In 2024, SSA moved to a CCaaS platform, successfully launching in just 48 days. The new system immediately improved 800-number performance and set the stage for future wins. Digital transformation experts at the agency followed proven best practices: developing a service and technology strategy based on customer and agent feedback, appointing a small, dedicated product team with a clear product lead, and piloting an Amazon Connect solution in a live test environment before scaling. Where in-house expertise was lacking, they brought in targeted engineering and procurement support, with a plan to build internal capacity over time.

Summary

The persistent pattern of billion-dollar technology modernization failures in government stems not from a lack of good intentions, but from structural misalignments in incentives, expertise, and decision-making authority. When large budgets meet urgency, limited in-house technical capacity, and rigid, compliance-driven procurement processes, projects become over-scoped and detached from the needs of users and mission outcomes. This undermines service delivery, wastes taxpayer dollars, and adds unnecessary risk to critical systems supporting national security and public safety.

We know what causes failure, we know what works, and we’ve proven it before. It isn’t easy and shortcuts don’t work — but success is entirely achievable, and that should be the expectation. The solution is not simply to spend more, or cancel contracts, or fire people, but to fundamentally rethink how public institutions build and manage technology, and rethink how public-private partnerships are structured. Government services underpinned by technology should be funded as ongoing capabilities rather than one-time investments, IT procurement processes should embed experienced technical leadership where key decisions are made, and all implementation projects should adopt iterative, outcomes-driven approaches. 

Proven examples—from VA.gov to SSA’s recent CCaaS success—show that when governments align incentives, prioritize real user needs, and invest in internal capacity, they can build services faster, for less money, and with dramatically better results. We can have a lean, user-focused government that delivers on its promises to the public.

Thank you to the following people for being interviewed for this article: Betsy Beaumon, Andrew Greenway, Mina Hsiang, Luke Farrell, Marina Nitze, Waldo Jaquith, Pooja Shaw, and Jonathan Mayer.