6 ways to make oversight work for state capacity

President Trump’s dismissal of 17 federal inspectors general en masse; high-profile IG investigations of alleged mismanagement at two Cabinet departments; a high-stakes controversy over the Government Accountability Office’s finding of Impoundment Control Act violations — over the past year and a half, the behind-the-scenes work of the federal government’s oversight entities has never been so front and center.

That makes it a good time to ask: What exactly is the purpose of this ordinarily low-key government function with the ability to shape public policy, public programs, and public opinion?

Discussion of the benefits of oversight — particularly the work done by executive branch IGs and the GAO, known as the “congressional watchdog” — often focuses on cost savings, an area in which federal oversight excels. The GAO, for instance, last year reported that it had identified $62.7 billion in cost savings to the government in FY2025, for an ROI of 68:1 — that is, for every government dollar allocated to GAO, it produced $68 in savings. Meanwhile, the 70 federal IGs collectively identified $65.6 billion in potential savings to the government in FY2025, producing a return on investment of 17:1.
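To make the arithmetic behind these ratios concrete, here is a minimal sketch in Python. It uses only the savings totals and ratios quoted above and backs out the budget each ratio would imply. The function name is hypothetical, and the implied budgets are rough derived figures, not reported appropriations.

```python
def implied_budget(savings_billions: float, roi_ratio: float) -> float:
    """Back out the budget (in $ billions) implied by savings and an ROI of roi_ratio:1."""
    if roi_ratio <= 0:
        raise ValueError("ROI ratio must be positive")
    return savings_billions / roi_ratio


# Figures quoted in the text for FY2025: (savings in $ billions, ROI ratio).
reported = {
    "GAO": (62.7, 68),
    "Federal IGs": (65.6, 17),
}

for name, (savings, ratio) in reported.items():
    budget = implied_budget(savings, ratio)
    print(f"{name}: ${savings}B at {ratio}:1 implies a budget of roughly ${budget:.2f}B")
```

Read this way, the two ratios describe similar savings totals produced from quite different budget bases, which is why the ratios diverge so sharply.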

These are impressive numbers, but the impact of oversight reverberates far beyond dollars and cents. The GAO, the IGs, and Congress’s own oversight shape incentives for agencies’ activities, communicate priorities, and signal to public servants the metrics to which they and their programs will be held accountable. Their work can profoundly shape the culture and norms of agencies and affect the federal government’s ability to serve citizens effectively.

‘Impact evaluation’

Oversight entities are especially well positioned to influence agencies for the better by providing candid, evidence-based assessments of how well agency programs are working. Oversight can essentially function as an impact evaluation team, communicating what’s working in a program and what isn’t.

But for oversight to be an effective part of the feedback loop, it must be geared toward agencies’ mission-relevant outcomes. And that’s not always the case, or even the default.

Despite the many benefits it generates for agencies and taxpayers, oversight as currently practiced often falls short of its potential to improve government performance and build state capacity. Through excessive emphasis on process compliance and fixation on failures, oversight too often pushes agencies toward unswerving proceduralism and away from productive, calculated risk taking, which in turn can keep agencies from delivering on their missions.

Reorienting oversight to better support state capacity doesn’t require that oversight practitioners stop caring about risk; it calls for rethinking what the greatest risks really are. And that means taking a longer view — not just about the risks of noncompliance today, but about the long-term risks and costs of compliance with poorly designed policies and procedures.

Thanks to their independence, their outsider’s view of programs, relentless focus on accountability, and ethos of dogged fact finding, oversight bodies have tremendous power to support state capacity, strengthen feedback loops, encourage innovation and productive deviation from conventional paths, and empower agency leaders to tear out the pages of the operating manual that don’t serve the mission. Realizing this potential requires changes to oversight practice in six key areas:

1. Engage early

Too often, oversight swoops in well past the point at which it is easy to change problematic policies or programs — so late, in fact, that oversight reports sometimes emerge years after the fact, with the damage already done. While there can be great value in longer-term, linearly structured oversight projects for more established programs, for initiatives just getting off the ground, feedback needs to come more quickly — in some instances, close to real time — to allow agencies to address problems before they metastasize. This requires more agile oversight models than what is currently the norm.

The Pandemic Response Accountability Committee (PRAC) is an exemplar. Congress created PRAC to detect fraud in and abuse of COVID-19 relief funds, and recently reauthorized the committee through 2034. The PRAC prides itself on practicing “agile oversight.” Rather than relying on highly structured oversight models, PRAC uses advanced data analytics to quickly identify and communicate risk areas to stakeholders — building, for example, predictive risk models that flag abnormal and potentially fraudulent applications for pension assistance.
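As a rough illustration of what flagging abnormal applications can look like, the sketch below swaps in a simple robust outlier rule (a modified z-score based on the median) for the predictive models described above. It is not PRAC's method. The function name, the threshold, and the application IDs and amounts are all hypothetical.

```python
from statistics import median


def flag_outliers(amounts: dict[str, float], threshold: float = 3.5) -> list[str]:
    """Return the IDs whose amount sits unusually far from the median.

    Uses the modified z-score: 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. The 3.5 cutoff is a common rule of thumb and
    is an assumption here, not a documented oversight standard.
    """
    values = list(amounts.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All amounts are (nearly) identical, so nothing stands out.
        return []
    return [
        app_id
        for app_id, value in amounts.items()
        if 0.6745 * abs(value - med) / mad > threshold
    ]


# Hypothetical application amounts in dollars, invented for illustration.
applications = {
    "A-001": 1200,
    "A-002": 1350,
    "A-003": 1280,
    "A-004": 1310,
    "A-005": 9800,
    "A-006": 1250,
}

for app_id in flag_outliers(applications):
    print(f"Review before payment: {app_id}")
```

The point of a rule like this is timing: a flag raised before payment gives reviewers a chance to look, whereas a finding issued years later can only recommend a refund.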

This kind of agile oversight helps ensure that public money is efficiently and effectively put to public use. It’s much harder to recover money already spent than it is to prevent it from being spent poorly in the first place, but a lot of oversight deals just with the former. For example, a March 2026 audit from the HHS IG reported that a contractor with the Centers for Medicare & Medicaid Services (CMS) received over $4 million in overpayments from the Medicare Advantage program eight years ago, between 2018 and 2019. Unsurprisingly, the contractor disagreed with the IG’s recommendation that it refund the money to CMS, which means taxpayers will likely never see that $4 million again. In contrast, the DOJ IG just announced it’s conducting real-time monitoring of how the Federal Bureau of Prisons is spending and planning to spend the $5 billion allocated to the agency in the One Big Beautiful Bill, and has already identified potential risk areas with the funds and made suggestions to address them.

This kind of agile oversight isn’t necessarily constrained to monitoring funds. As GAO and IGs monitor program implementation through data analytics, check-ins, observations, and demos, they should immediately flag problems to the agency as they emerge. A more detailed, retrospective evaluation or audit may follow once things are up and running, but communicating about problems early beats dinging an agency for them past the point of no return.

2. Rethink the recommendation process

As part of their oversight responsibilities, GAO and IGs translate their findings into actionable recommendations that agencies are expected, though not statutorily obligated, to implement. Agencies periodically report back to GAO or the IG on the progress of implementation, and the oversight body assesses whether their actions are responsive to the recommendation. The back and forth continues until the oversight body is satisfied and closes the recommendation. This process has potential to support mission delivery and push agencies to build better processes. A 2014 VA OIG audit, for example, found that 27 percent of calls to the VA’s hotline for homeless veterans went straight to an answering machine, with thousands never receiving follow-up contacts or services from counselors. In response to the OIG’s recommendations, the VA reportedly ditched the answering machine and rearranged staff schedules to ensure adequate coverage during peak call times.

The VA hotline is a straightforward example of oversight making concrete recommendations that are immediately actionable for the agency. But the recommendation process doesn’t always work so cleanly. Too often, recommendations prescribe new procedures or reinforce operating models that don’t work, rather than address root causes. GAO and IGs should make recommendations oriented toward producing desired outcomes, and should ask agencies for what Jen Pahlka and Andrew Greenway call “demonstrations of genuine progress” that show how a solution is working in practice, instead of just describing changes in writing. This could be done in periodic briefings or conversations rather than through the drawn-out exchange of documents.

Agencies and oversight bodies could get on the same page more quickly and work more collaboratively. More ambitious still, oversight could participate in test-and-learn loops by incorporating agency feedback and adjusting recommendations accordingly, while reserving the right to stick to their guns if they judge agencies to be misguided or half-hearted in their commitment to the process. Congress could reinforce this by calling agency heads to testify on their progress in implementing oversight recommendations, and should consider what evidence it wants to see.

3. Employ results-focused oversight models

Much of noninvestigative oversight work takes the form of audits, which must adhere to the detailed, step-by-step procedural requirements laid out in the GAO’s Generally Accepted Government Auditing Standards — aka the Yellow Book. However, IGs are also authorized to evaluate agency programs and inspect agency facilities (including prisons, detention centers, hospitals, and laboratories), and many IG offices maintain dedicated evaluations and inspections (E&I) divisions. The E&I model focuses on assessing the efficiency and effectiveness of agency programs and allows IGs more flexibility in determining project structure and methodology. While audits incline toward assessing compliance with rules and procedures, evaluations and inspections are more often geared toward assessing outcomes and on-the-ground realities.

For example, the language in the titles of these two audit reports the HHS IG released last year evinces a focus on controls and compliance:

The titles of these two HHS IG evaluation reports from last year focus on outcomes:

Executive branch IG evaluations and inspections typically follow the Quality Standards for Inspection and Evaluation (the Blue Book), which in its own words is “flexible and not overly prescriptive by design,” containing principles rather than specific procedures. Because the evaluations and inspections model gives practitioners discretion over how to conduct the work and approach problems, it is suited to test-driving an oversight approach oriented to performance and mission-relevant outcomes rather than laser-focused assessments of compliance.

Not all IGs have standalone E&I units, and the units that do exist are generally smaller than their audit counterparts. IGs should continue to invest in and expand E&I work, though doing so will likely depend on increased appropriations from Congress. The audit-focused GAO, meanwhile, should consider conducting “results audits” that assess whether a program is fulfilling its congressionally intended purpose, as the Foundation for American Innovation’s Dan Lips and Soren Dayon recommend.

Traditional audits will always be essential to ensure accountability and to identify and mitigate waste, fraud, and abuse in federal programs. And in many cases, especially those involving dollar amounts, they are the right oversight tool for the job. But advancing an oversight culture that holds agencies accountable to their mission performance and not just process compliance requires shifting from a default “audit mindset” toward results-oriented work.

4. Establish tolerance for discretion and deviation appropriate to the operating context

The types of programs and areas in the purview of oversight bodies vary widely by agency, from Veterans Affairs hospitals to spacesuits to disaster relief appropriations to the shipping and handling of baby chickens. Given the variation in type and degree of risk across these areas, oversight bodies, including Congress, must be sensitive to agencies’ operating contexts when determining the situations in which it’s acceptable for public servants to exercise their professional discretion and deviate from established procedure. In some operating contexts, deviation from established procedure can be disastrous, even deadly: for example, failing to perform required mental health screenings for veterans. In others, deviation is productive and serves the mission, as with making the FAFSA work.

Borrowing the Navy’s terminology for nuclear submarines, Pahlka and Greenway categorize these scenarios into “back of sub” and “front of sub” situations. In the “back of sub,” where nuclear missiles are stored, leaving things up to individuals’ discretion is inadvisable and potentially disastrous, but in the “front of sub,” where the steering apparatus is located, the pilots’ discretion is necessary to navigate an uncertain and unpredictable environment.

Oversight entities should be making similar determinations among the programs they oversee, and should consider working with agencies to understand when deviation and discretion are reasonable and when they’re not. GAO and IGs could consider working with the agencies they oversee to build a matrix-style framework that assesses risk and corresponding tolerance for deviation by program or program area.
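One way to picture such a matrix is the minimal Python sketch below. Everything in it is an assumption made for illustration: the two rating dimensions, the three tolerance labels, the decision rule, and the example ratings are hypothetical, and a real framework would be negotiated between the oversight body and the agency.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ProgramArea:
    name: str
    harm_if_wrong: Level  # severity of harm if established procedure is skipped
    uncertainty: Level    # how unpredictable the operating environment is


def deviation_tolerance(area: ProgramArea) -> str:
    """Map a program area's ratings to a tolerance for discretion and deviation."""
    if area.harm_if_wrong == Level.HIGH:
        # "Back of sub": procedure is the safeguard, so deviation is a finding.
        return "strict"
    if area.uncertainty >= Level.MEDIUM:
        # "Front of sub": discretion is needed to navigate the environment.
        return "flexible"
    return "moderate"


# Hypothetical ratings for two examples mentioned in the text.
areas = [
    ProgramArea("Veteran mental health screenings", Level.HIGH, Level.LOW),
    ProgramArea("Student aid form delivery", Level.LOW, Level.HIGH),
]

for area in areas:
    print(f"{area.name}: {deviation_tolerance(area)}")
```

Putting harm first in the rule reflects the submarine analogy: where the stakes are highest, discretion gives way to procedure no matter how uncertain the environment is.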

5. Report on the right things

A 2019 report on oversight from the Bipartisan Policy Center (BPC) includes a case study from the VA that illustrates how agencies’ procedures not only can impede mission delivery, but actively work against it:

A patient with a broken foot had driven himself to the hospital. Ten feet from the hospital door, he called the hospital’s main line to request assistance getting into the building and was told: No. He had to call the 911 dispatcher and pay for transport. While the hospital representative who had answered the phone did give a policy-compliant response, the hospital was forsaking the performance of its mission to follow those rules. (p.15)

Failures like this should push oversight bodies, including Congress, to be on the lookout not just for deviation from policy, but for where policy works at cross-purposes with mission. Particularly within the more flexible evaluations and inspections framework, oversight practitioners have a certain degree of latitude over what objectives they focus on and what findings they choose to report, and the BPC report recommends that they spotlight those most relevant for mission performance. Agencies should

choose the things to report that are most relevant to what actions agencies can take to improve an outcome in a mission space. If a metric does not help an agency implement, move forward, or deliver mission, then oversight bodies should not be requiring that metric or they risk actually being an obstacle in an agency’s path. (p.11)

In particular, oversight practitioners should consider two questions when encountering noncompliance with a policy or procedure. First, they should consider the nature of the policy and degree to which it is binding. Is it a statutory requirement, agency policy, guidance from the Office of Personnel Management, or simply a best practice? Not every instance of deviation rises to the level of a finding.

Second, they should consider whether the policies and procedures themselves might be the problem. Oversight bodies have a responsibility to ensure rules are being followed, and they should not automatically excuse deviation on the basis of outcome. But it may be the case that competent and motivated public servants have found a way to better fulfill an agency’s mission, and the rules should change to catch up. Public servants, when possible, should have autonomy to design and implement solutions. When oversight bodies, or even implementers in the agencies themselves, find places where existing procedures do not work, those procedures should be fixed, including by Congress updating the law if necessary.

6. Call out successes and best practices

As Pahlka and Greenway note, oversight is generally preoccupied with what goes wrong in agencies and programs, which is inherent in its mandate. Effectively identifying and mitigating waste, fraud, and abuse necessitates some amount of negativity bias. The issue is that “fire alarm oversight,” as Biden administration government performance expert Loren DeJonge Schulman terms it, rarely provides a positive vision for what it would look like for agencies to get it right. It’s important to identify the ways in which government can and does fail, but there’s no rule that says GAO and IGs must only identify failures.

Former DOJ Inspector General Glenn Fine, in his “Seven Principles of Highly Effective Inspectors General,” lists Principle 3 as, “Tell the good with the bad.” Fine writes:

If the agency program is doing well, then we need to say that in our reports with equal prominence to our discussions of problems. … Our credibility depends on us following the facts wherever they lead — not only down a one-way street of negative findings.

Oversight bodies rightly prize their independence and don’t want to be seen as cheerleaders for the agencies they oversee, but proper concern for objectivity should involve noting best practices and success stories when merited. As Pahlka and Greenway note, this includes Congress, which can call agency leaders before committees not to berate them, but to spotlight examples of wise risk taking and bold innovation.

Oversight’s indispensability

Oversight is indispensable to good governance. It saves money, promotes transparency, and ensures accountability. And it’s possible to reap these benefits without discouraging healthy risk taking, overfixating on the negative, or making recommendations that only complicate public servants’ abilities to do their jobs. These problems are bugs, not features, of oversight, which has great potential to close the feedback loop and strengthen state capacity.
