Trust, state capacity, and the epidemiological mystery of Covid

In February, the British medical journal The Lancet published an article addressing the central epidemiological mystery of COVID-19: Why have many rich countries that looked well-prepared for the pandemic been devastated by it, while some poor countries that seemed bound for disaster have suffered far less than expected? The article drew widespread attention for its emphasis on trust as a clue to the mystery. To put it simply, it found that where people trust one another and their governments, fewer people get sick and die. The paper also looked at the role played by state capacity and the quality of governance in shaping pandemic response, but found only small effects. Those factors received less attention both in the paper itself and in the public response.

This commentary confirms that trust played an important role in the response to the COVID pandemic but it finds a larger role for state capacity. It extends the analysis of the February Lancet article, which will be referred to in what follows as Lancet (1). In doing so, it uses data from another article published in the same journal in March, which will be referred to as Lancet (2). The second Lancet article is devoted to the accuracy of COVID data, especially the underreporting of deaths in many low-income countries. Underreporting was not entirely ignored by Lancet (1), but the problem appears to have been even more serious than previously thought. Finally, the data from both Lancet studies are supplemented with data from my previous work on state capacity. The result is a more complex interpretation of the causal linkage between trust and governance in shaping pandemic performance.

Findings of Lancet (1)

The primary aim of Lancet (1) was to explain international variations in the rates of COVID-19 cases per 1,000 of population (population case rates) and fatalities per 1,000 cases (case fatality rates). The study proceeded in two stages. Stage 1 began with case and fatality rates in each country drawn from other sources. Those data had been adjusted for accuracy to some degree, but the Lancet authors made further adjustments. Particular attention was given to sources of variation that do not depend directly on national policies. For example, differences in seasonal disease patterns, population density, altitude, previous exposure to bat viruses, and GDP per capita accounted for about 16 percent of variations in population case rates. For case fatality rates, the age-structure of the population accounted for 47 percent of cross-country differences, with factors such as smoking, air pollution, and body mass index explaining another 8 percent. The output of Stage 1 was a set of adjusted case and fatality rates from which these sources of variation had been removed. All references to the Lancet (1) case and fatality rates in this commentary are to the resulting adjusted rates.

Stage 2 of the Lancet (1) study then tested the adjusted case and fatality rates against a number of control variables. The first set of controls consisted of measures of pandemic preparedness and healthcare-system capacity from two major ex-ante studies. Somewhat surprisingly, neither of those was found to have any statistically significant associations with case or fatality rates, a subject that will be revisited later in this commentary.

The second group of controls related to issues of governance, referred to broadly as state capacity, including measures of electoral democracy, government effectiveness, electoral populism, government corruption, and state fragility. The only significant finding from this group was that case rates tended to be lower where governments were less corrupt.

Finally, a third set of controls focused on social indicators, including income inequality (as measured by a Gini index), trust in government, interpersonal trust, and trust in science. Interpersonal trust was measured as the percentage of people responding “Most people can be trusted” to the question “Generally speaking, would you say most people can be trusted, or you can’t be too careful?” Here the Lancet (1) study hit pay dirt. Interpersonal trust was found to have a strong negative association with case rates. Political trust (“Do you have confidence in national government?”) was found to have a smaller but still statistically significant association with case rates. No association with case fatality rates was found for any of the social indicators.

The authors of the Lancet (1) study interpreted their results as consistent with previous research that has found an association between trust and compliance with public-health guidance, from mask-wearing to vaccination. That interpretation drew considerable attention. For example, writing in the New York Times, Ezra Klein lamented that no program of pandemic response will work “if the social context in which those policies play out continues to deteriorate.” He found it no wonder that a distrustful and divided America was hit so hard by the pandemic. CNN’s Fareed Zakaria drew the conclusion that in an emergency, a lack of trust and social capital can cost hundreds of thousands of lives.

Confirmation of the Lancet findings

In the course of previous research, I have assembled a data set that overlaps with the areas explored in the Lancet (1) study, based in large part on indicators provided annually by the Legatum Institute.^[1] My data cover 152 countries that overlap with the 178 countries considered by the Lancet (1) study. This section reports the degree to which the Legatum data and other indicators in my collection confirm the results of Lancet (1). (The main data series used in this commentary can be found in an Online Supplement, abbreviated hereafter as OLS.)

The first step was to examine simple correlations between Lancet (1)’s adjusted population case rates and case fatality rates, on the one hand, and five other variables that I have used previously, on the other: scores for trust in government, interpersonal trust, government integrity (the inverse of corruption), a Gini index, and the natural log of gross national income per capita (GNI).^[2] The population case rate was found to have significant negative simple correlations with interpersonal trust and government integrity and a significant positive correlation with the Gini index (that is, more inequality associated with higher case rates), but no significant correlations with trust in government or GNI.^[3] The case fatality rate (COVID deaths relative to confirmed cases) had no significant correlation with any of the other variables. I also looked at the population fatality rate (COVID deaths per 1,000 of population) implied by the population case rate and case fatality rate. The implied population fatality rate had a significant negative correlation with interpersonal trust and a significant positive correlation with the Gini index.
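
As a rough illustration of what a "simple correlation" and its significance check involve, the sketch below computes Pearson's r and the associated t-statistic for two short series. The numbers are hypothetical standardized scores, not values from the Lancet or Legatum data, and the function name `pearson_with_t` is made up for this example. The 2.58 threshold is the large-sample (normal) approximation for a two-sided test at the 0.01 level, which is reasonable for a sample of roughly 150 countries but not for the six-point toy series used here.

```python
import math

def pearson_with_t(x, y):
    """Return (r, t): Pearson's r and the t-statistic for 'no linear association'."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # t has n - 2 degrees of freedom; undefined if r is exactly +1 or -1.
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return r, t

# Hypothetical standardized scores for six countries (illustrative only).
interpersonal_trust = [-1.2, -0.5, 0.0, 0.4, 0.9, 1.5]
case_rate = [1.0, 0.7, 0.1, -0.2, -0.6, -1.1]

r, t = pearson_with_t(interpersonal_trust, case_rate)
# With a large sample, |t| > 2.58 roughly corresponds to the 0.01 level.
print(f"r = {r:.3f}, t = {t:.2f}")
```

A negative r here mirrors the direction of the association reported above: higher trust goes with lower case rates.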

Multiple regressions using the same five control variables (trust in government, interpersonal trust, government integrity, the Gini index, and GNI) offer a further perspective on the key relationships. As measured by the standard test of multiple correlation (the so-called R²), the five controls were found to jointly explain 32 percent of the variation among countries in population case rates. The individual regression coefficients for interpersonal trust and government integrity were negative and statistically significant, a finding that is consistent with the Lancet (1) findings. The coefficient for GNI was statistically significant and positive in sign, indicating a tendency for the Lancet (1) case rates to be higher in countries with higher per capita incomes. (For detailed results, see OLS Tab 1.)

On the whole, my findings tend to confirm the results of the Lancet (1) study. Countries with higher levels of interpersonal trust and government integrity appear to have weathered the effects of the COVID-19 pandemic significantly more successfully than those with lower levels. My initial results did not confirm Lancet (1)’s finding of a significant association between the case rate and trust in government, but later sections will return to that variable.

COVID and state capacity

The authors of Lancet (1) by no means ignore the possible effects of variables such as corruption and government effectiveness that are related to the broader concept of state capacity. However, as discussed above, Lancet (1) found those variables to have relatively weak effects. This section extends the discussion of state capacity with additional data and a new structural model.

Other writers, too, have seen strong state capacity as important to successful pandemic response. For example, Brink Lindsey begins a recent essay for the Niskanen Center, “State Capacity: What Is It, How We Lost It, And How To Get It Back,” by noting that “[t]he experience of the COVID-19 pandemic… has challenged the easy association of rich countries with high state capacity… The United States and western Europe failed to contain, much less suppress, the virus with public health measures, while a number of poorer countries in East Asia performed much better.”

Lindsey’s views on the relationship between state capacity and COVID performance are supported by empirical evidence. Numerous measures of state capacity have been proposed. In what follows, I use an index based on indicators of a government’s ability to carry out its priorities (administrative capacity), to protect from crime and maintain peace (security capacity), and to manage government finances (fiscal capacity). That index is described in detail in a previous commentary.^[4] Just adding the state capacity index to the table of simple correlations discussed in the previous section does not yield much of interest. Lancet (1)’s adjusted case rate has no statistically significant bilateral association with state capacity or with any of its three components. However, closer inspection suggests some relationships that are not picked up by simple correlations. Consider, for example, the scatterplot of state capacity against Lancet (1)’s adjusted case rate that is shown in Figure 1.^[5] Although there is no statistically significant linear association between the variables, the arched trendline in Figure 1 suggests a nonlinear quadratic relationship worthy of further investigation.

“Suggests” is a keyword here. It is notoriously easy to find some kind of curve to fit a scatterplot of almost any data, just as it is easy for the human eye to see a face in almost any cloud. Any such apparent pattern needs to be carefully evaluated before drawing inferences. Is the relationship in question statistically significant? Is it robust relative to different specifications of the variables in question? And, most importantly, can the statistical pattern be connected to a credible structural model of causal relationships?

The issue of statistical significance is addressed by a multiple regression that uses the adjusted case rate as the response variable and both the state capacity index and the square of the state capacity index as controls. Such a regression explains a statistically significant portion of the variation in case rates among countries. The regression coefficient for the square of state capacity, which is responsible for the arch in the trendline, is negative and significant. The same is true when the Lancet (1) case rate is replaced as the response variable by reported population fatality rates from the Lancet (2) study (to be discussed in more detail below). Removing outliers such as Qatar, Russia, and Iraq (the highest dots in Figure 1) does not erase the statistical significance of the regression, nor does adding GNI as an additional control. (See OLS Tab 2 for details.)
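
To make the specification concrete, here is a minimal sketch of a regression of a case rate on state capacity and its square. It uses made-up data built around an arch, not the data in OLS Tab 2, and a small hand-written least-squares helper (`ols`, a hypothetical name) so that it runs without any statistics package. It only recovers the coefficients; judging statistical significance would also require standard errors, which a dedicated regression library would normally supply.

```python
def ols(rows, y):
    """Least-squares coefficients for y regressed on the columns of rows.

    Each row must already contain a leading 1.0 for the intercept.
    """
    k = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    a = [
        [sum(row[i] * row[j] for row in rows) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for idx in range(k):
            if idx != col:
                factor = a[idx][col] / a[col][col]
                a[idx] = [v - factor * p for v, p in zip(a[idx], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical standardized state capacity scores for nine countries.
capacity = [-2.0 + 0.5 * i for i in range(9)]
# Made-up case rates: an arch plus a small alternating disturbance.
case_rate = [
    0.3 + 0.1 * s - 0.4 * s * s + 0.05 * (-1) ** i
    for i, s in enumerate(capacity)
]

design = [[1.0, s, s * s] for s in capacity]  # intercept, capacity, capacity squared
b0, b1, b2 = ols(design, case_rate)
turning_point = -b1 / (2.0 * b2)  # capacity level at which the fitted arch peaks
print(f"intercept={b0:.3f}, linear={b1:.3f}, squared={b2:.3f}, peak at s={turning_point:.2f}")
```

A negative coefficient on the squared term is what produces an arched, rather than U-shaped, trendline.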

Furthermore, the observed nonlinear pattern is not just a fluke arising from the specific way that state capacity is measured. A similar quadratic trendline is observed for each of the three major components of state capacity (administrative, security, and fiscal). The pattern also holds for several governance indicators, also based on Legatum data, that are not included in the state capacity index, including rule of law, quality of market institutions, personal freedoms, and procedural democracy. Finally, it holds for the alternative state capacity index compiled by Hanson and Sigman. (See footnote 4, above.)

These findings support and extend the conclusion of Lancet (1) that governance matters. They also support Lindsey’s assessment that the United States has little to brag about in its pandemic performance. Overall, the United States ranks 14th for state capacity. All higher-ranking countries have lower case rates. It is necessary to go all the way down the state capacity rankings to Qatar, ranked 28th, to find a country with a higher case rate.

But far from resolving the epidemiological mystery of COVID-19, these findings only replace the original mystery with a new one: If state capacity and other aspects of governance support successful pandemic response, as Lindsey and others maintain, case rates and fatality rates would be expected to fall monotonically with higher capacity. Instead, starting with the lowest-capacity countries, case and fatality rates at first increase with state capacity. The expected inverse relationship holds only among countries that already have above-average capacity. The next section reports on the search for a credible causal model that is consistent with the statistical results reported here.

Unraveling the epidemiological mystery

One credible explanation for the new version of the epidemiological mystery lies in the interactions between two types of factors that are linked both to state capacity and to pandemic performance. Some of those factors facilitate pandemic response as countries’ state capacity becomes stronger. These can be called facilitative factors. In contrast, other factors are associated with lower actual or apparent case or fatality rates in countries with weaker state capacity. These will be called contravening factors since they act in a way that is contrary to the widespread intuition that countries with greater state capacity should be better able to deal with pandemics.

Figure 2 uses a hypothetical numerical example to show how such a two-factor model could work.

Part a of Figure 2 shows a facilitative factor that causes case or fatality rates to decrease as state capacity increases and a contravening factor that causes observed or actual rates to increase as state capacity increases. Overall case or fatality rates are modeled as the sum of the two factors plus an interactive effect equal to their product.^[6] Figure 2b shows modeled case rates for 100 hypothetical countries when random errors are added and the values of the variables are standardized. The result is a chart that, while based on purely hypothetical data, bears a striking resemblance to the Lancet (1) data on population case rates shown in Figure 1. (See OLS Tab 3 for details of the example.)
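
A hypothetical example of this kind can be sketched in a few lines. The functional forms below (a facilitative contribution that falls linearly with capacity and a contravening contribution that rises linearly) and the noise level are assumptions chosen for illustration, not the values used in OLS Tab 3. The standardization step is omitted for brevity.

```python
import random

def modeled_rate(s):
    """Noise-free rate for state capacity s in [0, 1] (hypothetical functional forms)."""
    facilitative = 1.0 - s   # pushes rates down as capacity rises
    contravening = s         # pushes observed rates up as capacity rises
    # Sum of the two factors plus an interactive effect equal to their product.
    return facilitative + contravening + facilitative * contravening

rng = random.Random(0)  # fixed seed so the illustration is repeatable
capacity = [i / 99 for i in range(100)]            # 100 hypothetical countries
clean = [modeled_rate(s) for s in capacity]        # the underlying arch
noisy = [r + rng.gauss(0.0, 0.05) for r in clean]  # add random error

peak = capacity[clean.index(max(clean))]
print(f"rate at lowest capacity:  {clean[0]:.2f}")
print(f"rate at highest capacity: {clean[-1]:.2f}")
print(f"noise-free peak near s = {peak:.2f}")
```

In this setup the two factors exactly offset each other in the sum, so the arch comes entirely from the product term, which is largest for countries in the middle of the capacity range.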

If we are to apply this model to COVID pandemic outcomes, the next question is just what factors should play facilitative and contravening roles. The most obvious facilitative candidate is the ability of higher-capacity governments to provide broader health care coverage, better hospitals, more practitioners, and so on. The Legatum dataset contains a compound variable called care systems that measures all of these things, and more. The correlation coefficient between the state capacity index and the care systems variable is a robust 0.86. Another Legatum indicator for preventive public health measures is almost as strongly associated with state capacity.

The contravening factors—those that tend to raise observed case and death rates as state capacity increases—are perhaps not quite so obvious, but several candidates can be found in the literature on response to the COVID pandemic.^[7] Three of them are the accuracy of reporting of cases and fatalities, age-structure of the population, and prior immunity acquired through mechanisms other than exposure to or vaccination against the virus that causes COVID-19.

Accuracy of reporting. The accuracy of reporting of cases and fatalities is one likely contravening factor that can cause COVID rates to appear low in countries with low state capacity. State capacity is positively and significantly associated with accuracy of reporting. That relationship is likely causal, not only because low-capacity states have weak administrative institutions but also because of security issues. In many of the lowest-capacity cases, the central government does not even have effective control over the country’s entire territory.

The data in the Lancet (1) study are reported to have undergone some preliminary adjustment for underreporting, but the Lancet (2) study, published in March, suggests that inaccurate reporting is even more pervasive than previously thought. Unlike Lancet (1), which gives more attention to case rates, Lancet (2) focuses on population fatality rates, but it seems plausible that where fatalities are underreported, cases will be, too.

To measure the accuracy of reporting, Lancet (2) begins by comparing the number of total fatalities during the pandemic with the number that would have occurred in normal times. The difference between the two is termed “excess fatalities.” Credible data on total fatalities are available in some countries. Where they are not, Lancet (2) uses a statistical model to estimate the number of total and excess fatalities. In this study, I use the percentage of estimated excess fatalities that are officially attributed to COVID-19 as a measure of the accuracy of a country’s reporting.
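
The accuracy measure reduces to simple arithmetic, sketched below. The function names and the death counts are hypothetical; the actual figures come from the Lancet (2) estimates.

```python
def estimate_excess(total_deaths, expected_deaths):
    """Excess fatalities: deaths during the pandemic minus deaths expected in normal times."""
    return total_deaths - expected_deaths

def reporting_accuracy(reported_covid_deaths, excess_deaths):
    """Percentage of estimated excess fatalities officially attributed to COVID-19."""
    if excess_deaths <= 0:
        raise ValueError("excess deaths must be positive for the ratio to be meaningful")
    return 100.0 * reported_covid_deaths / excess_deaths

# Hypothetical country: 120,000 total deaths against 100,000 expected,
# with 5,000 deaths officially attributed to COVID-19.
excess = estimate_excess(total_deaths=120_000, expected_deaths=100_000)
accuracy = reporting_accuracy(reported_covid_deaths=5_000, excess_deaths=excess)
print(f"excess deaths: {excess}, reporting accuracy: {accuracy:.1f} percent")
```

A low percentage indicates that most of the estimated excess mortality never appears in the official COVID count.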

Population age structure. Differences in the age-structure of populations are another likely candidate for a contravening factor. As a simple metric of age structure, I use the old-age dependency ratio, which is the ratio of population aged 65 and older to the population aged from 16 to 64. Since fatalities per case are known to increase substantially with age, we would expect a positive association between the old-age dependency ratio and the population death rate by country. As mentioned earlier, Lancet (1) found that differences in age-structure accounted for 47 percent of the variation across countries in the case fatality rate. Other things being equal, that would imply a similar impact on population fatality rates.
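
The mechanism can be illustrated with a two-group sketch. All numbers below are hypothetical, including the age-specific fatality rates, which are chosen only to show the direction of the effect. The function names are likewise made up for this example.

```python
def old_age_dependency(pop_65_plus, pop_working_age):
    """Ratio of people aged 65 and older to the working-age population."""
    return pop_65_plus / pop_working_age

def blended_case_fatality(share_65_plus, cfr_old, cfr_young):
    """Overall fatality rate per case, weighting two age groups by their share of cases."""
    return share_65_plus * cfr_old + (1.0 - share_65_plus) * cfr_young

# Hypothetical populations, in millions (not real countries).
older_country = old_age_dependency(pop_65_plus=20.0, pop_working_age=60.0)
younger_country = old_age_dependency(pop_65_plus=3.0, pop_working_age=60.0)

# Hypothetical fatality rates per case for the two age groups.
CFR_OLD, CFR_YOUNG = 0.05, 0.002
cfr_older = blended_case_fatality(0.20, CFR_OLD, CFR_YOUNG)
cfr_younger = blended_case_fatality(0.04, CFR_OLD, CFR_YOUNG)

print(f"dependency ratios: {older_country:.2f} vs {younger_country:.2f}")
print(f"blended fatality per case: {cfr_older:.4f} vs {cfr_younger:.4f}")
```

Even with identical age-specific risks, the country with the older population shows a much higher overall fatality rate per case.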

There is a positive association between old-age dependency and state capacity – a simple correlation of 0.71 for the data used in this study. That positive association is not just a coincidence. State capacity is strongly correlated to per capita income (subject to causal relationships that are complex and probably run both ways). Low incomes, in turn, are associated with high birth and death rates that in combination produce low old-age dependency ratios. The positive association of old-age dependency with both fatality rates and state capacity qualifies it as a contravening factor.

Prior immunity. Prior immunity acquired through some pathway other than vaccination or previous exposure to the SARS-CoV-2 virus is a third likely contravening factor. Discussions of unexpectedly low case rates in sub-Saharan Africa have given that possibility considerable attention, and it probably operates in other regions, as well. One paper lists five factors that may play a role in creating resistance to infection: prior vaccinations, including BCG vaccination for tuberculosis (rarely used in the United States today); “training” of the immune system by microorganisms that normally live harmlessly in the human body; prevalence of infectious diseases other than COVID-19; use of herbal plants and natural remedies; and genetic factors related to immunity that have been found to be more prevalent in African populations than elsewhere.

I did not find any simple overall measure of prior immunity to use for statistical analysis. The closest I could come was a pair of proxies: a Legatum measure of the burden of infectious diseases and tuberculosis rates from WHO. Better control of infectious diseases in general, and of TB in particular, is positively associated with state capacity. Where control of both is weak, reported COVID death rates tend to be lower. The correlations are statistically significant and have the signs expected for contravening indicators. However, the effects of these variables are not especially strong. Control of all infectious diseases (which presumably includes TB) shows somewhat stronger effects than control of TB alone. Very likely, further studies will devise better statistical measures of the prior-immunity effect and lead to a better understanding of its role in the COVID pandemic. At present, the importance of the prior-immunity effect remains more controversial than that of underreporting and population age-structure.

Results. Just one facilitative factor and one contravening factor are required for a statistical test of the two-factor model. Accordingly, I combined the facilitative and contravening indicators discussed above into compound indicators. For the facilitative factor, I used an average of the Legatum indicators for care systems and preventive measures, each of which is already a compound of several individual indicators of health system capacity. For the contravening factor, I used an average of reporting accuracy, old-age dependency, and control of infectious disease.
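
One way to build such a compound indicator is sketched below. It assumes that each component is first standardized to a mean of 0 and a standard deviation of 1, so that indicators measured on different scales get equal weight; the text does not say whether that step was applied before averaging. The country values are hypothetical.

```python
import statistics

def standardize(values):
    """Rescale a series to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def compound(*indicators):
    """Average several standardized indicators, country by country."""
    z_scores = [standardize(series) for series in indicators]
    return [statistics.fmean(country) for country in zip(*z_scores)]

# Hypothetical values for four countries, ordered from low to high state capacity.
reporting_accuracy = [10.0, 40.0, 70.0, 95.0]      # percent of excess deaths reported
old_age_dependency = [0.05, 0.10, 0.25, 0.33]      # ratio of 65+ to working-age population
infectious_disease_control = [20.0, 45.0, 70.0, 90.0]  # index score

contravening = compound(reporting_accuracy, old_age_dependency, infectious_disease_control)
print([round(v, 2) for v in contravening])
```

Because every component has mean zero after standardization, the compound indicator does too, which makes it straightforward to use alongside other standardized variables in a regression.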

I then ran a multiple regression using the reported population fatality rate from Lancet (2) as the response variable. As controls, I used the compound versions of the facilitative and contravening factors, along with the product of the two compound factors as the interactive term. The results are consistent with the two-factor model described earlier (see OLS Tab 4 Regression 1):

The coefficient of multiple correlation is 0.63, indicating that the facilitative and contravening factors, together with the interactive term, account for just about 40 percent of the variation among countries in reported population fatality rates. That result is statistically significant. The regression coefficients for each of the three control variables are also statistically significant.^[8]
The regression coefficient for the facilitative factor, which measures the quality of a country’s health care and public health systems, has a negative value. Other things being equal, a stronger health care system is associated with a lower population fatality rate. That is particularly noteworthy in view of the fact that the simple bilateral correlation between health care system quality and fatality rates is, counterintuitively, positive. The observed relationship is reversed to the expected negative sign only when appropriate controls for contravening factors are included in the analysis.
The regression coefficient for the compound contravening variable is positive. High accuracy of reporting, high old-age dependency, and strong control of infectious diseases (which implies low rates of prior immunity) are jointly associated with higher reported death rates. Because the compound contravening variable is, in turn, positively associated with state capacity, we get the “epidemiological mystery” that the impact of the pandemic looks surprisingly moderate in many poor, low-capacity countries. In part, the apparently mild impact of the pandemic is an illusion caused by severe underreporting in low-capacity countries. However, to the extent that low-capacity countries tend to have younger populations and wider prevalence of prior immunity, their low reported death rates are, at least in part, a real thing.
The coefficient on the interactive factor is negative and strongly significant. That negative coefficient, together with the strong correlations of both facilitative and contravening factors with state capacity, accounts for the arched pattern of the relationship between reported death rates and state capacity, which is found also for various other indicators of quality of governance.
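
A short derivation shows why a negative interactive coefficient yields an arch. As a simplifying assumption not made explicitly in the text, suppose both factors are exactly linear in state capacity S, with positive slopes f₁ and c₁ (in the data they are strongly but not perfectly correlated with S). Substituting into the two-factor model of footnote 6 gives a quadratic in S:

```latex
\begin{align*}
FF &= f_0 + f_1 S, \qquad CF = c_0 + c_1 S, \qquad f_1 > 0,\; c_1 > 0 \\
\text{RATE} &= a_0 + a_1 FF + a_2 CF + a_3 (FF \times CF) + e \\
            &= \underbrace{(a_0 + a_1 f_0 + a_2 c_0 + a_3 f_0 c_0)}_{\text{constant}}
             + \underbrace{\bigl(a_1 f_1 + a_2 c_1 + a_3 (f_0 c_1 + f_1 c_0)\bigr)}_{B}\, S
             + a_3 f_1 c_1\, S^2 + e \\
S^{*} &= -\frac{B}{2\, a_3 f_1 c_1}
\end{align*}
```

With a₃ < 0 and both slopes positive, the coefficient on S² is negative, so the curve is concave and peaks at S*. That is the arched pattern, regardless of the signs of a₁ and a₂.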

These results pertain to reported population death rates, as studied by Lancet (2). It is worth asking whether the same model explains the arched shape of the relationship between the Lancet (1) population case rates and state capacity, as shown in Figure 1. The answer is a somewhat qualified “yes.” (See OLS Tab 4 Regression 2 for details.)

Running the same regression with the Lancet (1) case rate produces a significant but weaker multiple correlation of 0.49, meaning that the model accounts for only 24 percent of the observed variation in case rates.
The individual regression coefficients for the facilitative and contravening variables have the expected signs and are significant at the 0.1 level but not at the 0.01 level.
The coefficient on the interactive term is negative and significant at the 0.01 level. Regardless of the coefficients on the individual factors, the negative interactive coefficient is sufficient to produce the observed arched shape of the line in Figure 1.

Possibly these weaker results can be attributed to the fact that the Lancet (1) case rates are already partially, although not completely, adjusted for some of the contravening variables, such as reporting accuracy and some aspects of prior immunity.

What role for trust?

In conclusion, it is worth returning to the role played by interpersonal and political trust in pandemic response, the finding that attracted so much attention when the Lancet (1) study first appeared. The simplest way to test the role of interpersonal and political trust is to use those two variables as additional controls in the multiple regression for reported death rates. Doing so raises the coefficient of multiple correlation from 0.63 to 0.73. The share of variation explained by the regression thus rises from 40 percent to 51 percent. The coefficients on both trust variables are negative and statistically significant. It appears, then, that other things being equal, both interpersonal trust and trust in government are associated with lower COVID fatality rates. (See OLS Tab 4, Regression 3.)

Note that this last regression restores a significant role for trust in government as reported in Lancet (1). That role was not evident earlier in this commentary, where simple bilateral correlations were used in replicating the Lancet (1) findings. However, when controls are applied for the effects of facilitative and contravening factors, the significance of trust in government reappears.

On balance, it seems likely that interpersonal trust and trust in government have two kinds of favorable effects on COVID performance. The first are direct effects of the type hypothesized by Lancet (1). Those would reflect, for example, a tendency for people to be more likely to follow public health advice that came from a trusted government or to get vaccinated if they trust their neighbors to do the same. The second kind of effects are indirect, with greater trust contributing to higher state capacity, and state capacity, in turn, facilitating COVID response via better care and public health systems, with the latter effect masked by the contravening effects of state capacity in some low-income countries.

The complementarity of both kinds of trust with state capacity calls to mind a central theme of Kevin Vallier’s insightful book, Trust in a Polarized Age. Vallier sees trust and good government as united in a virtuous circle: Trust creates conditions in which good government flourishes, and good government, in turn, reinforces trusting and trustworthy behavior on both the interpersonal and the political level.

In short, there is a lot of food for thought in the Lancet (1) study and beyond. Trust does matter, perhaps even more than Lancet (1) suggests. At the same time, good government, represented in this commentary by an index of state capacity, matters even more than the Lancet (1) study found. Trust is not a variable that can be directly manipulated as an instrument of public policy, but over time, piecemeal improvements in policies and institutions can accumulate and build greater trust both among citizens and between citizens and government. Clearly, when a pandemic strikes, trust and good government are both important assets.


^[1]The Legatum scores for interpersonal trust and trust in government appear to be drawn from the same surveys as used by the Lancet study. The Legatum government integrity index is conceptually similar to the government corruption index used by Lancet, but the sign is reversed and the data sources are different. Also, I use a different Gini index, based on a study by Laurence Chandy and Brina Seidel that was published by the Brookings Institution. Their version of the index attempts to correct for underreporting of top incomes and produces higher Ginis for many countries, including the United States.

^[2]Stage 1 of the Lancet study adjusts case and fatality data for differences in GDP per capita. In my past research, I have found gross national income (GNI) to consistently produce better results than GDP. Furthermore, I have found that many potential response variables are more closely associated with the natural log form of GNI or GDP than with the linear form that appears to have been used in the Lancet study. In this commentary, all references to GNI are to the natural log form.

^[3]Unless otherwise noted, “significant” or “statistically significant” is used to mean significant at the 0.01 level.

^[4]Jonathan Hanson and Rachel Sigman offer a good review of the literature on the measurement of state capacity, along with an index of their own design. My own state capacity index correlates strongly with the Hanson-Sigman index (R = 0.92).

^[5]The chart uses standardized values of the variables that are transformed to give them a mean of 0 and a standard deviation of 1. For a normally distributed random variable, approximately 68 percent of observations will fall within 1 standard deviation of the mean and approximately 95 percent within two standard deviations.

^[6]Using FF for the facilitative factor and CF for the contravening factor, the two-factor model can be represented as RATE = a₀ + a₁FF + a₂CF + a₃(FF × CF) + e, where e is a random error.

^[7]Stephanie Nolen provides a good overview in a New York Times article on the paradoxically low COVID death rates reported in many African countries.

^[8]Comparing OLS Tab 4 Regression 1 with Tab 2 Regression 3 shows that the R² is much higher for the two-factor model than when state capacity and its square are used as the controls. That is to be expected, since the facilitative and contravening factors affect reported fatality rates directly, whereas state capacity affects fatality rates only indirectly, via its association with the facilitative and contravening factors.