Commentary
New Race Concordance Study Should Leave Readers Skeptical
“Black Representation in the Primary Care Physician Workforce and Its Association with Population Life Expectancy and Mortality Rates in the US” was recently published in JAMA Network Open. The researchers examine whether “greater Black PCP (primary care physician) workforce representation is associated with better population health measures for Black individuals.” They claim to observe that greater representation is associated with longer African American life expectancies, lower all-cause mortality, and smaller disparities with White mortality rates.
The paper received fawning media coverage and generated buzz across the health policy world. But closer inspection reveals that the results don’t justify the hype.
The researchers use county-level data from 2009-2019 to examine how various county-level characteristics correlate with Black mortality patterns. They are principally interested in correlation with the “community representativeness ratio,” defined as the proportion of Black PCPs in a county divided by the proportion of Black individuals in that county.
The researchers observe that a higher community representativeness ratio is correlated with modest improvements in Black life expectancy and smaller disparities with White mortality after holding constant other county-level variables such as obesity rates, home value, and air pollution. They conclude that “Black representation levels likely have relevance for population health, supporting the need to expand the structural diversity of the health workforce.”
The empirical strategy used to arrive at this conclusion ought to raise eyebrows. The “community representativeness ratio” provides no information about the proportion of African American patients actually treated by African American PCPs. Moreover, PCPs are a single touchpoint within the healthcare system. The PCP representativeness ratio amounts to a very noisy measure of patient-provider racial concordance.
The bigger problem, as the adage goes, is that correlation is not causation. This remains true even with sophisticated models that estimate correlations among many variables at once. In fact, if one accepted at face value that these correlations represent causal estimates, the analysis would indicate that relocating to rural counties with high proportions of male residents is among the best ways to improve African American life expectancy. Moreover, African Americans who live in counties with a high representativeness ratio and high levels of poverty could expect to live longer than African Americans in counties with a high representativeness ratio but low poverty. The paper glosses over these observations not just because they are politically unfashionable, but because they are plainly absurd and expose the practical limitations of correlational studies.
Of course, not everything can be evaluated through a randomized controlled trial, and some research questions are functionally limited to descriptive evaluation. Still, researchers can increase confidence that they have uncovered plausibly causal relationships by demonstrating that a significant association between two variables is not sensitive to judgements about how to model their relationship. For example, the researchers could have demonstrated that their results held if they omitted measures of obesity and air pollution but included measures of violent crime or traffic accidents. The failure to report any such sensitivity analysis raises the specter that the researchers tested a vast number of potential model permutations and then conveniently settled on one that produced the desired results, a phenomenon known as p-hacking.
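The kind of specification check described above is simple to run. The sketch below uses entirely simulated county-level data (the variable names and effect sizes are illustrative assumptions, not figures from the study) to show the basic idea: re-estimate the same regression under several covariate sets and see whether the coefficient of interest stays stable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated county-level data for illustration only; no real values are used.
ratio = rng.lognormal(mean=-1.0, sigma=0.8, size=n)   # representativeness ratio
obesity = rng.normal(30, 5, size=n)
pollution = rng.normal(10, 3, size=n)
crime = rng.normal(400, 100, size=n)
life_exp = 75 + 0.5 * np.log(ratio) - 0.05 * obesity + rng.normal(0, 2, size=n)

def coef_of_interest(covariates):
    """OLS by least squares; returns the coefficient on log(ratio)."""
    X = np.column_stack([np.ones(n), np.log(ratio)] + covariates)
    beta, *_ = np.linalg.lstsq(X, life_exp, rcond=None)
    return beta[1]

# A robust association should look similar across reasonable specifications.
specs = {
    "full":         [obesity, pollution, crime],
    "no obesity":   [pollution, crime],
    "no pollution": [obesity, crime],
    "ratio only":   [],
}
for name, covs in specs.items():
    print(f"{name:12s} coefficient on log(ratio) = {coef_of_interest(covs):.3f}")
```

If the estimated coefficient swung wildly across these specifications, that would be the warning sign readers cannot currently check for themselves.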
Suspicion of self-serving decisions about model specification looms especially large over their decision to transform the representativeness ratio with a logarithmic function, a statistical technique, and a judgement call, that forces them to exclude from their analysis the approximately 50% of counties that their data indicate did not have any Black PCPs. How the results might change without the logarithmic transformation, using the complete dataset, is never addressed.
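The mechanics of that exclusion are easy to see. A county with zero Black PCPs has a representativeness ratio of exactly zero, and the logarithm of zero is undefined, so every such county silently drops out of a log-transformed analysis. The example ratios below are hypothetical, chosen only to mirror the study's roughly 50% share of zero-ratio counties.

```python
import numpy as np

# Hypothetical representativeness ratios: half the counties report
# zero Black PCPs, so their ratio is exactly 0.
ratios = np.array([0.0, 0.0, 0.12, 0.45, 0.0, 0.8, 1.3, 0.0])

with np.errstate(divide="ignore"):
    log_ratios = np.log(ratios)          # log(0) evaluates to -inf

usable = np.isfinite(log_ratios)         # zero-ratio counties are excluded
print(f"counties in data:          {ratios.size}")    # 8
print(f"usable after log transform: {usable.sum()}")  # 4
```

Any regression fit on `log_ratios` can only use the finite rows, which is how half the sample disappears before estimation even begins.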
It’s possible that their observations would prove robust to changes in model specification. But the decision not to provide any clues as to the sensitivity of the results will leave readers guessing, and it ought to leave them skeptical.
Ian Kingsbury is the Director of Research for Do No Harm.