A Case of Statistical Malpractice? Predicting the Risk of Uterine Rupture
‘Tis the season for the Society for Maternal-Fetal Medicine to publish the abstracts for their forthcoming annual meeting. Every year around this time I receive the gift of an electronic Table of Contents alert for the Supplement to the American Journal of Obstetrics and Gynecology that lists conference sessions. MFM doctors do interesting research, and their conference, which I have never attended, always has several sessions that look fantastic along with others that make me cringe (like a recent year’s session plugging this “exciting innovation”).
Nestled among the 800+ abstracts was one that I would put in the cringeworthy column, not for the focus of the research but for the complete mismatch between the reported findings and the researchers’ conclusions. [Emphasis mine]
Frequent epidural dosing is a marker for impending uterine rupture in patients attempting vaginal birth after cesarean (VBAC)
Alison Cahill, Anthony Odibo, Jenifer Allsworth and George Macones
Washington University in St. Louis, St. Louis, Missouri
To estimate the association between epidural dosing and risk of uterine rupture in women attempting VBAC.
A nested case-control study within a multicenter retrospective cohort of >25,000 women with a prior cesarean was performed, comparing cases of uterine rupture to women without rupture (controls) while attempting VBAC. Extensive data extraction included all medications in 15-minute increments. In women who attempted VBAC with an epidural anesthetic, dose timing, frequency, and quantity were compared between cases and controls. Time-to-event analyses were performed to estimate the association between epidural dosing and risk for uterine rupture while accounting for duration of labor and confounding effects.
Of 804 women in the nested case-control study, 504 (62.7%) had an epidural, with no statistical difference in epidural usage rates between cases and controls (70.4% v. 62.4%, p=0.09). Women who experienced uterine rupture were >4 times more likely to require epidural dosing in the 60 minutes prior to delivery (aOR 4.1, 2.4 – 6.7, p<0.01). Cox-regression analysis revealed a dose-response relationship between number of doses in the final 90 minutes of labor and risk of rupture, after adjusting for prior vaginal delivery and oxytocin exposure.
Clinical suspicion for uterine rupture should be high in women requiring frequent epidural dosing during a VBAC trial.
What’s the problem here? This is a classic example of reporting a relative measure of association (here an adjusted odds ratio, summarized as “4 times more likely”) in lieu of the more appropriate statistic, which in this case would be the “positive predictive value”. It is indeed noteworthy that women destined to experience uterine ruptures self-administer more anesthesia in the minutes prior to the event, but should “clinical suspicion be high” every time a woman in a VBAC labor pushes the epidural button frequently? At least from the data reported in the abstract, the answer is: we have no idea.
To get an answer we need much more data. Specifically, we need to know:
- how many women pushed the epidural button frequently
- how many of them had a uterine scar rupture
- how many women did not push the button frequently
- how many of them had a uterine scar rupture
These data would help us calculate the sensitivity and specificity of epidural dosing in predicting uterine scar rupture, which in turn tell us how often to expect a “false positive” (a woman requests frequent doses of epidural but does not have a scar rupture) and a “false negative” (a woman doesn’t request frequent epidural dosing but does have a scar rupture). Combined with the baseline rupture rate, they also give us the positive predictive value: the share of women requesting frequent doses who actually go on to have a rupture.
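To make the four counts above concrete, here is a minimal Python sketch that turns them into sensitivity, specificity, predictive values, and an odds ratio. The function name and every count are hypothetical, invented purely for illustration; none of the numbers come from the abstract, which does not report them.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Summarize a 2x2 table of frequent dosing (the "test") vs. rupture (the outcome).

    tp: frequent dosing, rupture        fp: frequent dosing, no rupture
    fn: no frequent dosing, rupture     tn: no frequent dosing, no rupture
    """
    return {
        "sensitivity": tp / (tp + fn),   # ruptures that were flagged
        "specificity": tn / (tn + fp),   # non-ruptures that were not flagged
        "ppv": tp / (tp + fp),           # flagged women who actually ruptured
        "npv": tn / (tn + fn),           # unflagged women who did not rupture
        "odds_ratio": (tp * tn) / (fp * fn),
    }


# HYPOTHETICAL counts per 10,000 VBAC labors (0.7% rupture rate),
# chosen only so that the odds ratio comes out close to 4.
metrics = two_by_two_metrics(tp=40, fp=2480, fn=30, tn=7450)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the odds ratio is about 4, yet the positive predictive value is only about 1.6%: roughly 98 of every 100 women flagged for frequent dosing would not have a rupture. Real counts could look quite different, which is exactly why they need to be reported.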
Sensitivity and specificity are especially important in predicting something that occurs rarely, such as uterine scar rupture in a VBAC labor. Something that is “4 times more likely” could still carry a small risk in absolute terms, if the baseline risk is low. In the case of VBAC, this kind of reporting could in fact be hazardous, because it is likely that many women and even many obstetricians overestimate the baseline risk of uterine scar rupture and of rupture-related morbidity and mortality. Quadrupling an already overestimated baseline inflates the perceived risk even further. Let’s take for example statistics put forth by a spokesperson for the American College of Obstetricians and Gynecologists. In a letter to a mother who appealed to the College to make VBAC more accessible, he notoriously overestimated the risks.
In two percent of [VBAC labors] the result can be a rupture of the old scar. If this happens, then death of the baby is almost certain and death of the mother is probable. Even if the mother does not die, virtually 100% will lose their child bearing ability.
In this scenario, anything associated with a 4-fold increase in uterine rupture would result in 6 additional babies dying plus 6 additional mothers dying or needing hysterectomies for every 100 VBAC labors. Looking at these data, it’s easy to justify doing a cesarean when the woman begins asking for epidural top-ups even if top-up requests have a low predictive value.
But the uterine scar rupture rate is in fact 0.5-1%, and in only about 5% of ruptures is the baby likely to die. Maternal mortality is rarer still, and the likelihood of either maternal mortality or hysterectomy is actually higher with repeat cesarean surgery than it is with planned VBAC. Quadrupling these risks might result in up to 15 excess fetal/newborn deaths per 10,000 VBAC labors (taking the 1% figure: 300 additional ruptures, 5% of them fatal to the baby). This may still seem to be an unacceptable risk, but it’s nothing close to 6 per 100. In this scenario, it’s a little more difficult to justify going straight to a cesarean for every woman requesting more anesthesia.
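The arithmetic behind the two scenarios can be checked with a short Python sketch. It rests on simplifying assumptions of mine: “4 times more likely” is treated as a fourfold increase in the rupture rate itself, and “almost certain” death of the baby in the letter’s scenario is treated as a probability of 1. The function name is hypothetical.

```python
def excess_deaths_per_10000(baseline_rupture_rate, fold_increase, death_given_rupture):
    """Excess fetal/newborn deaths per 10,000 labors when the rupture rate is
    multiplied by `fold_increase` (assumption: the ratio applies to the rate)."""
    excess_rupture_rate = baseline_rupture_rate * (fold_increase - 1)
    return excess_rupture_rate * death_given_rupture * 10_000


# Scenario from the letter: 2% rupture rate, death of the baby "almost certain"
# (taken here as 1.0, an assumption).
letter_scenario = excess_deaths_per_10000(0.02, 4, 1.0)

# Scenario from the figures in this post: 0.5-1% rupture rate, ~5% fetal death.
low_estimate = excess_deaths_per_10000(0.005, 4, 0.05)
high_estimate = excess_deaths_per_10000(0.01, 4, 0.05)

print(f"Letter scenario: {letter_scenario:.1f} per 10,000")
print(f"Post's figures:  {low_estimate:.1f} to {high_estimate:.1f} per 10,000")
```

Under these assumptions the letter’s numbers imply about 600 excess deaths per 10,000 labors (6 per 100), while the 0.5-1% rupture rate implies roughly 7.5 to 15 per 10,000.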
I’ll give the researchers the benefit of the doubt. It is clear that they understand the distinction between relative risk and predictive value, since they’ve published papers on the topic before that appropriately concluded that obstetric variables poorly predict the likelihood of scar rupture. They may also have been severely limited by journal space constraints in preparing their abstract for publication. But I’ll call “statistical malpractice” on them for publishing a conclusion that suggests that the predictive value is high without providing any data to support it.
FYI, this topic should be familiar to anyone listening to the news lately, as false positives are at the crux of the debate about the new mammography guidelines. The New York Times ran a piece explaining concepts of risk and predictive value just last week, with the decidedly unsexy title, Mammogram Math. It’s a great read for anyone who wants to know more about interpreting statistics about risk.