The Maternal Quality Landscape–Part Three, Segment Four: How do we measure AND achieve it?

[Editor's note:  This week we continue with Christine Morton and Kathleen Pine's review of U.S. maternal care quality measures. To complete their three-part series, today they discuss methods of data collection and the problems that can arise in accurate documentation.]

Reporting the Measure
The <39 Weeks measure is a good example of why accuracy in data collection and reporting of measures is important. The Leapfrog Group (a patient safety group that conducts patient safety and quality surveys of hospitals that choose to participate and makes the results public) adopted the measure after its NQF endorsement and incorporated it into the 2010 Leapfrog survey. When the results of the measure were made public, some hospitals had extremely high rates of elective delivery (ED) <39 weeks and some had extremely low rates. Such wide variation can indicate true differences in the incidence of a procedure, or it can reflect challenges in measurement.  Quality advocates pay close attention to how a measure is calculated, because if the data are challenged as inaccurate, hospitals will not acknowledge that they have a quality improvement issue.  In this case, at least some of the variation seen in the Leapfrog data may have been due to hospitals not limiting what they reported to elective deliveries within the specified time frame. Correct measurement is crucial not just for improving quality but for the quality improvement endeavor as a whole.  Hospitals and providers must understand how a measure is correctly calculated and have the time and resources to prepare the data.  As hospitals and initiatives have moved forward on this issue, the specifications for this measure have been refined.  In the 2011 Leapfrog survey, the measurement specifications were adapted to match those of TJC.  It will be interesting to compare the results in the next survey with those reported in 2010.

Obtaining accurate data
In order to trust that the information being reported by a measure reflects the actual practices in a hospital and their outcomes, the data that the quality measures are built on must be accurate.  <39 weeks presents several potential problems with data accuracy, chief among them:

1) Gestational age.  Although ACOG provides criteria for confirming gestational age (ACOG, 2009), it can be difficult to gauge gestational age accurately, and the further a pregnancy progresses, the more difficult it is. There are two issues: the accuracy of gestational age and consistency in using a particular method to assess it.  Women may not know when their last menstrual period before pregnancy was, and menstrual cycles vary in length.  Ultrasound used in early pregnancy provides a more accurate estimate, but some women do not seek early prenatal care or receive a first trimester ultrasound.  The medical record may indicate gestational age as calculated by last menstrual period, by ultrasound or by some other means.   In addition, hospitals vary in terms of which department and what level of staff are assigned to fill in the data required by the birth certificate.  In some cases, birth clerks are assigned this task and may not receive adequate training to ensure they select the most accurate gestational age when more than one estimate appears in different places in the chart.

2) Documentation.  Accurate and complete documentation of the data elements required to make the measurement is crucial.  If something is charted incorrectly at the bedside, it may be impossible to catch the error in later calculations.  Good documentation practice often requires extensive education of providers by quality analysts and educators.  <39 weeks, for instance, requires providers to accurately record whether a patient was induced, and this becomes an ICD-9 procedure code.  A common mistake in documentation on the part of providers is to note that a patient was augmented with Pitocin when she was actually induced, or vice versa.  Definitions of induction can be confusing, and it may be difficult to determine whether or not labor started on its own. Those collecting the data often must do extensive “detective work” when one piece of information does not match up with another to create a clear picture of what happened.  The chart review component of this measure can be time consuming.

3) Sampling issues. TJC specifications allow hospitals to use sampling methods to select a random subset of births to calculate the measure. The problem with this is that hospitals with small numbers of births may draw a random sample of cases in which there are few elective deliveries <39 weeks, thus under-reporting the issue.  Alternatively, obstetric departments can work with their medical records or quality department to screen all cases (less the excluded ICD-9 codes) for the desired time period, then use the delivery logbook (electronic or paper) to identify all births occurring between 37 and 39 weeks.  Those births coded with a cesarean or induction will need to undergo a chart review to ascertain whether the woman had rupture of membranes or was in labor, so that those cases can be excluded.  Sampling seems simpler, but it can fall victim to the law of small numbers, leaving hospitals with little or nothing to report even though that result is not necessarily accurate.  Chart review can be time-consuming: for a hospital with about 100 births a month, this simplified approach would result in about 8-10 births needing a chart review.  At an estimated 15-20 minutes per chart review, this entails roughly 2-3 hours per month to collect the data for the <39 weeks measure.

4) Redefining the issue.  It may be that by adopting a hard stop policy, hospitals will be successful in reducing early inductions.  However, rather than charting the intervention as an ‘induction,’ hospital staff may instead chart the intervention as an ‘augmentation,’ with a concomitant rise in augmentations.  It is important for quality measure advocates to develop mechanisms to ensure that focused attention on reducing one practice does not result in increasing the incidence of another, related practice.  It also means that a set of ‘balancing’ measures can be helpful to keep certain processes or outcomes from simply being relabeled.
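
To make the sampling concern in item 3 concrete, the short Python sketch below computes how likely a simple random sample is to contain none of a month's elective deliveries <39 weeks, and reproduces the chart review time estimate. It is only an illustration: the function names, the count of 3 elective early deliveries, and the sample size of 20 are hypothetical, and the sketch does not reproduce TJC's actual sampling specifications.

```python
from math import comb


def prob_sample_misses_all(total_births: int, elective_early: int, sample_size: int) -> float:
    """Chance that a simple random sample (drawn without replacement)
    contains none of the elective <39-week deliveries.

    This is the zero-success case of the hypergeometric distribution.
    """
    if not 0 <= elective_early <= total_births:
        raise ValueError("elective_early must be between 0 and total_births")
    if not 0 <= sample_size <= total_births:
        raise ValueError("sample_size must be between 0 and total_births")
    # Ways to draw the sample only from the other births,
    # divided by all possible ways to draw the sample.
    return comb(total_births - elective_early, sample_size) / comb(total_births, sample_size)


def chart_review_hours(charts: int, minutes_per_chart: float) -> float:
    """Monthly chart review time in hours."""
    return charts * minutes_per_chart / 60


# Hypothetical month: 100 births, 3 of them elective deliveries <39 weeks,
# and a random sample of 20 births (assumed numbers, not from TJC).
print(round(prob_sample_misses_all(100, 3, 20), 2))  # about 0.51

# Full screening, using the figures from the text: 8-10 charts at 15-20 minutes each.
print(chart_review_hours(8, 15))             # 2.0 hours (low end)
print(round(chart_review_hours(10, 20), 1))  # 3.3 hours (high end)
```

Under these assumed numbers, about half of all possible samples would show no elective early deliveries at all, even though the hospital had three that month. Screening every birth avoids that risk at a cost of roughly two to three hours of chart review.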

Posted by:  Christine Morton PhD and Kathleen Pine, University of California, Irvine

    Dear Christine and Kathleen,
    Thank you for this thought-provoking and thorough series of articles. Your tremendous scholarship and efforts are truly appreciated.

