Junkfood Science: Can we trust gold stars to help us choose the best?

August 01, 2007

Do those insurance company and government ratings of our doctor and local hospitals reflect the quality of care we receive or simply how well they’re complying with cost containment and money-making mandates? Several recent articles in the news and medical literature give us cause to consider carefully what we hear and read about our healthcare providers.

The Washington Post revealed that UnitedHealthcare has been profiling doctors in the Washington area for the past year and a half and recently launched a website rating doctors with gold stars “to encourage physicians to refer patients to two-star doctors and for patients to seek out two-star physicians.” United is following in the footsteps of more than a hundred other insurers across the country:

Doctors Rated but Can’t Get a Second Opinion

After 26 years of a successful medical practice, Alan Berkenwald took for granted that he had a good reputation. But last month he was told he didn't measure up — by a new computerized rating system. A patient said an insurance company had added $10 to the cost of seeing Berkenwald instead of other physicians... because the system had demoted him to its Tier 2 for quality.... In the quest to control spiraling costs, insurance companies and employers are looking more closely than ever at how physicians perform, using computers, mountains of health claims and billing data and sophisticated software.

Such data-driven surveillance offers the prospect of using incentives to steer patients to care that is both effective and sensibly priced. It also raises questions about the line between responsible oversight and outright meddling in the relationship between caregivers and their patients. And it shows how people such as Berkenwald are at risk of losing control of their reputations as corporations and other organizations mine electronic data to draw conclusions about them and post them online....

The effort is more about cutting costs than raising quality, some say, adding that doctors could begin to “cherry pick” healthier patients whose problems are less costly to treat. Such systems fail to capture the intangibles of quality, such as a doctor who visits a dying patient at home, critics say.

What is enabling insurers to profile how doctors practice medicine is the growing use of electronic medical records. Now, data from doctors’ charts and billing records, pharmacies, labs, hospitals and patients’ health and lifestyles are being aggregated in giant digital storehouses. According to the Washington Post, six health plans in Massachusetts have pooled their data on 120 million claims to assess doctors’ performances.

This still pales in comparison to the large national insurers, as reported here last year. Blue Cross Blue Shield Association, for instance, has Blue Health Intelligence, an electronic database which includes all medical records and claims, prescriptions, lab results and other care on 79 million enrollees from 20 BCBS insurers in 34 states.

The grading curve

But what measures are insurers using to rate our doctors? According to the Washington Post, insurers say they rate doctors based on standards of quality of care and cost efficiency:

An internist, for example, gets higher ratings... if diabetic patients are tested for blood-sugar control. Analysts assess cost efficiency by looking at factors such as how many and what types of exams were conducted... Was a generic or brand-name pain medication prescribed? Doctors are then rated against peers in the same community, by type of patient and illness, and against clinical performance guidelines created by specialists such as the American Heart Association.

In other words, they’re using the same performance measures being implemented by healthcare management companies to determine the compensation doctors receive, called “pay-for-performance.” As reviewed here, P4P measures reward and penalize doctors based on whether they’ve ordered the screening tests and labwork and written the prescriptions that insurers demand, and on whether their patients have complied with the practice guidelines. Insurers want us to believe that they can measure good medical care and our health by managing processes and numbers. The big problem: it’s not true.

The conventional measures of healthful activities and numbers have proven time and again not to be good ways to determine actual risks for disease or premature death. This is also where the soundness of clinical guidelines becomes so important, such as those for the clinical management of weight, childhood obesity, drug and lifestyle interventions in the prevention of heart disease, drug treatment of lipids in children, management of blood pressure, and others being required of healthcare providers.

These are behind the same measures insurers are imposing on us through employers, with such things as compulsory participation in health risk assessments, screenings and wellness management programs. Not only is the evidence for the benefits and effectiveness of these initiatives debatable, but their actual cost savings aren’t a slam dunk, either. No matter. Growing numbers of us are finding ourselves surrounded by pressures to conform. The incentives started out innocently with free gym bags and water bottles. Now, employees who are fat, have “high” cholesterol or blood pressure, smoke or fail to participate in health management are being hit by higher insurance premiums or having trouble getting insurance coverage at all.

Doctors are increasingly under the gun to adhere to insurers’ guidelines, too. Their incentives started with pay-for-performance initiatives but now mean public reporting of how well they’re measuring up. Not only are the clinical guidelines controversial but, as the Washington Post learned, doctors are finding their professional reputations and livelihoods jeopardized by low rankings when they don’t go along, even when complying means nonsensical things like ordering mammograms on women who’ve had double mastectomies. Doctors are also graded on how well their patients do what they say:

Doctors critical of ratings systems say they are held accountable for whether patients exercise, take their medications or follow their prescribed regimens. Berkenwald, the Massachusetts internist, said he was pushed from Health New England's top 10 percent of physicians into its second tier because several of his female patients did not get the mammograms or Pap smears he prescribed.

There is a thin line between responsible oversight and bureaucrats trying to practice medicine by stepping between patients and their doctors. Research led by Lawrence Casalino, assistant professor of health studies at the University of Chicago, published in the March/April 2007 issue of Health Affairs, examined how internists viewed P4P and the public grading of their performance. Seventy percent of the doctors who participated in the national randomized survey said that quality measures weren’t accurate, with 88% saying they felt the measures don’t properly adjust for patients’ medical conditions and for the needs of the poor, and could have adverse “unintended consequences on disparities in health care delivery, on physicians who practice in areas of low socioeconomic status, and on the quality of care in important areas of physician practice that are not included in the program being evaluated.” In other words, ordering tests and prescriptions may get doctors good grades, but may not necessarily be what’s best for the patients’ needs.

Government profiling us, too

It’s not just private insurers who are profiling our doctors using these same measures. The Gaylord Herald Times in Gaylord, Michigan, reported last week that the U.S. Government Accountability Office (GAO) has recommended “physician profiling” for the Centers for Medicare and Medicaid Services (CMS) to bring all doctors up to the level of “Dr. Efficient” to cut costs:

Under the GAO plan, physician efficiency would be based on how many office visits, hospitalizations and tests are appropriate for a particular type of patient, to draw comparisons between efficient and inefficient doctors. General practitioners would be the first target. According to reports, CMS has the ability to develop these standards based on Medicare claims. This purportedly translates into being able to identify inefficient doctors from the efficient....

Some say the “science” of profiling would require good risk adjustment mechanisms, and they have not been developed yet. The risk is that doctors might shy away from patients with complicated medical issues, or just drop Medicare patients totally, as a number already have done.

This April, the CMS launched a new P4P program, the Physician Quality Reporting Initiative, which shifts emphasis from prevention to containing costs relating to chronic diseases of aging. It will pay bonuses to doctors who submit detailed reports on compliance with 74 clinical practice measures. The PQRI quality measures include documented antidepressant use for 12 weeks after a new episode of major depression; patients 50 years and older on medications for diagnoses of osteoporosis; hemoglobin A1c and LDL-cholesterol control in diabetics. All told, the quality measures encourage some 15 different prescriptions for adults. Doctors can earn a bonus of up to 1.5% of total allowed charges for Medicare physician fees through the end of the year, but the senior advisor and medical officer for the CMS said last December that compliance “could take as much as 20% to motivate physicians.”

What really counts

The big question, which should be the only one that matters, is whether any of these performance measures, and the efforts to get caregivers to comply, actually result in better care and outcomes for patients.

Quality of care measures are difficult to develop and evaluate. But a careful review finds that most metrics judge processes in managing care practice, rather than actual clinical outcomes. They are not the same.

In fact, two large recent studies both found that expensive P4P measures and initiatives to improve caregiver behaviors did nothing to reduce mortality or improve outcomes for heart patients.

The first, published in the Journal of the American Medical Association, set out to learn if P4P programs actually improve patient outcomes. The researchers performed a 3-year analysis of the CMS Hospital Quality Incentive Demonstration project, billed as the largest federally sponsored P4P program to date in the United States. These were our tax dollars at work. Bonuses just for the first two years of this incentive program totaled $17.5 million, going to 123 hospitals the first year and 115 the second year.

Did you hear about their findings? Were they splashed across the media? They should have been, but the most helpful science rarely is.

Researchers with CRUSADE, a National Quality Improvement Initiative of the Duke Clinical Research Institute, examined the clinical outcomes of 10,325 heart attack patients (with acute non-ST segment elevation) at participating hospitals and compared them to 95,058 well-matched (in hospital size, academic affiliation, region, type of facility and specialized services available; and patient ages, BMI, ethnic/racial diversity, insurance, medical histories and problems, family histories, and presenting heart attack symptoms) controls. They found small improvements in two of the performance measures at the participating hospitals compared to controls, but there was no significant difference in patient outcomes. “Financial incentives had little incremental effect,” they concluded:

We did not find an association between P4P and mortality among hospitals participating...our 3-year observation period is long compared with most studies of quality improvement interventions, and the potential novelty of pay-for-performance among voluntary vanguard centers may have been anticipated to have had its greatest impact during this initial period....

The results of this study raise concerns about what magnitude of effect pay-for-performance programs should have to justify the administrative burden and potential unintended consequences of financial incentives.

As the chairman of cardiovascular medicine at the Cleveland Clinic, Dr. Steven E. Nissen, told the Wall Street Journal, on June 6th, these results “suggest we ought to slow down a minute before going into pay-for-performance.”

The second study, published last week in the Archives of Internal Medicine, examined data on 48,612 patients with heart failure from 259 hospitals participating in the Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients With Heart Failure (OPTIMIZE-HF). “OPTIMIZE-HF is the largest national hospital-based program dedicated to quality-of-care improvement for patients hospitalized with HF in the United States,” said the authors. Utilizing electronic medical records, they evaluated the effectiveness of nine performance measures: the 4 JCAHO (Joint Commission on Accreditation of Healthcare Organizations) performance measures (discharge instructions on weight management, activity and medication use; left ventricular function test; prescriptions for an angiotensin-converting enzyme inhibitor (ACEI); and smoking cessation counseling), plus the 5 OPTIMIZE-HF quality measures (prescriptions for a beta-blocker (bisoprolol fumarate, carvedilol, or metoprolol succinate); prescriptions for ACEI or angiotensin receptor blocker (ARB); prescriptions for aldosterone antagonist for LVSD; prescriptions for a statin; and use of anticoagulation medications for atrial fibrillation).

Only two measures saw any notable improvement over the two-year study period: discharge instructions counseling on weight, activity and medications (from 46.8% to 66.5%) and counseling for smoking (48.2% to 75.6%), but these were attributed to the use of preprinted discharge papers and checklists.

The other measures improved only slightly and several, such as ACEI and ARB prescriptions, did not change at all. Their computer model dredged through a mountain of data, looking for associations between their performance measures and improved outcome.

They concluded:

Hospitals participating in OPTIMIZE-HF demonstrated an increase in adherence to national guideline-recommended therapies over time. There were trends for reduction in in-hospital mortality, postdischarge death, and the combination of postdischarge death and rehospitalization ...The use of PrCI tools was positively associated with increased adherence to JCAHO core performance measures and improvements in in-hospital mortality rates. The results of OPTIMIZE-HF and other health care–improvement programs demonstrate that the quality of care provided to patients with cardiovascular disease can be enhanced by the use of patient data submission and performance feedback, concentrating on specific processes of care proven to improve outcomes.

According to the conclusions and the abstract, the country’s largest program for performance measures is a success. We’re led to believe that performance measures reduce deaths....

What the public didn’t hear was that none of those “trends for reduction” reached statistical significance: not post-discharge mortality, not rehospitalizations and not in-hospital mortality. For example: “During the program, in-hospital mortality rates improved slightly, from 3.5% to 3.4%, but the difference did not reach statistical significance.”

This “trend” was a mere 0.1 percentage point difference... from a computer.
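To get a feel for why a drop from 3.5% to 3.4% would not reach statistical significance, here is a rough sketch of a standard two-proportion z-test. The mortality rates and the 48,612 patient total come from the study as quoted above; the even split of patients into a "before" and an "after" group is purely a hypothetical assumption for illustration, as is the function name. The paper's own analysis was more involved, so treat this as back-of-the-envelope arithmetic, not a reproduction of the authors' statistics.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided z-test for the difference between two proportions."""
    # Pooled proportion under the null hypothesis of no real difference
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference between the two proportions
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION (hypothetical): the 48,612 patients are split evenly into
# an early group and a late group. The study did not report it this way.
n_before = n_after = 48612 // 2

z, p_value = two_proportion_z_test(0.035, n_before, 0.034, n_after)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

Under that assumed split, the test statistic falls well short of the conventional 1.96 cutoff, which is consistent with the authors' own statement that the change in in-hospital mortality was not statistically significant.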

This is another illustration that it’s important to learn to look past the doublespeak and not go by an abstract or take clauses out of context. And we certainly can’t take the media spin at face value.

The news has been reporting this study as proof of success for P4P measures, with headlining stories like “Hospitals Improve Heart Failure Outcome by Following National Guidelines” by MedPage Today and “U.S. Heart Failure Program Is Saving Lives” from Forbes. As Forbes reported:

“If similar improvements had occurred at hospitals nationwide, this would translate to 40,000 less deaths and 1.4 million costly hospital days eliminated per year,” principal investigator Dr. Gregg C. Fonarow, UCLA's Eliot Corday chair in cardiovascular medicine and science, director of the Ahmanson-UCLA Cardiomyopathy Center and professor of medicine at the David Geffen School of Medicine at UCLA, said in a prepared statement.

“Despite compelling scientific evidence and national guidelines for use of key life-prolonging agents and lifestyle changes, gaps exist in heart failure treatment,” Fonarow said. “We hope more hospitals will adopt this validated model for enhancing heart-failure patient care.”

Why the striking difference between the study’s actual findings (no improvement in patient outcomes and no statistical difference in mortality rates) and what the study authors are promoting? Perhaps the financial disclosure at the end of the paper, and who funded the project, including designing and conducting the OPTIMIZE-HF registry, data collection and statistical analysis, may provide a clue (**see below). It may also explain why performance measures invariably equate prescription drugs with better medical care.

Reality check

At a time when healthcare dollars and resources are limited, most of us hope they’ll be used judiciously in ways that will actually help people live longer, healthier lives. Instead, despite the scarcity of evidence in support of P4P programs (even for critically ill patients as in these studies, let alone the vast majority of healthy adults), they are being intensely promoted. AMNews reports this week that at least 34 states are planning 47 new P4P programs over the next two years. Doctors will be increasingly graded on how well they measure up and we will be, too.

When you see your doctor’s grades for “quality” of care, now you know what he or she is being tested on. A “bad” grade may just mean he or she doesn’t think that ordering all of those tests, or having you take a dozen pills for the rest of your life, is grounded in good science or best for you.

© 2007 Sandy Szwarc

** Financial Disclosure: Dr Fonarow has received research grants from Amgen, Biosite Inc, Bristol-Myers Squibb, Boston Scientific/Guidant, GlaxoSmithKline, Medtronic Inc, Merck & Co, Pfizer, Sanofi-Aventis, Scios Inc, and the National Institutes of Health (NIH); has been on the speakers' bureau or has received honoraria in the past 5 years from Amgen, AstraZeneca, Biosite Inc, Bristol-Myers Squibb, Boston Scientific/Guidant, GlaxoSmithKline, Kos, Medtronic Inc, Merck & Co, NitroMed, Pfizer, Sanofi-Aventis, Schering-Plough, Scios Inc, St Jude Medical, Takeda, and Wyeth; has been a consultant for Biosite Inc, Bristol-Myers Squibb, Boston Scientific/Guidant, GlaxoSmithKline, Medtronic Inc, Merck & Co, NitroMed, Orqis Medical, Pfizer, Sanofi-Aventis, Schering-Plough, Scios Inc, and Wyeth; and reports editorial board involvement with American Heart Journal, Circulation, Journal of Cardiac Failure, Journal of the American College of Cardiology, and Reviews of Cardiovascular Medicine. Dr Abraham has received research grants from Amgen, Biotronik, CHF Solutions, GlaxoSmithKline, Heart Failure Society of America, Medtronic Inc, Myogen, NIH, Orqis Medical, Otsuka Maryland Research Institute, Paracor, and Scios Inc; has been a consultant or on the speakers' bureau for Amgen, AstraZeneca, Boehringer-Ingelheim, CHF Solutions, GlaxoSmithKline, Guidant, Medtronic Inc, Merck & Co, Pfizer, ResMed, Respironics, Scios Inc, and St Jude Medical; is on the advisory board of CardioKine, CardioKinetix Inc, CHF Solutions, Department of Veterans Affairs Cooperative Studies Program, Inovise, NIH, and Savacor Inc; has received honoraria from AstraZeneca, Boehringer-Ingelheim, GlaxoSmithKline, Guidant, Medtronic Inc, Merck & Co, Pfizer, ResMed, Respironics, Scios Inc, and St Jude Medical; and reports editorial board involvement with Congestive Heart Failure, Current Cardiology Reviews, Current Heart Failure Reports, Expert Review of Cardiovascular Therapy, Journal Watch Cardiology, PACE–Pacing 
and Clinical Electrophysiology, The American Heart Hospital Journal, and The Journal of Heart Failure. Dr Albert is a consultant for GlaxoSmithKline and Medtronic Inc; is on the speakers' bureau for GlaxoSmithKline, Medtronic Inc, NitroMed, and Scios Inc; is employed by the Cleveland Clinic Foundation; and reports editorial board involvement with Progress in Cardiovascular Nursing (senior editor), Journal of Cardiovascular Nursing, and Critical Care Nurse. Dr Gattis Stough has received research grants from Actelion, GlaxoSmithKline, Medtronic Inc, Otsuka, and Pfizer; is a consultant or on the speakers' bureau for Abbott, AstraZeneca, GlaxoSmithKline, Medtronic Inc, Novacardia, Otsuka, Protein Design Labs, RenaMed, Sigma Tau, and Scios Inc; and has received honoraria from Abbott, AstraZeneca, GlaxoSmithKline, Medtronic Inc, and Pfizer. Dr Gheorghiade has received research grants from NIH, Otsuka, Sigma Tau, Merck & Co, and Scios Inc; has been a consultant for Debbio Pharm, Errekappa Terapeutici, GlaxoSmithKline, Protein Design Labs, and Medtronic Inc; has received honoraria from Abbott, AstraZeneca, GlaxoSmithKline, Medtronic Inc, Otsuka, Protein Design Lab, Scios Inc, and Sigma Tau; and reports editorial board involvement with Acute Cardiac Care Journal (associate editor), American Heart Journal, American Journal of Therapeutics (associate editor), Archives for Chest Disease (associate editor), Current Cardiology Reviews, Expert Review of Cardiovascular Therapy, Heart Disease: A Journal of Cardiovascular Medicine, Heart Failure Reviews, Heart International, Journal of Cardiac Failure, Journal of the American College of Cardiology, Italian Heart Journal, The American Journal of Cardiology, The Journal of Heart Disease, and The Journal of Heart Failure. 
Dr Greenberg has received research grant support from Amgen, Cardiodynamics, GlaxoSmithKline, Millennium, Novacardia, Otsuka, Pfizer, Sanofi-Aventis, and Titan; is on the speakers' bureau or is a consultant for Amgen, AstraZeneca, GlaxoSmithKline, Guidant Corp, Medtronic Inc, Merck & Co, NitroMed, Pfizer, Remon Medical Technologies, and Scios Inc; is an advisory board member for CHF Solutions, GlaxoSmithKline, and NitroMed; has received honoraria from AstraZeneca, GlaxoSmithKline, Medtronic Inc, Merck & Co, NitroMed, Novartis, Pfizer, and Scios Inc; and reports editorial board involvement with Congestive Heart Failure and Journal of the American College of Cardiology. Dr O’Connor has received research grant support from NIH; is on the speakers' bureau and/or is a consultant for Amgen, AstraZeneca, Bristol-Myers Squibb, GlaxoSmithKline, Guidant, Medtronic Inc, Merck, NitroMed, Novartis, Otsuka, Pfizer, and Scios Inc; and has received honoraria from GlaxoSmithKline, Pfizer, and Otsuka. Mss Pieper and Sun are employees of Duke Clinical Research Institute (DCRI). Dr Yancy has received research grants from Cardiodynamics, GlaxoSmithKline, Scios Inc, Medtronic Inc, and NitroMed; is a consultant or on the speakers' bureau for AstraZeneca, Cardiodynamics, GlaxoSmithKline, Medtronic Inc, NitroMed, Novartis, and Scios Inc; is on the advisory board for CHF Solutions, the Food and Drug Administration cardiovascular device panel, and NIH; has received honoraria from AstraZeneca, Cardiodynamics, GlaxoSmithKline, Medtronic Inc, Novartis, and Scios Inc; and reports editorial board involvement with Circulation (guest editor), Congestive Heart Failure, Current Heart Failure Reports, Journal of Acute Cardiac Care, Journal of Urban Cardiology, and The American Heart Journal. 
Dr Young has received research grants from Abbott, Acorn, Amgen, Artesion Therapeutics, AstraZeneca, Biosite Inc, GlaxoSmithKline, Guidant, Medtronic Inc, MicroMed, NIH, Scios Inc, Vasogen, and World Heart; is a consultant for Abbott, Acorn, Amgen, Biomax Canada, Biosite Inc, Boehringer-Ingelheim, Bristol-Myers Squibb, Cotherix, Edwards Lifescience, GlaxoSmithKline, Guidant, Medtronic Inc, MicroMed, Novartis, Paracor, Proctor & Gamble, Protemix, Scios Inc, Sunshine, Thoratec, Transworld Medical Corporation, Vasogen, Viacor, and World Heart; and reports editorial board involvement with Journal of Heart and Lung Transplantation, Evidence-Based Medicine, Journal of the American College of Cardiology, American Heart Journal, Cleveland Clinic Journal of Medicine, Cardiology Today, Graft, TheHeart.org, Transplantation and Immunology Letter, and American Society of Transplantation Newsletter.

Funding/Support: This study was supported by GlaxoSmithKline.
Role of the Sponsor: GlaxoSmithKline was involved in the design and conduct of the OPTIMIZE-HF registry and funded data collection and management through Outcome Sciences, Inc, and data management and statistical analyses through DCRI. The sponsor was not involved in the management, analysis, or interpretation of data or the preparation of the manuscript. GlaxoSmithKline reviewed the manuscript before submission.
