Have you heard? New dietary guidelines are out! Do you care? Probably not much. Should you? I’m honestly not sure. Even I have had trouble keeping track of what is in and what is out of the current guidelines, given how much we’ve fought over them in the media. More importantly, we experts keep changing our minds about what we think you should eat. In large part, I think it boils down to the fact that good science is hard to do, especially science that applies to large populations of healthy people.
I’ve discussed the conflict of interest in being an expert in a previous post. So let’s set aside the issues inherent in our current medical system, in which the public has been given the expectation and so demands that we experts keep everyone indefinitely healthy, and focus on medical science itself.
It is relatively easy to make observations in biomedical science, and even easier now with the adoption of electronic medical records. Stored data can be analyzed in huge quantities to find correlations. But while correlations are interesting, and highly instructive in developing theories, they cannot reliably determine cause and effect.
Here’s an example of a misleading correlation: I hate buckling my seat belt on an airplane when the seat belt sign goes on in midflight. Whenever I do, the ride becomes rough. It happens 100% of the time, so I must be causing it. Ridiculous, of course, but not far from the kind of logic used to infer causality from correlation.
Let’s look to government statistics for another famous example. Combining figures from the US Department of Transportation and the US Department of Agriculture, it is immediately clear that the importation of lemons is fully responsible for a dramatic reduction in traffic fatalities.[1]
Similarly ridiculous. But these examples are only ridiculous because we know the underlying facts. Correlations like these have led science astray again and again, until they were finally properly tested.
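The lemon example is easy to reproduce with entirely invented numbers. Below is a short Python sketch (all data synthetic, made up purely for illustration) showing that two series that merely trend over the same years correlate almost perfectly, with no causal link at all:

```python
# Synthetic illustration: "lemon imports" rising and "traffic fatalities"
# falling over the same years correlate strongly even though neither
# causes the other. Every number here is invented.
import random

random.seed(0)
years = range(1996, 2008)
lemon_imports = [200 + 25 * i + random.gauss(0, 10) for i, _ in enumerate(years)]
fatalities = [16.0 - 0.5 * i + random.gauss(0, 0.3) for i, _ in enumerate(years)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson(lemon_imports, fatalities)
print(f"correlation: {r:.2f}")  # strongly negative, yet no causation
```

Any two quantities that happen to drift in opposite directions over time will produce a correlation like this, which is exactly why trends over time are such fertile ground for spurious findings.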
There are a number of examples from my specialty, nutrition. Fiber intake is associated with a lower colon cancer rate, but supplementation of fiber has no effect. Worse, dietary antioxidant intake is associated with lower prostate cancer incidence, but supplementation of vitamin E or selenium increases prostate cancer incidence.
To establish cause and effect, studies must be randomized: a group of subjects must agree to randomly receive one treatment or another (or a placebo). If the test population is close to identical, such as a group of syngeneic (i.e., genetically identical) lab rats, and the difference in effect between the groups is large, only a small number of subjects is required. But if the subject population is heterogeneous or the effect size small, as in most human studies, the number of subjects must be huge for the results to be reliable. The statistical measure of this reliability is called “power”. When a trial is underpowered, its results have a strong possibility of being due to chance or design bias, and cannot be fully trusted. In human nutrition studies, tens to hundreds of thousands of subjects are required to adhere to the assigned diets for adequate power. And because the effects we are looking for take years to accrue, subjects must remain fully compliant for decades before an effect can be measured.
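The arithmetic behind this explosion in sample size can be sketched with the standard normal-approximation formula for comparing two proportions. The event rates below are illustrative assumptions, not figures from any real study:

```python
# Rough sketch of why small effects demand huge trials: required subjects
# per arm for a two-proportion comparison, using the usual normal
# approximation. Event rates are invented for illustration.
import math

def n_per_arm(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Subjects per arm for 80% power at two-sided alpha = 0.05."""
    p_bar = (p1 + p2) / 2
    num = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# A large effect (5% vs 2.5% event rate) needs roughly 900 per arm...
print(n_per_arm(0.05, 0.025))
# ...but a subtle dietary effect (2.0% vs 1.7%) needs over 30,000 per arm.
print(n_per_arm(0.020, 0.017))
```

The 1.96 and 0.84 are the standard normal quantiles for two-sided α = 0.05 and 80% power. Real trial designs must also budget for dropout and non-adherence, which only pushes these numbers higher.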
For a study to prove anything, the subjects have to actually adhere to the intervention. This is often problematic, and dietary studies are notorious for it: in many, only a small percentage of subjects stick to the assigned diet. Only now is adherence itself becoming a mainstream subject of scientific study.
Even once a randomized trial has demonstrated an effect, its results may be unusable. There are many examples where the way the therapy was administered, or the context in which the study was conducted, makes it impracticable or inappropriate to apply the results. A clinical trial may give clinicians an extraordinary level of support to ensure that a highly complex intervention is delivered successfully. That level of support may not be practicable in a typical clinical setting, and the intervention is of no value if it is unavailable outside of the study.
Randomized trials also tend to study a fairly homogeneous group to minimize the required sample size. But by narrowing the qualifications to enter the study, the subjects enrolled often bear little resemblance to patients in the real world.
Much of our research, as a result, is done in small, underpowered studies. To compensate, we aggregate the data using techniques such as meta-analysis, which combines the results of multiple studies in a new, larger statistical model. Often, a meta-analysis finds statistically significant results where the individual studies did not; it may even conclude the opposite of some of the studies within it by reporting the aggregate. That a small study comes up with the “wrong” result is not at all surprising; this is exactly why adequate power is needed. While far from perfect, and subject to biases such as those created by the selection of which studies to include, meta-analysis is a powerful tool to help guide clinical care in the absence of large, well-designed randomized studies.
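The pooling arithmetic can be seen in a toy fixed-effect (inverse-variance) meta-analysis. The five “studies” below are invented numbers chosen to show the phenomenon: each small study is too imprecise to reach significance on its own, yet the pooled estimate is significant:

```python
# Toy fixed-effect (inverse-variance) meta-analysis on invented data:
# five small studies, none individually significant, combine into a
# significant pooled estimate.
import math

# (effect estimate, standard error) for five hypothetical small studies
studies = [(-0.22, 0.16), (-0.15, 0.18), (-0.25, 0.17),
           (-0.18, 0.15), (-0.21, 0.19)]

def pooled(studies):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1 / se ** 2 for _, se in studies]
    est = sum(w * e for (e, _), w in zip(studies, weights)) / sum(weights)
    se = 1 / math.sqrt(sum(weights))
    return est, se

for e, s in studies:
    print(f"study z = {e / s:+.2f}")   # each |z| < 1.96: not significant
est, se = pooled(studies)
print(f"pooled z = {est / se:+.2f}")   # |z| > 1.96: significant
```

A real meta-analysis would also examine heterogeneity and typically consider a random-effects model; this sketch shows only the basic pooling arithmetic, and the choice of which studies enter the pool is exactly where selection bias creeps in.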
Properly designed, large, prospective randomized studies are enormously expensive, and research into behavioral interventions that are effective and durable is only beginning to bear fruit. Should you believe the next earth-shattering study? First, read the fine print.
[1] Original source unknown. Earliest reference found at: