Think Gene

a bio blog about genetics, genomics, and biotechnology

“Tweaking” Experimental Data

Earlier today, I read a blog post by Mark Chu-Carroll titled “Selective Data and Global Warming.” The post primarily concerns a global warming “denialist,” Michael Duffy, who dishonestly presented global climate data to force it to fit his anti-global warming agenda. [1]

While reading it, I couldn’t help but be reminded that this type of dishonesty happens all the time in science. Most often, scientific experiments do not give simple conclusive results. The data must be “interpreted,” and statistical methods must be “applied.” I’ve seen cases where researchers sat and “tweaked” the statistics to favor their hypothesis with the same aggressive dishonesty as this global warming denialist.

Software for real-time PCR machines is a perfect example of how deeply the dishonest representation of data has become embedded in the industry of science. Most real-time PCR software allows you to adjust parameters in the data interpretation. Why? While initial results may not support your hypothesis, the software makes it trivial to “play around” until the data fits. The data itself is not changed, merely its interpretation. To avoid this problem, experiments should be repeated in different ways to ensure the interpreted results are the actual results. If several different runs with different controls and different samples are performed, the real results cannot be hidden with these manipulations. However, in my experience, experiments are only repeated if the results are not as expected.
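To make the point about adjustable parameters concrete, here is a minimal sketch in Python. It is a toy model, not any vendor's software: the logistic curve shape, the two hypothetical samples, and the function names are all assumptions made for illustration. It reads a threshold cycle (Ct) off two fixed amplification curves and converts the Ct difference to an apparent abundance ratio, assuming perfect doubling per cycle.

```python
import math

def amplification_curve(c_half, slope, cycles=40):
    """Toy logistic fluorescence curve, one value per cycle (cycle 1..cycles)."""
    return [1.0 / (1.0 + math.exp(-slope * (c - c_half)))
            for c in range(1, cycles + 1)]

def ct_at_threshold(curve, threshold):
    """Fractional cycle where the curve first crosses the threshold."""
    for i in range(1, len(curve)):
        lo, hi = curve[i - 1], curve[i]
        if lo < threshold <= hi:
            # curve[i - 1] is cycle i, so the crossing lies between i and i + 1.
            return i + (threshold - lo) / (hi - lo)
    raise ValueError("curve never crosses the threshold")

# Two hypothetical samples whose curves rise at different rates.
# The raw "data" below never changes; only the threshold does.
SAMPLE_A = amplification_curve(c_half=24, slope=0.60)
SAMPLE_B = amplification_curve(c_half=25, slope=0.45)

def apparent_ratio(threshold):
    """Apparent abundance of A relative to B, assuming perfect doubling."""
    ct_a = ct_at_threshold(SAMPLE_A, threshold)
    ct_b = ct_at_threshold(SAMPLE_B, threshold)
    return 2.0 ** (ct_b - ct_a)

for threshold in (0.05, 0.2, 0.5):
    print(f"threshold {threshold:.2f}: A/B ratio = {apparent_ratio(threshold):.2f}")
```

In this toy setup, a low threshold makes sample A look less abundant than sample B, while a threshold at the curves' midpoint makes A look more abundant. The measurements are identical in both cases; only the interpretation parameter moved.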

Experiments don’t always work. So, experiments are supposed to be repeated several times with multiple levels of controls to ensure the reliability and accuracy of the results. However, I know far too many scientists who will take the first experimental results that match what they want, and then never repeat the experiment again. Or, if an experiment produces the expected results in only 10% of runs, scientists will simply report the “good” results and ignore the rest, sometimes claiming “there must have been an error for 90% of the runs.” They are not faking data, but they are selecting the data they want rather than upholding their scientific obligation to report reality without bias.
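The effect of keeping only the “good” 10% of runs can be sketched with a small simulation. Everything here is an assumption made for illustration: the group size, the noise model, and the function names. Each simulated run compares a treated and a control group drawn from the same distribution, so the true effect is zero, and only the runs with the largest apparent effect are “reported.”

```python
import random
import statistics

def run_experiment(rng, n=5, true_effect=0.0):
    """One simulated run: difference between treated and control group means."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    return statistics.mean(treated) - statistics.mean(control)

def simulate(num_runs=1000, keep_fraction=0.10, seed=0):
    """Return (mean effect over all runs, mean effect over the kept runs)."""
    rng = random.Random(seed)
    effects = [run_experiment(rng) for _ in range(num_runs)]
    # "Good" runs: the fraction that best matches the hoped-for positive effect.
    kept = sorted(effects, reverse=True)[: int(num_runs * keep_fraction)]
    return statistics.mean(effects), statistics.mean(kept)

all_mean, reported_mean = simulate()
print(f"mean effect over all runs:  {all_mean:+.2f}")
print(f"mean effect over kept runs: {reported_mean:+.2f}")
```

No individual number is faked, yet the kept runs show a clearly positive effect while the full set of runs averages close to zero. The bias comes entirely from the selection step.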

The greatest enemy of data integrity is Photoshop. Every scientist knows how to use Photoshop. It’s needed for many legitimate purposes, such as preparing photos for publication. Unfortunately, it too is used to dishonestly manipulate data. For example, Photoshop can make a band on an agarose gel seem darker or lighter, or even combine different experiments into one image. While these manipulations are sometimes perfectly acceptable, they also allow results to be mixed and matched to fit the hypothesis. There is simply no way to know from the final images whether they were manipulated honestly, or manipulated at all.

So why do scientists “tweak” their data? Maybe vanity, or arrogance, but I think the real problem stems from the nature of scientific funding and the incessant pressure to publish, publish, publish. The livelihood of many scientists, especially those in the biological sciences, depends on NIH grants and other applicant funding. To get these grants and earn university tenure, scientists must show progress, and progress is measured in published papers. However, wrong hypotheses don’t produce published papers; only right ones do. So if a scientist spends a year investigating a hypothesis, and it turns out that the data doesn’t support it, he often faces a choice: publish or starve. So, the data is made to fit.

If these practices continue, it will seriously hurt scientific progress. While many scientists do follow correct practices and don’t manipulate their data or its interpretations, there unfortunately are also many who do.

[1] Mark refers to a post by Tim Lambert, which in turn refers to Michael Duffy at the Sydney Morning Herald.

6 Comments

  1. Andy said,
    May 22, 2008 @ 3:26 am

    I don’t think it will escalate so disastrously.

    The more revolutionary the result, the more attention it will attract, and the more people will attempt to reproduce it and build on it (i.e., the quicker any deception will become apparent).

    Thus, politics and competition push scientists to the unethical, and reality pushes them back!

  2. Josh Hill said,
    May 22, 2008 @ 3:29 am

    I agree. However, there’s still a lot of wasted money and resources, not to mention the slowing of scientific development.

  3. U.Penn Biologist caught photoshoping cultures to fake results | Think Gene said,
    May 29, 2008 @ 11:02 pm

    [...] week, Think Gene published an editorial (“Tweaking” Experimental Data, Josh Hill, 5.20.08 ) about the gross prevalence of “photoshopping” experimental data [...]

  4. Dave Eaton said,
    June 5, 2008 @ 8:58 pm

    Global Warming research is likely to get even more contentious. There have been many adjustments made to past data recently, ‘adjustments’ that may well be completely justified, but also need to be watched very carefully to make certain that grant-hungry researchers don’t try to make chicken soup out of chicken poop. Bad data and dubious arguments that lead to the right results are not science any more than data cooked by ‘denialists’.

  5. “The Pornography of Medicine” | Think Gene said,
    June 7, 2008 @ 5:06 pm

    [...] of research against the unrelenting pressure to pass nothing as something. This is something we think about a lot here at Think [...]

  6. John C said,
    June 14, 2008 @ 4:49 am

    While working at a Stanford genome lab, I discovered with amazement what LabVIEW image processing tools can do to microarray data.

    I had at that time seen a lot of very nice microarray data produced using open-air spotters, as opposed to the then more expensive lab-on-a-chip type closed-system Affymetrix chip.

    I simply brought things into sharp focus and presented the God’s truth at an instrumentation conference, dust fibers connecting the microarray dots and all. I exhorted that if anyone wants to do things on the cheap, it had better be in clean rooms of the quality used by computer chip manufacturers (quite expensive).

    This made Affymetrix happy, but only for a short while. Two physicists, who had developed a better lab-on-a-chip scheme than Affymetrix, finally got their startup funding right after this conference!
    They were happier, especially when this all brought the cost of Affymetrix microarraying down; their idea used some of Affymetrix’s technology, which in turn cut their costs.

    As for all that other nice looking data … oh! Never mind!
