Bad science soaring high

Vulture populations have declined, but bad science has taken their place, soaring high.

Science is supposed to have a set of principles, and researchers are expected to follow them. Further, science is said to be self-correcting, and open-minded debate is essential for that self-correction to happen. The human mind has multiple inherent biases, and scientists are no exception. Conscious effort is therefore needed to overcome these biases and follow the principles of science. This is often difficult for an individual to ensure alone, and the opinions of others help. But much of the science being published today does not seem to believe in this. On top of the inherent biases, there are bad incentives created by publish-or-perish pressure, bibliometric indices, rankings and the like. For several reasons, peer review fails to provide the expected corrective input, and often it actually adds to the biases. An open peer review system with open debate is the only thing I can see that might minimize, if not eliminate, bad science.

There is a vulture-related story of bad science soaring high, but before describing it let me illustrate the background with some anecdotes. Bad statistics, flawed logic, badly designed experiments, inappropriate analyses and cooked-up data are all too common in papers in high-impact journals coming from elite institutions. Because of the prestige of the journals and institutions, the bad science coming from them enjoys impunity, and any cross-questions or challenges are suppressed by all means. Here are a few of my own experiences.

In 2010 a paper appeared in Current Biology from a Harvard group. A little later the corresponding author emailed me asking for my opinion, because the paper contradicted our earlier PNAS paper. I was happy, because adversarial collaboration (if not collaboration, at least a dialogue) is good for science. So a series of email exchanges began. At some stage I said I would like to have a look at their raw data, and they had no problem sharing it. The data set was huge, and they had used a program written specifically to analyze it. We started looking at it manually, although that appeared to be an impossible task. Very soon we discovered that there were too many flaws in the data itself. The program was apparently faulty; it was picking up wrong signals and therefore reaching wrong conclusions. We raised a number of questions and asked the Harvard group for explanations. At this stage they suddenly stopped all communication. They had taken the initiative in starting a dialogue, but when their own flaws started surfacing, they went silent. We got no explanation for the apparent irregularities in the data. Interested readers will find the original email exchanges here (https://drive.google.com/file/d/164Jo15ydGgmCL4XvAwvivpnYtjoAagMQ/view?usp=drive_link ). At that time PubPeer and other platforms for raising such issues were not established, retraction was very rare, and we did not think of these possibilities, so we did not pursue the case with the journal for retraction. The obviously flawed paper remains unquestioned to this day.

In 2017 a paper appeared in Science claiming that cancer is purely bad luck, implying that nothing can be done to prevent it. This came from a celebrity in cancer research, but it contained a stupid fundamental mistake. The backbone of the argument was a correlation across different tissues between the log number of stem cell divisions and the log incidence of cancer. The authors said that a linear relationship between the two indicates that only the probability of mutation matters. The problem with this argument is that a linear regression on a log-log plot implies a linear relationship between the original variables only if the slope is close to 1. Their slope was far from 1, so the data actually showed a non-linear relationship, but they continued to argue as if the relationship were linear. Later, using the same data set, we showed that cancers are not mutation limited but selection limited, an inference diametrically opposite to theirs (https://www.nature.com/articles/s41598-020-61046-7 ). But we had a hard time publishing this, because we were directly challenging a giant in the field.
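The slope point is easy to demonstrate with a small simulation (my own sketch with made-up numbers, not their data): an exact power law y = x^b is a perfect straight line on log-log axes for any b, yet on the raw scale y is proportional to x only when b = 1.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    cov = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    var = sum((u - mx) ** 2 for u in lx)
    return cov / var

xs = [1 + 0.5 * i for i in range(199)]   # 1.0 .. 100.0

for b in (1.0, 0.5):
    ys = [x ** b for x in xs]            # exact power law: a perfect line on log-log axes
    slope = loglog_slope(xs, ys)
    # On the raw scale, doubling x multiplies y by 2**b,
    # so y is proportional to x only when b == 1
    print(f"b={b}: log-log slope = {slope:.2f}, y(2x)/y(x) = {2 ** b:.2f}")
```

In both cases the log-log fit is perfect, but only a slope near 1 licenses the conclusion that the raw variables are linearly related.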

A long-standing belief is that in type 2 diabetes, controlling blood sugar arrests diabetic complications. Going by the raw data, we do not find convincing support for this notion in any clinical trial. Yet many published papers keep claiming it repeatedly, and to do so they need to violate many well-known principles of statistics, which they do coolly while publishing in journals of high repute. We challenged this collectively (https://www.qeios.com/read/IH7KEP) and also specifically challenged two recent publications that had obviously twisted data, one in The Lancet Diabetes & Endocrinology and the other in PLOS Medicine. The editors of Lancet D and E refused to publish our comments (first without giving reasons; on insisting, they gave very flimsy and illogical ones), and the other case is still pending. We then opened a dialogue on PubPeer, to which the authors have not responded. The reasons the Lancet D and E reviewer gave for rejecting our letter are so stupid that they cannot offer the same defense on PubPeer, because it is open. In confidential peer review, reviewers don't have to be logical; they can easily get away with illogical statements, as this case demonstrates well. The entire correspondence is available here (https://drive.google.com/file/d/1XNzxif4ybJdgAQ4YmiKg_6mSqZGh2Mn1/view?usp=drive_link ).

The case of whether lockdowns helped arrest the spread of infection during the Covid-19 pandemic is funnier. Just a few months before the pandemic, the WHO had published a report, based on multiple independent studies, on the extent to which measures such as closing schools or offices and travel bans can arrest the transmission of respiratory infections (https://www.who.int/publications/i/item/non-pharmaceutical-public-health-measuresfor-mitigating-the-risk-and-impact-of-epidemic-and-pandemic-influenza ). The report clearly concludes that such measures are ineffective and therefore not recommended. Yet within just a few months of the pandemic beginning, many leading journals published papers showing that lockdowns are effective. The entire tide turned in just a few months. All the hurriedly published papers have several flaws that nobody pointed out, for fear of being politically incorrect. Our analysis, on the other hand, indicated that lockdowns were hardly effective in arresting transmission (https://www.currentscience.ac.in/Volumes/122/09/1081.pdf ).

That repeated waves of infectious disease are caused by new viral variants is a common belief, but it has never been tested by rejecting the null hypothesis that a wave and a variant arise independently and become associated by chance alone (https://milindwatve.in/2024/01/05/covid-19-pandemic-what-they-believe-and-what-data-actually-show/ ). Our ongoing attempts to reject this null hypothesis with respect to Covid-19 data have so far failed. In the absence of a rejected null hypothesis, the claim that repeated surges are caused by newly arising variants is no more than a religious belief. But it still constitutes the mainstream thinking in the field.
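One simple way to frame such a null hypothesis is a randomization test. The sketch below uses hypothetical timings that I have made up purely for illustration (not real surveillance data): it counts how many waves were preceded by a variant within a short window, then asks how often random, wave-independent variant timings would match at least as well.

```python
import random

random.seed(1)

# Hypothetical timings in weeks since the start of a pandemic (illustrative only)
wave_onsets = [10, 45, 80, 120]
variant_emergence = [8, 42, 78, 118, 60, 95]
WINDOW = 4  # a variant "precedes" a wave if it emerges within 4 weeks before onset

def n_matched(waves, variants, window):
    """Number of waves preceded by at least one variant within the window."""
    return sum(any(0 <= w - v <= window for v in variants) for w in waves)

observed = n_matched(wave_onsets, variant_emergence, WINDOW)

# Null model: variants emerge at random times, independently of the waves
null_counts = [
    n_matched(wave_onsets,
              [random.uniform(0, 130) for _ in variant_emergence], WINDOW)
    for _ in range(10_000)
]
p_value = sum(c >= observed for c in null_counts) / len(null_counts)
print(f"waves matched by a variant: {observed} of {len(wave_onsets)}, p = {p_value:.4f}")
```

Only if random timings rarely match as well as the observed ones can the chance-association null be rejected; until then, wave-variant association remains an untested assumption.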

I suspect this is happening rampantly in many more areas today. My sample size is small and obviously restricted to a few fields of my interest, but within that small sample I keep coming across so many examples of bad science published in leading journals that I have not yet been able to compile them all. Other fields where statistics is commonly being misused include the clinical trials of the new anti-obesity drugs, the GLP-1 receptor agonists; all the flaws of the diabetes clinical trials are present in these papers too. Worse still is the debate on what constitutes a good diet for keeping away diabetes, hypertension, CVD and the like. Perhaps the same is happening in many papers on the effects of climate change, the effect of social media on mental health, human-wildlife conflict and many other sentimentally driven issues.

Added to this series now is the paper by Frank and Sudarshan in American Economic Review (https://www.aeaweb.org/articles?id=10.1257/aer.20230016 ), which claims that the decline of vulture populations in India caused half a million excess human deaths during 2000-2005 alone. The paper was given news coverage by Science (https://www.science.org/content/article/loss-india-s-vultures-may-have-led-deaths-half-million-people ) months before its publication, and because Science covered it, it became a headline in global media. The claim has perfect headline-hunting properties, and even Science falls prey to this temptation. Interestingly, the data on which the claim is based were criticized by Science itself a couple of years ago: when used for estimating Covid-19 deaths in India, the death-record data were dubbed unreliable, but now the same data become reliable when the inference can make attractive headlines. The other data sets used by the authors are equally unreliable, if not more so, and the methods of analysis are also questionable. Everything put together makes a good case for intentional misleading, and all the reasons why I say this are available here (https://www.qeios.com/read/K0SBDO ).

What is more interesting is the way it happened. On realizing that the analysis in this paper had multiple questionable elements, we looked at the journal in which it was to be published. Most journals have a section called letters or correspondence where readers can cross-question, raise issues or otherwise comment on a published article. American Economic Review, where this paper was accepted for publication, has no such norm. This is strange and very clearly against the spirit of science. Suppressing debate removes science's chance of being self-correcting, and in the absence of a self-correcting element, science is no different from religion. So, logically, AER is not a scientific journal but a religious one; let's accept that. Nevertheless, we wrote to the editor saying we wanted to raise certain issues. The editor advised us to correspond with the authors directly, so we wrote to the authors. Unlike Lancet D and E and the other examples mentioned above, in this case the authors were responsible enough to reply promptly. There were a few exchanges, at the end of which we agreed on a few things, but the main difference did not get resolved. This is fair; disagreement is a normal part of science. But if a difference of opinion remains, it is necessary that the arguments on both sides be made available to the reader.

The authors were also kind enough to share links to their raw data. The entire correspondence is here (https://drive.google.com/file/d/1d91UzBCMAY9Q3Nu5Yc5_Iqm7ycWPmTWR/view?usp=sharing ). We analyzed their data using a somewhat different but statistically sound approach and did not reach the same inference. It is normal that a given question can be addressed by more than one equally valid analytical approach; robust patterns give the same inference whichever approach is used. If a result is significant with one approach but not with another, the inference is inherently fragile, and that is what happened here. We tried again with AER. They asked us to pay 200 dollars as "submission fees", and that is the most interesting part of the story. The original peer review of the paper was obviously weak, because none of the issues we raised seems to have surfaced in it. I am sure the journal did not pay the reviewers. What we did was itself an independent peer review, and for this we were charged $200!! We paid, in the interest of science, although we knew the journal would not publish our comments; it was necessary to see on what basis it would reject them. Apparently one reviewer commented on our letter. We had raised about 11 issues, of which the reviewer mentions only 3, saying they are not convincing without giving any reasons why. There is no mention of the rest of the issues raised, obviously because they had no answers to them. This is clearly cherry picking. Where the reviewers themselves are cherry picking, what else can we expect from published articles? For interested readers, the entire correspondence related to the rejection is available here (https://drive.google.com/file/d/1mq2e8sKiYKUMNtUjQKnleaAPm3a8oDZO/view?usp=sharing ).
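The fragility point can be made concrete with a minimal sketch (made-up numbers, not the AER data): compare the same two samples using two equally defensible test statistics, difference of means and difference of medians, via a permutation test. A robust effect yields concordant p-values under both; a fragile one can be "significant" under one statistic and not the other.

```python
import random
from statistics import mean, median

random.seed(2)

def perm_pvalue(a, b, stat, n_perm=5000):
    """Two-sided permutation p-value for the statistic stat(a) - stat(b)."""
    observed = abs(stat(a) - stat(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        if abs(stat(pooled[:len(a)]) - stat(pooled[len(a):])) >= observed:
            count += 1
    return count / n_perm

# Hypothetical measurements for two groups (illustrative only)
a = [2.1, 2.5, 3.0, 2.8, 3.3, 2.9, 3.1, 2.4]
b = [2.0, 2.2, 2.6, 2.1, 2.7, 2.3, 1.9, 2.5]

p_mean = perm_pvalue(a, b, mean)      # one valid choice of test statistic
p_median = perm_pvalue(a, b, median)  # another, equally valid choice
print(f"p (difference of means)   = {p_mean:.3f}")
print(f"p (difference of medians) = {p_median:.3f}")
# If the two p-values straddle 0.05, "significance" hinges on an
# arbitrary analytical choice -- the hallmark of a fragile inference.
```

An inference worth publishing should survive any reasonable choice made at this step; one that flips with the choice of statistic is, at best, a hypothesis for further testing.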

The authors seem to have incorporated some corrections in response to our correspondence (without acknowledging us, but that is a minor issue), yet the main problem remained unresolved. We have now independently made our comments public on a platform that is open to responses from anyone. This openness is the primary requirement of science, and if it is being blocked, the entire purpose of publishing becomes questionable.

The importance of this incident goes well beyond the vulture argument. It is about the way science is practiced in academia. Throughout the stories above I am talking only about statistical mistakes that seriously challenge the inference; there can be many mistakes that do not change the major conclusions, and I am not talking about those. I am a small man with a very small reach. If in my casual exploration I came across so many cases of misleading science, how large must the real problem be?

Last year over 10,000 papers were retracted, and 2024 is likely to see a much larger number. The most commonly detected problem is image manipulation, and that is because methods for detecting image manipulation are more or less standardized. Detecting and exposing fraud has become one of the most important activities in science; the Einstein Foundation Award to Elisabeth Bik (https://www.einsteinfoundation.de/en/award ) for her relentless whistle-blowing endorses this fact. Scientists exposing bad science are most valuable to science. But the number of retractions is still the tip of the iceberg, and it covers only certain types of fraud. How many papers should be retracted for misleading statistics? How many for claiming causal inference without specific causal evidence? Nobody has any idea. A paper that appeared just last week shows that as many as 30% of papers published in Nature, Science and PNAS rest on statistical significance that is marginal and fragile (https://www.biorxiv.org/content/10.1101/2024.10.30.621016v1). Many others use unreliable or cherry-picked data, use indices or protocols inappropriate for the context, or ignore violations of the assumptions behind a statistical test. Adding it all together, my wild guess is that two-thirds to three-fourths of the science published in top-ranking journals (perhaps more) will turn out to be useless. I am not the first to say this; it has been said multiple times by influential and well-placed scientists (for example https://pmc.ncbi.nlm.nih.gov/articles/PMC1182327/ ).

In spite of this dire state of affairs, bad science continues to get published in big journals and to come from elite institutions. Why does this happen? I think the reason is that career and personal success overpower the interest in science. Scientists are more interested in having high-impact publications to their names than in revealing the nature of reality, addressing interesting questions, ensuring robust and reproducible outcomes, and seeking really useful solutions. Headline hunting takes priority over gaining true insights, and this they pursue at any cost and by any means. Even flagship journals like Science seem more interested in making headlines than in supporting real science.

The other question is why readers get fooled so easily. I think two factors are involved. First, academics themselves have no incentive to check reproducibility, access raw data, do independent analyses, and so on. This type of work does not yield high-impact, prestigious publications and is therefore a waste of time for them; whether the quality of science suffers is no longer their concern. What academics will never do can be done by citizen scientists, but academic publishers do anything and everything to deter them from being vigilant. Look at the behaviour of AER, which charged us $200 for what turned out to be a futile attempt to cross-question misleading science. Citizen scientists are by default treated as an inferior or untouchable caste. The individual cost-benefit of initiating a healthy debate is completely discouraging. As a result, science is viewed as a monopoly of academics, and others are effectively kept away. The net result is that science is losing trust. I no longer consider science published in big journals and coming from elite institutions reliable without looking at the raw data. But it is not possible to look at raw data every time, so I advise my readers simply to stop believing published science, giving it only as much importance as social media posts. Academia treats peer-reviewed articles as validated, however biased and irresponsible the peer reviews may be. This is a matter of faith, just like believing the water of the Ganges to be holy, however polluted it may be.

I will end this article with a note of gratitude. I am extremely thankful to those who published bad science, especially all the examples above, which I could study in detail. I am primarily a science teacher, and teaching the fundamental principles and methods of science is my primary job. Illustrating with real-life examples is the most effective way of teaching, and all the papers mentioned above have given me live examples of how not to do science. I share these papers to make my students realize this. I hope at least the next generation of researchers receives this training at an early stage.
