Bad science soaring high

Vulture populations have declined, but bad science has taken their place, soaring high.

Science is supposed to have a set of principles, and researchers are expected to follow them. Further, science is said to be self-correcting, and open-minded debate is important for that self-correction to happen. The human mind has multiple inherent biases and scientists are no exception. Conscious effort is therefore needed to overcome the biases and follow the principles of science. It is often difficult for an individual to ensure this alone, and the opinions of others help. But much of the science being published today does not seem to believe in this. On top of the inherent biases there are bad incentives created by publish-or-perish pressure, bibliometric indices, rankings and the like. For several reasons peer review fails to give the expected corrective inputs, and often it actually adds to the biases. An open peer review system and open debate are the only things I can see that might minimize, if not eliminate, bad science.

There is a vulture-related story of bad science soaring high, but before describing it let me illustrate the background with some anecdotes. Bad statistics, flawed logic, badly designed experiments, inappropriate analysis and cooked-up data are all too common in papers in high-impact journals coming from elite institutions. Because of the high prestige of the journals and institutions, bad science coming from there enjoys impunity. Any cross-questions or challenges are suppressed by all means. Here are a few of my own experiences.

In 2010 a paper appeared in Current Biology from a Harvard group. A little later the corresponding author emailed me asking my opinion, because the paper contradicted our earlier PNAS paper. I was happy, because adversarial collaboration (if not collaboration, at least a dialogue) is good for science. So a series of email exchanges began. At some stage I asked to have a look at their raw data, and they had no problem sharing it. It was huge, and they had used a program written specifically to analyze it. We started looking at it manually, although that appeared to be an impossible task. But very soon we discovered that there were too many flaws in the data itself. The program was apparently faulty, picking up wrong signals and therefore reaching wrong conclusions. We raised a number of questions and asked the Harvard group for explanations. At this stage they suddenly stopped all communication. They had taken the initiative in starting a dialogue, but when flaws in their own work started surfacing, they went silent. We got no explanation for the apparent irregularities in the data. Interested readers will find the original email exchanges here (https://drive.google.com/file/d/164Jo15ydGgmCL4XvAwvivpnYtjoAagMQ/view?usp=drive_link). At that time PubPeer and other platforms for raising such issues were not established, retraction was very rare, and we did not think of these possibilities. We did not pursue the case with the journal for retraction. The obviously flawed paper remains unquestioned to this day.

In 2017 a paper appeared in Science claiming that cancer is purely bad luck, implying that nothing can be done to prevent it. It came from a celebrity in cancer research, but it contained a fundamental and rather silly mistake. The backbone of their argument was a correlation across different tissues between the log number of stem cell divisions and the log incidence of cancer. They claimed that a linear relationship between the two indicates that only the probability of mutation matters. The problem with the argument is that a regression on a log-log plot implies a linear relationship only if the slope is close to 1. Their slope was far from 1, so the data actually showed a non-linear relationship, but they continued to argue as if the relationship were linear. Later, using the same data set, we showed that cancers are not mutation limited but selection limited, an inference diametrically opposite to theirs (https://www.nature.com/articles/s41598-020-61046-7). But we had a hard time publishing this because we were directly challenging a giant in the field.
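For readers who want to see why the slope matters, here is a minimal sketch in Python with made-up numbers (not the data from the Science paper). If log y is a linear function of log x with slope b, then y is proportional to x^b, which is a straight line in x only when b is close to 1; a good fit on the log-log scale with any other slope is evidence of a curved, power-law relationship.

```python
import numpy as np

# Illustrative, made-up numbers; NOT the data from the Science paper.
rng = np.random.default_rng(0)
x = np.logspace(6, 12, 30)                  # e.g. lifetime stem cell divisions
b_true = 0.5                                # a slope well below 1
y = 1e-9 * x**b_true * rng.lognormal(0.0, 0.3, x.size)   # incidence-like values

# Ordinary least squares on the log-log scale
slope, intercept = np.polyfit(np.log10(x), np.log10(y), 1)
print(f"log-log slope = {slope:.2f}")

# If log10(y) = a + b*log10(x), then y = 10**a * x**b.
# Only when b is (close to) 1 is y proportional to x; any other slope
# means a power-law (non-linear) relationship, so a tight fit on the
# log-log plot by itself does not demonstrate linearity.
```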

A long-standing belief is that in type 2 diabetes, controlling blood sugar arrests diabetic complications. Going by the raw data, we do not find convincing support for this notion in any clinical trial. But many published papers still keep claiming it repeatedly. To do so they need to violate many well-known principles of statistics, which they do coolly, and publish in journals of high repute. We challenged this collectively (https://www.qeios.com/read/IH7KEP) as well as specifically challenging two recent publications that had obviously twisted data. One was published in The Lancet Diabetes and Endocrinology and the other in PLOS Medicine. The editors of Lancet D and E refused to publish our comments (first without giving reasons, and on our insisting they gave very flimsy and illogical ones), and the other is still pending. We then opened a dialogue on PubPeer, to which the authors have not responded. The reasons the Lancet D and E reviewer gave for rejecting our letter are so flimsy that they cannot offer the same defense on PubPeer, because it is open. In confidential peer review the reviews do not have to be logical; reviewers can easily get away with illogical statements, as this case demonstrates well. The entire correspondence is available here (https://drive.google.com/file/d/1XNzxif4ybJdgAQ4YmiKg_6mSqZGh2Mn1/view?usp=drive_link).

The case of whether lockdowns helped arrest the spread of infection during the Covid-19 pandemic is funnier. Just a few months before the pandemic, WHO had published a report, based on multiple independent studies, on the extent to which closing schools or offices, travel bans and the like can arrest transmission of respiratory infections (https://www.who.int/publications/i/item/non-pharmaceutical-public-health-measuresfor-mitigating-the-risk-and-impact-of-epidemic-and-pandemic-influenza ). The report clearly concludes that such measures are ineffective and are therefore not recommended. Within just a few months of the pandemic beginning, many leading journals published papers showing that lockdowns are effective. The entire tide turned in just a few months. The hurriedly published papers have several flaws which nobody pointed out, for fear of being politically incorrect. Our analysis, on the other hand, indicated that lockdowns were hardly effective in arresting transmission (https://www.currentscience.ac.in/Volumes/122/09/1081.pdf ).

That repeated waves of infectious disease are caused by new viral variants is a common belief that has never been tested by rejecting the null hypothesis that a wave and a variant arise independently and get associated by chance alone (https://milindwatve.in/2024/01/05/covid-19-pandemic-what-they-believe-and-what-data-actually-show/ ). Our ongoing attempts to reject this null hypothesis with respect to Covid-19 data have failed so far. In the absence of a rejected null hypothesis, the claim that repeated surges are caused by newly arising variants is no more than a religious belief. But it still constitutes the mainstream thinking in the field.
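To make explicit the kind of test I have in mind, here is a minimal sketch of a permutation test, with entirely made-up week numbers rather than real Covid-19 data: count how often a wave onset is preceded by a variant emergence within some window, then ask how often random, independent variant timings would produce at least as many coincidences.

```python
import numpy as np

# Hypothetical, made-up timelines in week numbers; NOT real Covid-19 data.
wave_starts    = np.array([10, 45, 80, 120])        # onsets of surges
variant_emerge = np.array([5, 30, 70, 115, 140])    # first detections of variants

def n_coincidences(waves, variants, window=8):
    """Count waves preceded by a variant emergence within `window` weeks."""
    return sum(any(0 <= w - v <= window for v in variants) for w in waves)

observed = n_coincidences(wave_starts, variant_emerge)

# Null model: variant emergence times scattered at random over the same
# study period, independently of the waves.
rng = np.random.default_rng(1)
study_length = 150
null_counts = [
    n_coincidences(wave_starts, rng.uniform(0, study_length, variant_emerge.size))
    for _ in range(10_000)
]
p_value = np.mean([c >= observed for c in null_counts])
print(f"observed coincidences = {observed}, permutation p = {p_value:.3f}")
```

Only if such a null model can be rejected does the association between waves and variants deserve to be called evidence rather than belief.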

I suspect this is happening rampantly in many more areas today. My sample size is small and obviously restricted to a few fields of my interest. But even within that small sample I keep coming across so many examples of bad science published in leading journals that I have not yet been able to compile them all. Other fields where statistics is being commonly misused include the clinical trials of the new anti-obesity drugs, the GLP-1 receptor agonists; all the flaws of the diabetes clinical trials are present in these papers too. Worse is the debate on what constitutes a good diet for keeping away diabetes, hypertension, CVD and the like. Perhaps the same is happening in many papers on the effects of climate change, the effect of social media on mental health, human-wildlife conflict and many other sentimentally driven issues.

Added to this series is now the paper by Frank and Sudarshan in the American Economic Review (https://www.aeaweb.org/articles?id=10.1257/aer.20230016 ), which claims that the decline of vulture populations in India caused half a million excess human deaths during 2000-2005 alone. The paper was given news coverage by Science (https://www.science.org/content/article/loss-india-s-vultures-may-have-led-deaths-half-million-people ) months before its publication, and because Science covered it, it became a headline in global media. The claim has perfect headline-hunting properties, and even Science falls prey to this temptation. Interestingly, the data on which the claim is based were criticized by Science itself a couple of years ago: for estimating Covid-19 deaths in India, the death record data from India were dubbed unreliable; now the same data become reliable when the inference can make attractive headlines. The other set of data used by the authors is equally or even more unreliable, and the methods of analysis are also questionable. Everything put together makes a good case for intentional misleading, and all the reasons why I say this are available here (https://www.qeios.com/read/K0SBDO ).

What is more interesting is the way it happened. On realizing that the analysis in this paper has multiple questionable elements, we looked at the journal where it was to be published. Most journals have a section called letters or correspondence where readers can cross-question, raise issues or comment in any other form on a published article. The American Economic Review, where this paper was accepted for publication, has no such norm. This is strange and very clearly against the spirit of science. Suppressing debate removes the chance of science being self-correcting, and in the absence of a self-correcting element science is no different from religion. So logically AER is not a scientific journal but a religious one; let's accept that. Nevertheless we wrote to the editor that we wanted to raise certain issues. The editor advised us to correspond with the authors directly, so we wrote to the authors. Unlike Lancet D and E and the other examples mentioned above, in this case the authors were responsible enough and replied promptly. There were a few exchanges, at the end of which we agreed on a few things but the main difference did not get resolved. This is fair; disagreement is a normal part of science. But if a difference of opinion remains, it is necessary that the arguments on both sides be made available to the reader.

The authors were also kind enough to make links to their raw data available to us. The entire correspondence is here (https://drive.google.com/file/d/1d91UzBCMAY9Q3Nu5Yc5_Iqm7ycWPmTWR/view?usp=sharing ). We analyzed their data using a somewhat different but statistically sound approach and did not reach the same inference. It is normal for a given question to be addressable by more than one equally valid analytical approach; robust patterns give the same inference whichever approach is used. If a result is significant with one approach but not with another, the inference is inherently fragile, and that turned out to be the case here. We tried again with AER. They asked us to pay 200 dollars as a “submission fee”, and that is the most interesting part of the story. The original peer review of the paper was obviously weak, because none of the issues that we raised seems to have surfaced in it. I am sure the journal did not pay the reviewers. What we did was itself an independent peer review, and for this we were charged $200! We paid in the interest of science, although we knew that the journal would not publish our comments; it was necessary to see on what basis it rejected them. One reviewer appears to have commented on our letter. We had raised about 11 issues, of which the reviewer mentions only 3, saying they are not convincing without giving reasons why. There is no mention of the rest of the issues, obviously because the reviewer had no answers to them. This is clearly cherry picking. Where the reviewers themselves are cherry picking, what else can we expect from published articles? For interested readers, the entire correspondence related to the rejection is available here (https://drive.google.com/file/d/1mq2e8sKiYKUMNtUjQKnleaAPm3a8oDZO/view?usp=sharing ).

The authors seem to have incorporated some corrections in response to our correspondence (without acknowledging us, but that is a minor issue), but the main problem remained unresolved. We have now independently made our comments public on a platform that is open to responses from anyone. This openness is the primary requirement of science, and if it is being blocked, the entire purpose of publishing becomes questionable.

The importance of this incident goes much beyond the vulture argument. It is about the way science is practiced in academia. Throughout the stories above I am talking only about statistical mistakes that seriously challenge the inference; there can be many mistakes that do not change the major conclusions, and I am not talking about those. I am a small man with a very small reach. If in my casual exploration I came across so many cases of misleading science, how large must the real problem be?

Last year over 10,000 papers were retracted, and 2024 is likely to see a much larger number. The most commonly detected problem is image manipulation, and that is because a method for detecting image manipulation is more or less standardized. Detecting and exposing fraud has become one of the most important activities for science. The Einstein Foundation Award going to Elisabeth Bik (https://www.einsteinfoundation.de/en/award ) for her relentless whistle-blowing endorses this fact: scientists exposing bad science are most valuable to science. But the number of retractions is still the tip of the iceberg, and it covers only certain types of fraud. How many papers should be retracted for misleading statistics? How many for claiming causal inference without having specifically causal evidence? Nobody has any idea. A paper appeared just last week showing that as many as 30% of papers published in Nature, Science and PNAS rest on statistical significance that is marginal and fragile (https://www.biorxiv.org/content/10.1101/2024.10.30.621016v1). Many others use unreliable data, cherry-picked data, indices or protocols inappropriate for the context, or ignore violations of the assumptions behind a statistical test. Adding all of this together, my wild guess is that two-thirds to three-fourths of the science published in top-ranking journals (perhaps more) will turn out to be useless. I am not the first to say this; it has been said by influential and well-placed scientists multiple times (for example https://pmc.ncbi.nlm.nih.gov/articles/PMC1182327/ ).
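What "marginal and fragile" means can be probed with a very simple check. Here is a hypothetical sketch with simulated data (not taken from any of the papers above): re-run the test with each single observation left out and see whether the verdict of significance survives.

```python
import numpy as np
from scipy import stats

# Simulated data for two small groups; purely illustrative.
rng = np.random.default_rng(7)
a = rng.normal(0.0, 1.0, 20)
b = rng.normal(0.6, 1.0, 20)

p_full = stats.ttest_ind(a, b).pvalue

# Leave-one-out fragility check: drop each observation in turn
# and recompute the p-value.
loo_p = [stats.ttest_ind(np.delete(a, i), b).pvalue for i in range(a.size)]
loo_p += [stats.ttest_ind(a, np.delete(b, i)).pvalue for i in range(b.size)]

print(f"p (all data)          = {p_full:.3f}")
print(f"leave-one-out p range = {min(loo_p):.3f} .. {max(loo_p):.3f}")
# If removing a single point moves p across 0.05, the 'significant'
# finding is fragile in the sense discussed above.
```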

In spite of this dire state of affairs, bad science continues to get published in big journals and to come from elite institutions. Why does this happen? I think the reason is that career and personal success overpower the interest in science. Scientists are more interested in having high-impact publications to their names than in revealing the nature of reality, addressing interesting questions, ensuring robust and reproducible outcomes and seeking really useful solutions. Headline hunting takes priority over getting true insights, and this they do at any cost and by any means. Even flagship journals like Science seem more interested in making headlines than in supporting real science.

The other question is why readers get fooled so easily. I think two factors are involved. First, academics themselves have no incentive to check reproducibility, access raw data, do independent analyses and so on. This type of work does not give high-impact prestigious publications and is therefore a waste of time for them; whether the quality of science suffers is no longer their concern. What academics will never do can be done by citizen scientists. But academic publishers do anything and everything to deter them from being vigilant. Look at the behaviour of AER, which charged us $200 for making a futile attempt to cross-question misleading science. Citizen scientists are by default treated as an inferior or untouchable caste. The individual cost-benefit calculus of initiating a healthy debate is completely discouraging. As a result science is viewed as a monopoly of academics and others are effectively kept away. The net result is that science is losing trust. I no longer consider science published in big journals and coming from elite institutions reliable without looking at the raw data. But it is not possible to look at raw data every time, so I advise my readers to simply stop believing in published science and to give it only as much importance as they would a social media post. Academia treats peer-reviewed articles as validated, howsoever biased and irresponsible the peer reviews may be. This is a matter of faith, just as much as believing the water of the Ganges to be holy, howsoever polluted it may be.

I will end this article with a note of gratitude. I am extremely thankful to those who published bad science, especially all the examples above, which I could study in detail. I am primarily a science teacher, and teaching the fundamental principles and methods of science is my primary job. Illustrating with real-life examples is the most effective way of teaching, and all the papers mentioned above have given me live examples of how not to do science. I share these papers with students to make them realize this. I hope at least the next generation of researchers receives this training at an early stage.

“Nature” on citizen science vs the nature of citizen science:

The 3rd October 2024 issue of Nature has an editorial on citizen science (https://www.nature.com/articles/d41586-024-03182-y). It gives some brilliant and successful examples of involving volunteers outside formal academia in doing exciting science. But unwritten in the article are the limits of citizen science as perceived by academia. I, on the other hand, have examples which go much beyond what the Nature editors see; whether to call them successful or not, readers can decide for themselves by the end of this article.

In all the examples the Nature editorial describes, volunteers have been used as free or cheap skilled labor in studies that mainstream academics designed: for work that needed more manual input, where AI was not yet reliable, where hiring full-time researchers was unaffordable, and where involving volunteers saved time and money.

In contrast, I have examples where citizens' thinking has contributed to concept development and to the design and conduct of experiments; where the problem identification itself is done by citizens; where novel solutions are conceived, worked out and experimentally implemented by people formally unqualified for research; and where citizens have detected serious errors by academics or even exposed deliberate fraud by scientists. I would certainly say that this is far superior, and the right kind of use of the collective intelligence of people. What citizens cannot do is the formalism of articulating, writing and undergoing the rituals needed to publish papers; there academics may need to help. But in several respects citizens are better than academics at pursuing science.

I have described in an earlier blog article the work that we did with a group of farmers during 2017-2020 (https://milindwatve.in/2020/05/19/need-to-liberate-science-my-reflections-on-the-scb-award/). It started with a problem faced by the farmer community itself, to which some of us could think of a possible solution. Thereafter the farmers themselves needed to understand the concept, design a working protocol based on it, take it to the stage of experimental implementation and maintain their own data honestly. Then it went back to trained researchers, who analyzed the data, developed the necessary mathematics and so on. By the time this was done I had decided to quit academia, and the other students involved in this work had also quit for different reasons. The entire team was outside academia when the major chunk of the work was done, and we could do it better because we were free from institutional rituals. This piece of work ultimately received an international award. Here, right from problem identification, farmers, including illiterate ones, were involved in every step except the formal mathematics, the statistical analysis and the publication rituals.

In January 2024, I posted on social media that anyone interested in working with me on a range of questions (including questions of their own) could contact me. The response was so large that I could not handle so many people, so I requested that someone from the group take responsibility for coordinating it, so that maximum use could be made of so many interested minds. This system did not take shape as desired because of unfortunate problems coincidentally faced by all the volunteering coordinators themselves. But a few volunteers continued to work and a number of interesting themes progressed, ranging from problems in the philosophy and methods of science to identifying, studying and handling problems faced by people.

One of the major patterns in this model of citizen science involves correcting the mistakes of scientists writing in big journals, some of which we suspect were intentional attempts to mislead. For example, we came across a paper in The Lancet Diabetes and Endocrinology (TLDE) which was a follow-up of an interesting clinical trial in which, using diet alone, the authors had claimed substantial remission of type 2 diabetes within one year. Their definition of diabetes remission was glucose control together with freedom from glucose-lowering medicines. After a five-year follow-up they claimed that the group under the diet intervention who achieved remission by this definition had a significantly lower frequency of diabetic complications. When we looked at their raw data, it certainly did not support this conclusion; they had reached it by twisting the data and cherry picking the results. Peer reviewers never look at such things if the paper comes from one of the mainstream universities. This is not a baseless accusation; there is published data showing the lop-sided behaviour of peer reviewers.

The true peer reviewers need to be the readers. But in academia nobody has time to read beyond the name of the journal, the title and at most the abstract. The conclusions written at the end of the abstract are taken as final by everyone, even when they are inconsistent with the data inside; this is quite common in the bigger journals of medicine. The reason academics are not interested in challenging such things is that it takes a long time and painstaking effort, at the end of which they are not going to get a good original publication. The goal of science has completely changed in academia: the individual value of publishing papers in big journals has completely replaced the value of developing insights in the field. Since people in academia cannot do the job of preventing misleading inferences, citizens have to do it, and they can, because the number of papers and journal impact factors do not shape their careers anyway. Citizen science should focus on doing what people in academia cannot or will not do; that is its true strength. Since people in academia seem least bothered about the increasing amount of fraudulent science, citizens outside academia will have to do this work.

In this case, after redoing the statistical analysis ourselves, we wrote a letter to the editor of TLDE, who responded after a long time saying that the issues we raised appeared to be important and that she would send the letter to the authors to respond. Then nothing happened for a long time again. On our sending reminders, the editor responded saying that our letter had been sent to a reviewer (with no mention of what the authors' response was) and had been rejected based on the reviewer's views. The strange thing was that the reviewer's comments were not included in the editor's reply. After we insisted on seeing them, they were made available. And amazingly (or perhaps not surprisingly), the reviewer had done even more selective cherry picking on our issues, giving some sort of “explanawaytions” to some of them. For example, we had raised the issue that when you do a large number of statistical tests, some are bound to turn out individually significant by chance alone, so just showing that you got significance in some of them is not enough. This is a well-known problem in statistics, and there are suggested solutions. The reviewer said something to the effect that the suggested solutions are not satisfactory, and hence we may pretend that the problem does not exist! The reviewer completely ignored the issues for which he or she had no answer. So the reviewer was worse than the authors. We then published our comments on PubPeer (https://pubpeer.com/publications/BB3FA543038FF3DF3F83B449F8E5AA), to which the authors have never responded. The entire correspondence with TLDE can be accessed here (https://drive.google.com/file/d/16zjYPeKcz0JEnlrjSXP4p1QUimdBEPFy/view?usp=sharing). The absence of any author response and the thoroughly entertaining reviewer response make it clear that the illogical statistics was intended to mislead and was not an oversight.
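For readers unfamiliar with the multiple-testing issue raised in our letter, here is a minimal simulated sketch (hypothetical data, not from the trial): with twenty outcomes and no real effect at all, roughly one comparison is expected to come out "significant" at p < 0.05 by chance alone, and standard corrections such as Bonferroni or Holm are the usual remedy.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

# Simulated trial with NO real effects: 20 outcomes, two equal groups.
rng = np.random.default_rng(3)
pvals = [
    stats.ttest_ind(rng.normal(0, 1, 50), rng.normal(0, 1, 50)).pvalue
    for _ in range(20)
]

print("raw 'significant' outcomes:", sum(p < 0.05 for p in pvals))

# Holm (or Bonferroni) correction is one standard remedy.
reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.05, method="holm")
print("significant after Holm    :", int(reject.sum()))
```

Claiming success from whichever of many outcomes happens to cross p < 0.05, without any such correction, is exactly the problem we pointed out.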

Two more fights are underway, and I will write about them as soon as they land one way or the other. Either the papers need to be retracted or corrected, or our comments need to be published alongside them. But that would be detrimental to the journal's as well as the authors' reputation, so it is very unlikely; a more likely response is that they will simply reject our comments or do nothing at all. In either case I will make the entire correspondence public. In recent years a large number of papers have been retracted (over 10,000 in 2023, perhaps many more in 2024), a large share of them for image manipulation, but that is only because techniques for detecting image manipulation now exist. I suspect a much greater number need to be retracted for statistics twisted to mislead intentionally, or simply to get the paper accepted. Who will expose this? In my view it is beyond the capacity and motivation of academics, and therefore it should be a major objective of citizen science.

I have no doubt that many people outside academia can acquire the skill set to do so. All that is needed is common sense about numbers; technical knowledge of statistical tools is optional. Most of the problems in these papers were the kind of misuse of statistics that a teacher like me tells first-year students not to commit. In the quality of their data analysis, the scientists publishing in big journals are inferior to our first-year students. I have seen many more examples of this earlier.

Detecting statistical fraud is not difficult, but the path beyond it is, and the system of science publishing has systematically made that path difficult. In a recent case, a paper had fudged data and reached misleading conclusions in very obvious ways. The peer reviewers should have detected this easily, but they failed. When a group of volunteers pointed out the mistakes and reanalyzed the raw data, showing that the inferences were wrong, the editors said that we should submit our comments through the regular submission process, which includes a $200 submission fee. I am sure the journal did not pay the earlier reviewers anything. And when someone else did a thorough peer review, they were penalized for doing a thorough job! This is how science publishing works.

In a nutshell, many in academia are corrupt, and citizen scientists are likely to do much better science. But academics know this, and hurdles are therefore being created on purpose so that their monopoly can be maintained. The entire system is moving towards a kind of neo-Brahmanism in which the common man is tactfully kept away from contributing to knowledge creation. Multiple rituals are created to keep people away effectively, and the rituals of science publishing are increasing for the same purpose. I am sure this is the way brahminical domination gradually took over in India; now the entire world of science is moving in the same direction. Confidential peer review and author charges are the two tools being used effectively for monopolization. Citizens need to become aware and prevent this right at this stage. I see tomorrow's science as safer and sounder in the hands of citizens than in those of academia. This is the true value and true potential of citizen science. Since academia is actively engaged in suppressing this kind of citizen science, we, the science-loving common people, need to make the effort to keep it alive.

A reason to welcome AI in science publishing:

More and more concern is being raised about the problems in academia, which are expanding rapidly both qualitatively and quantitatively. Hardly anyone will disagree that there is a reproducibility crisis and an increasing frequency of fraud and misconduct at every level. The burden of APCs is destroying the level playing field (if there ever was one), so that only the rich can publish in prestigious journals. Bibliometric indices have almost taken away the need to read anything, because the importance of any piece of work is gauged by the journal impact factor and the performance of any researcher by the number of papers. So nobody reads research papers anymore; citing them does not require reading them anyway. The rapidly changing picture in academia is a perfect case of proxy failure, where the proxies have completely devoured the goal of research. Asking novel questions, getting new insights and solving society's problems is no longer the goal of research; publishing papers in prestigious journals and grabbing more and more grants is. With this, a downfall of science is bound to happen, and that the downfall has already begun is well demonstrated by many published studies.

An additional serious concern now is AI. Of late many researchers are using AI to write papers whose apparent quality of presentation is often better than what the researchers themselves could have produced. At present AI tools have many obvious flaws and get caught red-handed quite often; incidents in which a hallucinating AI cited references that did not exist have come to light. But AI will soon get better, and then flaws will become harder to detect. In response to the first wave of AI-generated papers some journals banned them, but implementing such bans will soon become impossible. An arms race between smarter frauds and smarter whistleblowers is not exactly going to be good for science.

Who will benefit the most from more refined AI tools? Certainly the people involved in research misconduct, because frauds will become increasingly smarter and more difficult to detect as AI gets smarter.

And precisely for this reason I would welcome the use of AI in science publishing because it can get us out of the mess created by us over the last few decades. The mess has been created by the ‘publish or perish’ narratives that nurtured bad incentives. The set of bad incentives gave rise to the journal prestige rat race, the citation manipulation practices, the predatory journals as well as the prestigious robber journals exploiting the researchers’ desperation to publish. AI will help us come out of this mess not by smarter detection of fraud and misconduct, but by enhancing it and making it more and more immune to detection.

It has already become common knowledge that many papers are being written with substantial contributions from AI. There is yet to be an example of a deep and disruptive insight contributed by AI. The limiting factor in AI-generated science is still going to be the scientists who decide whether to accept or reject what the AI output says. If AI gives an output that goes against the current beliefs and opinions in the field, it is most likely to be rejected on the ground that AI sometimes throws up junk (which might be true at times, but who knows) and that we need not take every output as true. So AI will make normal, routine and ritualistic science more rapid; it will deliver more easily and efficiently what people in a field already expect. But I doubt whether it will be of any help in a Kuhnian crisis.

But AI will tremendously help those who want to strengthen their CVs deceptively by increasing their number of publications and blowing up their citations. This trend will grow so sharply that it will soon collapse under its own weight. As writing papers becomes easier, their value in a CV will fall rapidly. Institutions will have to find alternative means to “evaluate” their candidates and employees. Clerical evaluation based on a few numbers and indices will become so obviously ridiculous that it will have to give way to curious and critical evaluation, which is not quantifiable and which cannot be done without effort and expertise. Critical thinking, disruptive ideas and reproducibility will have to come to the front seat and replace bibliometric indices. Nothing could be more welcome than this. If the importance of the number of papers published and of citation-based indices vanishes, most bad incentives for fraudulent practices will vanish in no time. Paper mills and peer review mills will collapse. There can no longer be profiteering either by predatory or by robber journals. The edema of science will vanish because it will no longer be confused with growth. So let AI disturb, disrupt and destroy the mainstream science publishing system; that is my only hope of saving science from its rapidly declining trustworthiness.

This will certainly happen, but not very smoothly. There will be a decade or more of utter crisis and chaos, which has already started. The entire credibility of academia will be at stake. People will question the very existence of so many people in academia drawing fat salaries and contributing nothing to real insight. If academics are prudent, they should start rethinking sooner to shorten the period of crisis and chaos. The vested interests of publishers and of power centres in academia will not allow this to happen easily; their attempt will be to keep things entangled in the arms race. But there still are sensible people in science, aren't there? The only sensible thing is to allow the current system of science publishing to be crushed under the weight of the AI-assisted edema and then make a fresh beginning in which HUMANS, and not some computed indices, make value judgments, identify and appreciate real and insightful contributions to science, and build a community of innately curious and investigative minds with no added incentives and rewards. Then let them take the help of AI for anything. Once the human rat race has vanished, AI will be tremendously useful for its positive contributions.

A solution to the reviewer problem

Not getting anyone to agree to review is a growing concern for journal editors. Often editors have to send requests to 10-20 potential reviewers before one or two agree, to say nothing of the time commitment and the repeated reminders required. Editors therefore have a tough job. They ease it partly by automating the invitation and reminder process, but that does not address the root cause of the problem.

We had an interesting experience over the last couple of years. For one of our papers, three journals, one after the other, each took substantial time with no progress shown in the online status updates. After substantial delay (six months in one case) they returned the manuscript saying that they could not find any editor or reviewer agreeing to handle or review it. The six-month delay was with PLOS One, which actually has a large network of editors and reviewers from different fields. I am sure they tried their best to find an appropriate person, but with no result. This is just one example; experiences across the board show that the problem of not getting reviewers is genuine and widespread.

Then something unexpected (though not unexpected for me!) happened to our paper. After spending almost two years trying out many journals and not getting a single review, we decided to publish it in the open peer review journal Qeios. The journal has an AI system that sends requests to reviewers in the field; in addition, any reader is welcome to post comments, and all comments are posted publicly and immediately. The author responses and revisions are also open in the public domain. On posting the paper on this platform, something miraculous happened: reviews started flowing in within a couple of weeks. Today, after about seven weeks, 16 reviews have been received, and I have just finished replying to the reviewers and posting a revised manuscript. Contrast this with the prior experience in the prevalent system of confidential peer review: zero reviews in two years versus 16 reviews in seven weeks.

How does the quality of reviews compare? No comparison specific to this paper is possible, because there was no review at all in the traditional system. A systematic comparative analysis across the two systems with a sufficient sample size is not possible unless some traditional journals make their reviews public, and since they want to avoid getting exposed they will never do so. So I can only speak anecdotally. In my experience with prior publications, there is no difference in the average quality of reviews between the two systems: on average both are equally bad. A minority of reviews are really thoughtful, rigorous, appreciative and critical on the appropriate issues, and therefore useful for improving the quality of the paper. For the reviews on this paper I would say a greater proportion turned out to be insightful and useful for revision, but there were poor-quality reviews as well. Here is the link to the revised paper, all the reviews and the replies (https://www.qeios.com/read/GL52DB.2). Readers are welcome to form their own opinion.

Across my sample of a few hundred reviews, the majority have been very superficial, saying nothing critical about the central content and making a few suggestions like improving figure 3b or making the introduction more concise. Some comments, such as “more clarity is needed”, are universally true and therefore completely useless unless specifics are pointed out. A comment about language correction is extremely common; when the author names sound non-British, a suggestion to get the manuscript refined by a native English speaker has to be there. I once tried taking help from a professional language service, and even after that this comment had to appear. Some reviews are entirely stupid and irresponsible. Precisely this entire range is seen in the reviews received in the open peer review system, but the proportion of sensible reviews appears to be slightly greater there.

Why is it that conventional journals found it impossible to get reviewers, while for the same paper the open peer review system gathered so many reviews in a short time? I can see two distinct but interdependent reasons. One is the accept-reject dichotomy. Any kind of debate, difference of opinion and suggestion is useful and important for science, but a reject decision stops this process; the accept-reject dichotomy has actually defeated the purpose of peer review. The Qeios system takes this dichotomy away. One of the beliefs behind confidential peer review has been that by remaining anonymous, reviewers can avoid the rage of authors and the resulting personal bitterness. In fact the scientific community welcomes debate of any kind; what irks people is rejection, which takes away the opportunity to debate. Here the authors are always given the freedom to respond, so reviews do not end up irritating the authors despite critical comments, reviewers are not afraid of spoiling relations, and a healthy debate is possible. I suspect that once the social fear is removed, reviewers actually like to publish their comments with their identities disclosed. The second factor contributing to reviewers' willingness is that they get credit for their review and a feeling of participating in the refinement of the paper, and thereby in the progress of the field. This is the true and healthy motivating factor; other suggestions, such as paying the reviewers, are unlikely to have the same motivational effect. Reviewers genuinely seem to like the idea of their comments getting published. This, I think, is why we received 16 reviews for a paper that did not get a single reviewer in the confidential review system.

A promising inference from this experience is that there is a simple solution to the problem of reviewer reluctance: remove confidentiality, discourage anonymity and make peer reviews public. If accept-reject decisions are necessary at all, let a discussion between the editor and the authors decide them. Reviewers need not give any accept-reject recommendation; they only write their critical views. If the reviews expose fundamental flaws in the manuscript, the authors themselves will want either to remove them or to withdraw. If they do not, their reputation suffers, because the flaws pointed out by the reviewers are published along with the paper.

All this can work only on the assumption that there are readers who actually read the papers and the comments. About this I am not so sure or hopeful; the culture of making judgments without reading has gripped the entire field very tightly. I can only hope that when reviews become open, readers will stop confusing peer review with validation. Readers will stop relying on the blind faith that the reviewers have done a good job and that what they are reading is reliable and true. Instead, readers will use their own judgment, aided by the peer comments, and as a result the reading culture will have to improve. If the entire community continues to make quick judgments based only on the name of the journal, reads at most the title and abstract, and feels no need to read further, then only God can save science.