Overview:
Welcome to the Discovery Stack/Discovery Curator (DS/DC) Pilot experiment – we are glad you are joining, or considering joining. We, like you, are scientists. We solve problems, and we think there are many problems with the current publishing system, problems that the pandemic and the rise of preprint submissions have exposed. We have looked at why previous attempts to reform publishing have failed and have concluded that the underlying driving forces in the system (e.g., profits for publishers) may be a root cause. With the benefit of hindsight, we think the entire infrastructure would be better built by first imagining how to positively align our brains when new information is brought forward, and then honoring our collective need to assess not only whether we think a piece of science is high quality but also, and separately, whether it is 'current', 'important', or 'impactful'. Our goal is to devise experiments that can move us from where publishing stands today toward the better place we hope a series of such experiments will land us.
Long-term vision:
We envision a publishing model of peer-reviewed preprints in which Quality (Q) and Impact (I) scores are initially assigned through a refined process of 'Peer Improvement' and can then evolve through crowd-sourced reviewer participation over time. Rather than fixing the evaluative metric of a research article to the impact factor of a journal at a given moment (a relationship that has been shown to have nearly zero correlation), each article will be individually subjected to the test of time, in near real time. We hypothesize that Q scores are likely to remain fairly static, whereas I scores, which represent evolving knowledge and the input of many, will have the potential to change as the true impact of a body of work is determined by the timelessness of its correctness, rather than by journals' need to select the most-likely-to-be-impactful results on the basis of just 2-3 individuals' judgments. This long-term vision may not be feasible, and a pilot experiment is how we in science determine how, and whether, to take the next step. Determining whether this vision is a worthwhile pursuit, and then investing the resources to make it happen, depends on your participation in this experiment. We value and thank you for your time, thought, and input.
The DS/DC pilot experiment is designed to test the following hypotheses:
- Community-organized, journal-independent peer review of preprints can shorten the time from preprint submission to a peer-reviewed metric that supports scientific curation, tying us more tightly to the ownership and curation of our science.
- We will be better able to assess the quality of our science ("Q" score) and its importance/impact ("I" score) if we work on each assessment separately and in sequence, because a community-based measure of the quality of a study's experimental design, execution, and conclusions is an essential foundation for the assessment of its impact.
- A long-term hypothesis, outside the scope of this experiment, is that these scores can be subject to ongoing concurrence and adjustment by the community through open-ended reviewing and score assignment. This in turn allows the comments of non-anonymous peer reviewers to be evaluated themselves, so that they, and we, can become better and more trusted curators of scientific value. With this total reconfiguration in mind, we posit that we can arrive at a more just process of helping one another. The segregated metrics of Q and I, and their ongoing adjustments, will be owned by the community and can be used to evaluate the merits of a particular study for purposes of personal literature curation as well as investigator evaluation and promotion.
- We as a community can help authors improve the quality of their science by re-orienting the review process to a line-edit workflow that reminds us we are ultimately seeking to help authors ensure that statements made about a set of data are correct.
- Adopting a peer-improvement mindset (by both reviewers and authors) as the guiding principle for review can shift the author-reviewer relationship from one of potential tension and conflict of interest to one of collaboration, resulting in a more equitable system, a better product, and a happier researcher mindset about the publication process, the product, and our peers.
- Peer review of preprints by the DS/DC method would be more transparent and would eliminate questionable decisions by professional editors, including activist editors whose efforts to 'improve' the quality of papers in their journals are of dubious value relative to their detrimental impact on the pace of scientific progress (see Brierley et al., 2022).
In addition, the quality of non-anonymous reviewer contributions to scientific curation will be evaluated by the community to help ensure a more collaborative and collegial process for both authors and reviewers.
This will be work, but should feel good
Cognitively, most of us do not feel good when we peer review in the current system. We would like to be improving science overall, but we know that our work is largely being used to determine commercial value. The negative impacts of activist editors (and reviewers) who believe their ideas and requests are valuable to science have been roundly critiqued. Yet, when reviewing, most of us would enjoy being able to suggest ways to improve the rigor of others' work, much as we do for students and peers when asked informally.
In Q review, we anticipate that the process will 'feel' like you are helping a colleague make their existing science correct; the workflow of Q review consists largely of this. It should feel like being asked by a colleague to help prepare work for publication. Your goal should be both to help them make the impact as high as possible and to prevent them from making statements that are not well supported. Your analysis of a figure and its description in the legends and results section should seek to make sure that the statements made are justified. Upon seeing an experiment that does not support a given statement, we would hope you comment that one solution to the overstatement is to soften the conclusion (e.g., sometimes data can only 'suggest' a conclusion, but an author has claimed it 'shows' or 'proves'; softening the language is a reasonable request). You might also note that the authors would need to add a control in order to make a statement. It is fair to offer an opinion as to why the author's conclusion simply cannot be drawn from a particular experiment, and to offer thoughts as to what that experiment might show. By doing this 'in line', we hope that 'Peer Improvement' engages your System 2 analytical brain, so that you read for the quality of the science. At this point you are not trying to determine whether this work has the most impact (is "CNS" level) but simply whether it consists of solid experiments and reasonable conclusions. Conclusions about the quality of the work should be possible to reach independently of who should read it.
We know that change is hard and that this may have a higher or different cognitive load, at least at first. We appreciate your trying this out. However, we hope we will find that this is cognitively easier in the end. Here, you should be applying your brain to questions of the veracity of statements and your analysis of the quality of experiments. We know that it is only with time, repetition of a result, and attempted extensions of that result that we truly understand its value.
Whether you apply your 'star' of approval to grace this paper with presentation in any given journal's TOC is not the question here. You need not conflate your reputation as an accurate reader with your reputation as a 'picker' of the biggest results, or even make statements about whether the data are or are not 'novel'. We believe that the analysis sought in Q review is the most valuable and accurate work you have ever been able to do when faced with a journal article at this stage. We believe that three readers in this mode provide more value, both in 'branding' a piece of science for its actual quality and in helping authors be their best scientists, than the current system does.
We have thought this system through enough to know that we also need you to provide your assessment of 'impact' (I) so that we all know what to read each week. We also know that this question is different, is ultimately much more subjective, and benefits greatly from additional input beyond your work here. In the world as we imagine it, you will be able to separate this opinion (is it the most novel science ever, does it move the needle, what is its importance; in other words, "I") from the question of whether it is a quality set of experiments leading to sound conclusions.
We can apply our skills in new areas
This pilot is an experiment in the realm of social engineering, and yet we are the best people to try it, since most of us know what we wish the system did for us but does not. It is not the kind of experiment we are used to. At its core, it is meant to determine whether we can improve upon the needed components of the existing system, add novel components, and do without the rest. In setting up this experiment, we believe that the current system (aka 'Publishing') can and should offer some degree of quality control, quality assessment, and understanding and communication of the 'best' science of the moment and of the past. We also believe that those who generate the product for the biomedical research publishing industry (we, the scientists) can and will take ownership of our product(s) in a way that makes our work more rapidly available to our colleagues (and the tax-paying public who support us), with metrics that allow for evaluation of an individual study and its specific details. Furthermore, because we propose non-anonymous reviewing, in which a reviewer's work is also evaluated, we will be able to assess the quality of an individual's contribution to the curation process (much in the way Uber passengers and drivers are rated, although in our case with the benefit of continual iteration to prevent gaming of the system). Determining just how well an alternative system can work would allow us to help authors, science funders, and advancement/hiring committees move away from using the impact factor of a journal (or, similarly, the journal's name or 'brand') to evaluate a publication, given that the correlation between a journal's impact factor and the merit of an individual article is actually very low (Paulus et al., 2018; Waltman and Traag, 2020; Finardi, 2013; Lozano et al., 2012).
If you are reading this, we anticipate that you are agreeing to be a test subject and to represent data points in this experiment. We intend to publish the results of this experiment once the data are crunched, evaluated, and reviewed. As with any experiment, this one is expected to challenge basic assumptions and long-held views. We ask that you reflect on your current approach to the review process (for practiced reviewers, this is System 1 thinking), suspend those views while experimenting with the peer-improvement mindset and working your way through a new way of approaching the problem (System 2 thinking), and then objectively compare the two approaches at the conclusion. This is likely to be hard, and for that reason we hope it will be rewarding regardless of the view you ultimately take home.
Experimental design:
- Solicit manuscripts in the field of immunology that have recently been submitted to bioRxiv as well as for conventional peer review at a conventional journal.
- Solicit reviewers who are willing to provide peer-improvement feedback to the authors of a manuscript via line-by-line copy-editing and to offer Quality (Q) and Impact (I) scores.
- Survey authors and reviewers for first impressions of the process and for reflective impressions. Suggestions for modifying the process will be solicited at this later time point.
Pitfalls:
There are many pitfalls that we have considered in the course of designing this experiment, and there are undoubtedly many we have not anticipated. The one we ask you to focus on is that this experiment may yield results suggesting that the peer-improvement DS/DC model is viewed less favorably than the current system because, within the constraints of the time allotted to this experiment, the number of review experiences (n) with peer improvement will largely be limited to n=1, whereas most of you will have had n>100 experiences in the classic review format. As a result, we think that trying the new approach, as with anything new, will take more time, could be frustrating, and might at first strike you as less efficient than what you are used to; we expect it will. We ask that you be mindful of this experience imbalance, consider the System 1 vs. System 2 thinking noted above, and ask yourself whether, with experience, you think the peer-improvement DS/DC model would make you happier with the reviewing process, the end product, and publishing overall.
Alternatives:
At the conclusion of your participation in this experiment, you will have the opportunity, in the final survey, to support leaving the system as is or to suggest implementing further changes. As with all experiments, we will have the opportunity to propose and engage in a next iteration of the study. We can do experiments until we get them right, and we believe the product (well-defined and evaluated science) is worth the effort.
Written and Compiled by
Max Krummel is a professor at UCSF, where his research focuses on the spatial and temporal dynamics of immune systems. Current studies range from the definition of cDC1s as primary centerpieces of reactive immune systems in cancer to discoveries of archetypal states of immune systems across the body. High-resolution and high-dimensional microscopy, from cell motility and synapses through to multicellular dynamics, has been and remains a focal point of his research.
Mike Kuhns is a Professor in the Department of Immunobiology at the University of Arizona. His lab's current interests include: (1) analyzing the evolutionary history of proteins that drive T cell activation to identify heretofore unknown signaling networks that drive T cell function; and (2) biomimetic engineering of synthetic receptors to redirect T cell activity.