Writing Tips: An Authorship Policy that Maximizes Collaboration

(This post is authored by Eleazar Eskin.)

Assigning authorship and determining the order of authors on scientific papers is an issue that every research lab deals with. Authorship ranking can be a frequent source of conflict among members of a research lab. In many labs, multiple students involved in a project compete for first- or high-ranking authorship throughout the life of the project. In a lab culture lacking a clearly defined policy, competition for authorship discourages students from seeking other lab members’ help, because the project leader may ultimately lose their first-authorship position to the students they recruit. These issues can reduce the quality of inter- and intra-lab communication and collaboration. Ultimately, authorship conflict can reduce lab productivity, create bad feelings, and, in some cases, poison the work environment.

Here we share our lab’s authorship policy. Of course, the actual authorship of papers published in our lab reflects the amount of work and the contributions that each author made to the project. However, during the course of a project, there are ample opportunities for different members of the group to contribute more or less than anticipated. Acknowledging this flexibility in contributions shapes the final ranking and achieves several other goals: our authorship policy is designed to encourage inter- and intra-lab collaboration, increase the overall productivity of the lab, expand training opportunities for students in the lab, and improve the productivity of each individual member of the lab.

In our lab, we use the following key principles to assign authorship:

  • No last-minute changes. It takes months (if not years!) to complete a research project and finish writing a paper describing it. Many authorship conflicts arise just before paper submission, which can be a hectic process even without disagreement. In our lab, we never make last-minute changes on the eve of submission.

    Instead, we explicitly address author-order issues after the paper is submitted. The revision process always requires more work, so there is plenty of time to resolve conflicts in a calm and constructive way. We often contact journals to change the author order after submission of original and revised manuscripts, and we have even changed the order of authorship on accepted papers just before submitting a camera-ready version. The advantage of this policy is that it removes most of the drama from authorship conflicts.

  • Each student has their own first-author projects. Competition for high authorship ranking is inevitable in academia, where a publication record has a crucial impact on one’s career. In addition, graduate students in many programs are required to publish a specific number of first-author papers in order to complete their degrees.

    In our lab, each student has clearly defined projects that lead to first-author papers. Barring exceptional circumstances, such as leaving the lab before finishing the project, the student will be the first author of the paper. Other students in the lab are welcome to join the project and contribute (with the first-author student’s permission). In this case, they have authorship rights but cannot dislodge the project leader’s first-author position. With the authorship outcome established in advance, each student involved has clear expectations and can budget their involvement in the project accordingly.

    The advantage of this policy is that we no longer have students competing for first-author positions. Students are genuinely encouraged to collaborate and obtain help from peers in their projects. Junior students often recruit senior students to help with their first projects. This is a win-win scenario: senior students benefit from the mentoring experience, and their research experience often substantially speeds up completion of the junior students’ projects. Encouraging senior students to help with all lab papers also lightens my mentorship load and frees up time for my research, writing, and teaching.

  • First-author students help determine the author order. Lack of a clear protocol for determining authorship ranking throughout the course of a project can lead to conflicts as publication nears. In our lab, we create a culture of granting authorship-assigning agency to the first-author student, who has substantial input into the author order and is responsible for monitoring their co-authors’ productivity.

    During the course of the project, if a student co-author has contributed relatively little time, the first-author student is responsible for gently nudging that co-author to contribute. If a co-author has contributed a tremendous amount, the first author can decide that the two students should share first authorship. The advantage of this policy is that the first-author student has a lot of ownership over their projects and is responsible for ensuring, over the course of the project, that the workload is split to reflect the final authorship ranking.

  • Students are recognized for pulling more than their anticipated weight. Many very talented students often substantially contribute to multiple projects in the lab, including the other students’ papers in which they are not the first author.

    In our lab, we greatly encourage this behavior. I explain to the students that I will notice their additional investment of time and effort, and that they will be recognized for it in the letters of recommendation I write for them. I also explain to students interested in taking on extra project workloads that the experience and recognition, regardless of their specific authorship ranking on each project, will provide them with many future opportunities and collaborations.

Our policy leads to a highly collaborative environment where each student who graduates from the lab co-authors a paper with the majority of the other students in the lab. Senior students gain invaluable experience mentoring junior students through the paper writing process. When they graduate from the lab, my students are very generous with credit and authorship to others involved in the lab. This makes me proud of them as both scientists and people.

Even with this policy, I would say that every six months to one year we have an authorship conflict among lab members that I must get involved in. In part, this is because authorship is so discrete that, even with the best intentions, the constraints of a ranked list sometimes fail to completely reflect the individuals’ contributions. Using joint first authors and joint corresponding authors can help with this issue, but joint authorship still may not introduce sufficient complexity to accurately reflect the efforts and contributions of all individuals involved. However, the collaborative culture of our lab, as well as our collaborative relationships with other labs, usually helps us resolve these disputes quickly.

Writing Tips: Why we Publish Methods Papers

by Eleazar Eskin

Computational genomics is a field where many diverse academic groups collaborate, each bringing to a project its own distinct academic culture. In particular, each academic discipline involved in computational genomics has its own publication strategy in terms of the types of papers it publishes and how it packages methods and results in these papers. Publishing papers is extremely important to careers in academia and science, because all of us are reviewed for tenure or promotion based on our publication records. An important factor in our review (unfortunately) is the impact factor of the journals that we publish in. Here, we describe our lab’s publication strategy and the reasoning behind it.

Our lab is a computational lab, and the main contribution of our lab to bioinformatics is the development of methods for solving important biological problems, particularly in the area of genetics. These new methods are implemented in software packages that (hopefully) are used by others to enable biological discovery. Naturally, the key papers our group produces are papers that describe these new methods and explain their potential applications.

Roughly speaking, there are two strategies for publishing methods in our field. The first is to focus on writing methods papers that are primarily dedicated to describing the computational advances. The second is to focus on publishing our novel methods as part of more comprehensive papers that present a biological contribution. In this case, our method is primarily described in the supplementary materials. Over the span of my career, I have seen computational researchers receive more pressure to follow the second strategy in order to have papers published in high-impact journals. Unfortunately, following the second strategy often delays publication (sometimes for years), because peer review often involves applying the method to a new dataset and/or performing extensive functional validation.

Our group primarily follows the first strategy.  In addition, we work with other groups and, as collaborators, publish papers focused on biological contributions.  This strategy works out well for us, and we feel that writing methods-focused papers is the best way for us to make a contribution to science.  We hope that other computational biology groups will follow our example and publish more methods papers.

Here are some of the reasons we feel this is a good strategy:

  1. Doing Justice to our Work. We can fully explain the methods only in papers dedicated to methodology. Since our contribution is methods, the best way to push the science forward is to clearly describe our method and the context of its development and application. In a dedicated paper, we are most likely to have enough space to fully describe the method and explain how the approach works. Methods papers also have the space (and are typically required) to compare the proposed method with previous methods. This comparison puts the performance of the method in perspective relative to the work of others. Methods papers ideally provide enough detail that other groups can build upon our method and compare their results to our published results. Sharing authorship on these papers also allows students who were involved in the development of these methods to demonstrate their strong technical skills. In my view, computational biologists should be evaluated by the quality and impact of their methodology development, and departments should consider this impact when making hiring decisions. The impact can be measured by the number of users of the software implementing the methods, the number of citations of the papers describing the methods, and the discoveries that these methods have enabled. These factors are more important than the impact factor of the journals where the methods are published.
  2. Self Determination of Publishing. There are no outside bottlenecks preventing us from finishing our papers quickly, and we can control the publication process of our papers. A methods paper is primarily written by members of our lab, and the authors evaluate the method using both simulated and established datasets. This structure means we need not wait for outside collaborators or experiments to finish. Finishing the paper faster means that we have more time to work on new papers.
  3. Increased Number and Improved Quality of Collaborations. The methods paper is a widely distributed, often freely available, finished product, and many prospective collaborators approach us after reading a paper from our group. More importantly, in our collaborations, we have very little competition over authorship. Students in the group are happy to work hard on a project just to be middle authors on the collaborative paper, because they are already first authors on their own methods papers. Our methods development students are not competing for credit with the students in the collaborators’ group.
  4. Project Longevity. Writing a methods paper forces the method to be finished, evaluated, and documented, and publishing the paper forces us to release the software. This process gives the project more longevity. Once the method is fully developed, new students can easily pick up and build upon the previous method. Once a student leaves the lab, the method can persist with new lab members because it is stable, well-documented, and debugged. Long after they have left the lab, many of the students who wrote methods papers in our group continue to author papers related to applications of their method.

In full disclosure, we do identify one negative aspect of the methods paper publishing strategy. High-impact papers require collaborations, and it is less likely that methods developers can publish in high-impact journals as senior or corresponding authors. While it is less likely to occur, members of our lab do occasionally gain senior authorship in high-impact journals through collaboration. We have found that the combination of methods papers, where you are the senior or first author, and high-impact papers, where you have middle authorship and it is clear that your role was the application of the method, is overall a positive outcome and looks good in your publication record.

For example, Eran Halperin and I published a 2004 paper in the lower-impact journal Bioinformatics that described the HAP haplotype phasing method. The HAP method was later used in a Perlegen-led paper that was published, with Halperin and me as co-authors, in the notably high-impact journal Science. The 2005 Science paper helped me get my job at UCLA; it was clear what my contribution was, as I had also authored the methods paper in Bioinformatics.

Our lab has produced several other examples of methods papers paired with high-impact collaborations. Kang et al. (2008) present the EMMA method in Genetics (impact factor of 5.963), and a collaboration with the Jake Lusis group on the HMDP presents results in Genome Research (impact factor of 11.351) (Bennett et al. 2010). More recently, we published the CAVIAR method (Hormozdiari et al. 2014) in Genetics and collaborated with Dan Geschwind’s group in applying the method in a Nature paper (Won et al. 2016).

Writing Tips: Results Subsections

The purpose of a Results section is to present, without interpretation, the key results of your research. Your paper does not need to include every result you obtained during your experiments. Results are “key” when they are relevant to addressing the research questions or hypotheses presented at the beginning of your paper.

We use the Results subsections to show the reader what types of outcomes they can expect when using the methodology that we present. In our papers, we write a “Methods Overview” as the first subsection of the Results section. (We discuss writing the “Methods Overview” subsection in a previous writing tips post.) Remaining subsections in your paper’s Results section present your findings in the form of text, figures, and tables.

Each Results subsection should make a specific point, and the subsection heading should be a succinct description of this message. Effective subsection headings make a statement that tells the reader what the method is capable of doing or what types of data the method can be applied to. For example, in a recent paper published by our group, the heading of a subsection that demonstrates how a new GWAS approach controls for false positive results is: “Phenotype Imputation Controls Type I Error.”

Here, a two-paragraph Results subsection has a heading that tells the reader which specific type of analysis is discussed, since the paper presents a method that can be applied toward numerous different analytical tasks.

Cell type composition and diversity

We hypothesized that differences in microbial diversity may be linked to whole blood cell type composition. Since the actual cell counts were not available for these individuals, we used cell-proportion estimates derived from available DNA methylation data to test this hypothesis (Houseman et al. 2012; Aryee et al. 2014; Horvath and Levine 2015).

We assessed methylation data from 65 controls from our replication sample, and compared methylation-derived blood cell proportions to alpha diversity after adjusting for age, gender, RIN, and all technical parameters. We tested whether alpha diversity levels are associated with cell type abundance estimates. Our analysis shows one cell type, CD8+ CD28- CD45RA- cells, to be significantly negatively correlated with alpha diversity after correction for all other cell-count estimates (correlation = -0.41, P=7.3e-4, Figure S6, Table S6). These cells are T cells that lack CD8+ naïve cell markers CD28 and CD45RA and are thought to represent a subpopulation of differentiated CD8+ T cells (Koch et al. 2008; Horvath and Levine 2015). We observed that low alpha diversity correlates with high abundance of this population of T cells.

Total RNA Sequencing reveals microbial communities in human blood and disease specific effects

For each subsection, we include one figure that illustrates the heading’s message. The figure’s legend (also referred to as a “caption”) can simply be the subsection heading with additional information explaining the methods and data involved in the visual output. It may be helpful to select a figure and write a legend before composing text for the subsection.

At this point, you could probably write an entire paper on each figure! In general, we limit the text in each Results subsection to one to two paragraphs. Here, we use the minimum amount of text that is necessary to walk our reader through the figure. Think about what the reader needs to know in order to start using the method for their own analysis. Relevant information includes the type of data used, analytical steps and parameters, and a summary of conclusions. In many cases, the subsection text and figure legend will be repetitive.

This one-paragraph section provides relevant results in terms of statistical parameters, numerical output, and a supplemental figure. This subsection gives the reader a good idea of what to expect if they want to incorporate this new approach in their own project.

Phenotype Imputation Controls Type I Error

We simulated datasets for multiple phenotypes under the null model where the variant we are testing has no effect (effect size of zero) toward the target phenotype. We computed the type I error under five different significance thresholds: 0.05, 0.01, 0.005, 5 × 10⁻⁶, and 5 × 10⁻⁸. We generated 100,000,000 simulated datasets that consist of 1,000 individuals. The type I error rates for our imputation method were 0.049, 0.0099, 0.00489, 4.90 × 10⁻⁶, and 4.89 × 10⁻⁸ for the significance thresholds of 0.05, 0.01, 0.005, 5 × 10⁻⁶, and 5 × 10⁻⁸, respectively. This indicates that the type I error is correctly controlled in our imputation method. The Northern Finland Birth Cohort dataset 13 was used to show that the type I error is controlled (see Figure S1). We plot the Q-Q plot of the Z-score for the imputed triglyceride (TG) phenotype from the Finland dataset. There is no inflation in the Q-Q plot as shown in Figure S1.

Imputing Phenotypes for Genome-wide Association Studies
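
For readers who have not run this kind of check before, the type I error procedure described in the excerpt above can be sketched in a few lines of Python. This is only a toy illustration, not the imputation method from the paper: it assumes a simple two-sided z-test on pure-noise phenotypes, and the function names, default sample sizes, and seed are hypothetical choices made for this sketch. It also uses far fewer simulated datasets than the excerpt, so it can only resolve the larger significance thresholds.

```python
import math
import random


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def empirical_type_i_error(thresholds, n_datasets=20000, n_individuals=50, seed=0):
    """Fraction of null datasets rejected at each significance threshold.

    Hypothetical toy setup: each dataset is pure standard normal noise
    (true effect size of zero), tested with a z-test of the mean.
    """
    rng = random.Random(seed)
    rejections = {alpha: 0 for alpha in thresholds}
    for _ in range(n_datasets):
        # Simulate one dataset under the null model.
        total = sum(rng.gauss(0.0, 1.0) for _ in range(n_individuals))
        # Mean of n unit-variance draws, scaled to a standard normal z-score.
        z = total / math.sqrt(n_individuals)
        p = two_sided_p_value(z)
        for alpha in thresholds:
            if p < alpha:
                rejections[alpha] += 1
    return {alpha: count / n_datasets for alpha, count in rejections.items()}


if __name__ == "__main__":
    rates = empirical_type_i_error([0.05, 0.01, 0.005])
    for alpha, rate in rates.items():
        print(f"threshold {alpha}: empirical type I error {rate:.4f}")
```

If the test is well calibrated, each empirical rate should land close to its threshold, which is the pattern the excerpt reports. Resolving thresholds as small as 5 × 10⁻⁸ requires on the order of the 100,000,000 datasets used in the paper.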

Bonus challenge: After you finish writing your paper, try to remove the sentence highlighting the result’s importance from the figure caption.

The order in which you present your results can be organized in many different ways. Typically, ordering of subsections is not important for initial manuscripts. One simple approach is to order Results subsections sequentially to support the argument that you are building in your paper.

Here, we present another example of a Results subsection, including the description of a relevant figure. The subsection heading makes it clear to the reader that this part of the paper discusses applying ForestPMPlot, a visualization tool for analyzing meta-analysis studies, to eQTL data.

Application to multi-tissue eQTL analysis

One powerful application of our proposed framework is in multi-tissue eQTL analysis in the Genotype-Tissue Expression (GTEx) project. The GTEx project studies human gene expression and genetic regulation in multiple tissues, providing valuable insights into the mechanisms of gene regulation, which can lead to the new discovery of disease-related perturbations. In this project, genetic variation between individuals will be examined for correlation with differences in gene expression level to identify regions of the genome that influence whether, and by how much, a gene is expressed. In particular, examining multiple tissues can give us valuable insights into the genetic architecture of the regulatory mechanism, because many regulatory regions are known to act in a tissue specific manner (Ernst et al. 2011; Encode Project Consortium 2012). Hence, understanding the role of regulatory variants, and the tissues in which they act, is essential for the functional interpretation of GWAS loci and insights into disease etiology.

Figure 2 is an example of the output of ForestPMPlot for a multitissue eQTL study for the SEMA3B gene (GTEx Consortium 2015). Examining both the forest plot and the PM-Plot allows us to obtain an insight into the tissue-specific genetic effects in eQTL analysis, which leads to the identification of three significant eQTL tissues (heart left ventricle, stomach, and thyroid). This example clearly shows that examining both the forest plot and the PM-Plot allows us to easily hypothesize that there is a specific group of studies showing tissue differences in eQTL analysis.

ForestPMPlot: A Flexible Tool for Visualizing Heterogeneity Between Studies in Meta-analysis

Below, we provide examples of several different types of figures that can illustrate the point of a Results subsection.

Example of a figure and figure caption that clearly illustrate and explain significance of results in a Results subsection (Hormozdiari et al. 2016).

Example of a more complex figure and figure caption in a Results subsection, which aim to explain the advantages of a new visualization tool (Kang et al. 2016).

Example of a general schematic “Methods Overview” subsection figure in the Results section (Mangul et al. 2016).
