We have published numerous blog posts on managing scientific labs, writing papers, and strategizing a graduate career. Articles presenting our advice on these subjects have become the top-viewed posts on our website. Moving forward, we will be organizing the ZarLab website to feature this content. We believe that the practices and strategies described in these posts have greatly improved our productivity and advanced our careers. While our posts are written with Bioinformatics in mind, the concepts can be applied broadly to careers across STEM fields.
Here we present a summary of the posts that provide advice to scientists.
- Writing Tips: An Authorship Policy that Maximizes Collaboration
- Writing Tips: Why we Publish Methods Papers
- Writing Tips: Results Subsections
- Writing Tips: Methods Overview
- Writing Tips: Introduction
- Writing Tips: How we Edit
- Writing Tips: Getting Organized (and Staying that Way)
- Writing Tips: Motivation (or the Lack of It)
- Writing Tips: Overcoming Writer’s Block
Advice for Scientists
- UCLA Bioinformatics: The Philosophy of the Training Environment and Programs
- UCLA Bioinformatics: The Philosophy of the Ph.D. Program
- UCLA Bioinformatics: The Philosophy of the Undergraduate Program
- UCLA Launches CGSI with Inaugural Summer Programs
- Video Tutorial: Serghei Mangul’s Introduction to UNIX Workshops
- Video Tutorial: An Introduction to Read Mapping and Next Generation Sequencing
- Learning Bioinformatics @ UCLA: Finding Bioinformatics Research Opportunities
- Learning Bioinformatics @ UCLA: What Courses Should I Take?
- Learning Bioinformatics @ UCLA: The Undergraduate Bioinformatics Minor
ZarLab Thesis Defenses
- Thesis Defense: Dr. Farhad Hormozdiari
- Thesis Defense: Dr. Jong Wha (Joanne) Joo
- Thesis Defense: Dr. Zhanyong (Jerry) Wang
- Thesis Defense: Dr. Eun Yong Kang
- Thesis Defense: Dr. Jae Hoon Sul
- Thesis Defense: Dr. Nick Furlotte
Michael Bilow and Eleazar Eskin, together with Fernando Crespo, Zhicheng Pan, and Susana Eyheramendy, recently released a novel method for accurate joint modeling of clinical phenotype and disease status. This approach incorporates a clinical phenotype into case/control studies under the assumption that the genetic variant can affect both.
Genetic case-control association studies have found thousands of associations between genetic variants and disease. Most studies collect data from individuals with and without disease, and they often search for variants with different frequencies between the groups. Jointly modelling clinical phenotype and disease status is a promising way to increase power to detect true associations between genetics and disease. In particular, this method increases potential for discovering genetic variants that are associated with both a clinical phenotype and a disease.
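To make the idea of "different frequencies between the groups" concrete, here is a minimal sketch of a basic allelic association test at a single variant. It is a generic Pearson chi-square test on a 2x2 table of allele counts, not the method from our paper, and the function name and the allele counts are hypothetical values chosen for illustration.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 1,000 case alleles and 1,000 control alleles.
chi2, p = allelic_chi_square(case_alt=240, case_ref=760, ctrl_alt=180, ctrl_ref=820)
print(f"chi2={chi2:.2f}, p={p:.2e}")
```

A test like this uses only disease status. It ignores any clinical phenotype measured on the same individuals, which is the information a joint model tries to exploit.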
However, standard multivariate techniques fail to solve this problem effectively because case-control status is discrete rather than continuous. In addition, standard approaches to estimating model parameters are biased due to the ascertainment in case/control studies. We present a novel method that resolves both of these issues, enabling simultaneous association testing of genetic variants in studies that record both case status and a clinical covariate.
In our paper, we show the utility of our method using data from the Northern Finland Birth Cohort (NFBC) dataset. NFBC enrolled almost everyone born in 1966 in Finland’s two northernmost provinces. The NFBC dataset consists of 10 phenotypes and genotypes at 331,476 genetic variants measured in 5,327 individuals. We focus our study on the LDL cholesterol and triglyceride level phenotypes.
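The ascertainment problem can be illustrated with a toy simulation. The sketch below is not the estimation procedure from our paper. It assumes a simple liability threshold model in which disease liability depends on genotype, a clinical phenotype is coupled to liability through a parameter `rho`, and a study enrolls all cases plus an equal number of controls. All parameter values (`maf`, `beta`, `rho`, `prevalence`) and the function name are made up for illustration.

```python
import math
import random
import statistics


def simulate_ascertainment(n=20000, maf=0.3, beta=0.3, rho=0.5,
                           prevalence=0.1, seed=1):
    """Toy liability threshold simulation (all parameter values are made up)."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        genotype = sum(rng.random() < maf for _ in range(2))  # 0, 1, or 2 copies
        liability = beta * genotype + rng.gauss(0.0, 1.0)
        # Clinical phenotype coupled to liability through rho.
        phenotype = rho * liability + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        people.append((liability, phenotype))

    # Individuals in the top `prevalence` fraction of liability are cases.
    n_cases = round(prevalence * n)
    threshold = sorted(liab for liab, _ in people)[n - n_cases]
    cases = [pheno for liab, pheno in people if liab >= threshold]
    controls = [pheno for liab, pheno in people if liab < threshold]

    # Ascertained study: all cases plus an equal number of random controls.
    study = cases + rng.sample(controls, len(cases))
    population_mean = statistics.fmean([pheno for _, pheno in people])
    study_mean = statistics.fmean(study)
    return population_mean, study_mean


population_mean, study_mean = simulate_ascertainment()
print(f"population mean: {population_mean:.3f}, "
      f"case/control study mean: {study_mean:.3f}")
```

Because cases are heavily over-represented in the study relative to the population, the average clinical phenotype in the study is shifted away from the population average. Parameter estimates that treat the study as a random sample inherit this kind of shift.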
Our evaluation strategy analyzes a subset of the NFBC data and compares what we discover here to what was discovered in the full NFBC dataset—which we treat as the gold standard. We compare the performance of our novel approach to three other methods: (1) the single univariate test applied to the disease status, (2) the multivariate approach applied to the disease status and the clinical phenotype modeled as a multivariate normal distribution, and (3) the liability threshold model treating the clinical phenotype as a covariate.
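One simple way to summarize this kind of comparison is to ask what fraction of the gold-standard associations are also detected in the subset. The sketch below assumes p-values are stored in dictionaries keyed by SNP identifier and uses the conventional genome-wide significance threshold of 5e-8. The function name, the SNP identifiers, and the p-values are hypothetical, and this is not necessarily the exact summary used in our paper.

```python
def recovered_fraction(gold_pvalues, subset_pvalues, alpha=5e-8):
    """Fraction of gold-standard hits that are also significant in the subset."""
    gold_hits = {snp for snp, p in gold_pvalues.items() if p < alpha}
    if not gold_hits:
        return 0.0
    # A SNP missing from the subset results is treated as not detected.
    found = {snp for snp in gold_hits if subset_pvalues.get(snp, 1.0) < alpha}
    return len(found) / len(gold_hits)


# Hypothetical p-values from the full dataset and from a subset analysis.
full_data = {"snpA": 1e-10, "snpB": 1e-9, "snpC": 0.2}
subset = {"snpA": 1e-8, "snpB": 1e-4, "snpC": 0.5}
fraction = recovered_fraction(full_data, subset)
print(fraction)
```

Running the same summary once per method makes the power comparison between methods directly readable.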
Using the univariate approach, the p-values are much less significant than those observed in the full NFBC dataset. Running the multivariate approaches, which incorporate the triglyceride level phenotype, increased power (i.e., yielded more significant p-values for the associated SNPs than the univariate approach did).
Our method has the highest power in all scenarios. Its advantage is greater when selection bias is substantial than when selection bias is low. Our method is even more powerful when the correlation between the clinical covariate and the disease liability is lower, because we explicitly estimate the underlying liability using all of the data.
For more information, see our paper in Genetics: http://www.genetics.org/content/early/2017/01/27/genetics.116.198473
The software implementing the methods described in this paper was developed by Fernando Crespo and is available at: http://genetics.cs.ucla.edu/multipheno/
The full citation to our paper is:
Bilow, M., Crespo, F., Pan, Z., Eskin, E. and Eyheramendy, S., 2017. Simultaneous Modeling of Disease Status and Clinical Phenotypes to Increase Power in GWAS. Genetics, pp.genetics-116.
(This post is authored by Eleazar Eskin.)
For over a decade, I’ve been involved in graduate admissions for both the UCLA Bioinformatics Ph.D. Program and the UCLA Computer Science Ph.D. Program. Each year, we admit a group of prospective students; some come to UCLA, and some go elsewhere. Students admitted to multiple programs face difficult decisions when choosing where to begin graduate studies. The factors involved in their decision are varied and often independent of academic considerations. For example, when I was an undergraduate, I chose to attend the Computer Science Ph.D. Program at Columbia University—mostly because I wanted to live in New York City!
However, when considering academic issues, determining which graduate program is best for you is not so straightforward. In this blog post, I provide some advice on how to choose a graduate program. While my focus is on Bioinformatics, the general concept applies to any program in the sciences. In particular, you should consider the following questions.
Whose lab can I join? By far the most important factor affecting your Ph.D. education will be the lab you join. A great advisor at a lower-ranked institution will lead to much better student outcomes than a crappy advisor at Stanford, Harvard, or MIT. Great advisors will, among other things, develop expectations for their graduate students that are in line with the students’ career goals and provide sufficient structure and resources for the students to work toward achieving these goals.
However, choosing a graduate program and institution for a single advisor is not a great idea, because most students do not end up in the lab they thought they would join when they applied to the program. An ideal graduate program will have multiple faculty that could be a great fit for you. Insights on determining which faculty could be a great fit for you are in my blog post on how to choose a graduate advisor.
Take-home point: Choosing your lab is much more important than choosing your graduate program—or even the name-recognition value of the institution you will attend.
How do I get an advisor, and is it easy to switch advisors? Graduate programs relevant to Bioinformatics are usually set up in one of two ways. In both cases, the host faculty (or principal investigator) of the lab in which you perform research is also your advisor. One approach is a rotation system, where students initially try three labs and ultimately join one for research during their final years in the program. An advantage of the rotation approach is that you get to try out three faculty before making a decision on which lab to join while you complete the degree.
Other programs, such as the UCLA Computer Science Ph.D. Program, have direct admissions. Here, students join their primary research lab beginning in their first term. In direct admissions programs, it is important for you to know how easy or hard switching advisors would be if the lab turns out to be a poor match. For example, you would want to know if program funding is tied to the individual host faculty—or if it is tied to the department. You can typically change advisors easily if your funding is tied to the department. When funding is tied to the PI, it can be more difficult to switch advisors.
Take-home point: Finding out the advisor selection process at your prospective programs can help you decide which program would offer you the most flexibility.
What training will I get, and what will I learn? In addition to working with a faculty member on research, every graduate program has a substantial training component. During your first year, you will spend most of your time completing coursework in addition to your research. For some Bioinformatics programs, the primary courses are “seminar” type courses in which multiple faculty present their research each week. Seminars provide a useful, important survey of research on campus, but they have little educational value. Other programs feature an integrated curriculum with specific pedagogical goals. These types of courses will help you learn a lot more skills and applications.
Some institutions may provide additional training opportunities through coursework offered by other departments. For example, at UCLA, Bioinformatics Ph.D. students can gain skills in computer languages and data analysis from the Departments of Computer Science, Biomathematics, and Ecology and Evolutionary Biology, among others. You should find out if extra-departmental courses, such as advanced courses in statistics and computer science, would be available to you on the prospective campuses. These types of opportunities are often not available in graduate programs that are a part of medical schools and lack access to a basic campus.
Take-home point: Campus-wide course options and quality—both within and outside of the home department—are important considerations when comparing prospective programs.
How much activity in Bioinformatics is happening on campus? Part of your education as a Ph.D. student comes from attending seminars held by visiting scholars and guest speakers. Some departments have a very active speaker series and organize workshops for faculty and students, while others may hold seminars and workshops relevant to Bioinformatics less frequently. Bringing in scholars from other institutions is important, because these events expand your training and expose you to ideas and methods not represented at your campus. If you are planning on attending an institution with a limited number of faculty involved in Bioinformatics, you will likely have fewer relevant seminars and activities available to you.
Take-home point: Current graduate students will be able to tell you how many relevant seminars, workshops, and other events can be expected from your prospective programs.
How collaborative is the Bioinformatics community? Collaboration with other groups on campus exposes you to other faculty and students. Multiple groups rarely collaborate at some institutions, while inter-lab collaboration is extremely common on other campuses. Inter-lab collaboration will greatly broaden your experience as a graduate student. When you visit your prospective program, you may not know which lab you will join, but you can get a feel for whether or not the institution and community are collaborative.
Take-home point: Checking if there are papers authored jointly between multiple groups of interest will help you assess the level of collaboration at your prospective programs.
How big is the community of students and faculty? Informally interacting with other scholars in your research area is very important. Some of my closest colleagues today are individuals whom I met while I was a graduate student. You will get to know many of the students in your graduate program and in related programs; programs that have larger numbers of faculty and students, more departments of interest, and more workshops and other relevant events will provide more opportunity for you to establish these important relationships.
Take-home point: A large cohort of colleagues spanning multiple departments will give you more opportunities to interact (and have fun!) while in graduate school.
How likely am I to have funding for all my years as a graduate student? The vast majority of programs in the physical and life sciences fund their graduate students for the entire duration of their doctoral education. Full funding is critically important, because university tuition nowadays is very high. It is up to the program to provide these assurances to admitted graduate students, but make sure to ask questions about this if the admission offer letter you receive is not clear. Some programs require students to work teaching and/or research assistant jobs for a salary and tuition waiver, while others provide fellowship stipends. Most graduate programs offer a combination of fellowships and student jobs in their funding packages.
Take-home point: Make sure you ask about funding details when you are deciding among multiple prospective graduate programs.
These questions aren’t the only factors you should consider when deciding which graduate program to join. There are a host of personal issues that affect these choices, including which city you want to live in for the next four to six years.
While this blog post focuses on the general issue of picking a graduate program, we at UCLA feel that our program compares very favorably along these lines. Our program has a tremendous number of new faculty who are at all career stages and invested in mentoring incoming graduate students. In general, students have a lot of agency in picking their advisor, even if they come to UCLA through a direct-admit program such as the Computer Science Ph.D. Program. Many students are co-advised between multiple faculty spanning different departments, an arrangement that increases a student’s breadth of research and engagement.
Our Bioinformatics Ph.D. curriculum tightly integrates computational and biological knowledge and provides students with training in methods development (see our blog post on the philosophy of the Ph.D. program). In addition, the Bioinformatics graduate curriculum is available to both Bioinformatics Ph.D. Program students and Computer Science Ph.D. Program students. Many Bioinformatics-relevant training camps, seminars, and workshops take place throughout the year at UCLA (for more information, read our blog post on the UCLA Bioinformatics training environment).
This collaborative spirit is a hallmark of the UCLA campus, and many Bioinformatics faculty have co-authored with each other and myriad colleagues from other departments on campus. Each year, we welcome a cohort of nearly 20 new computational genomics students from the Bioinformatics Ph.D. Program, the Genetics and Genomics Ph.D. Program, and the Computer Science Ph.D. Program. These students typically take courses together and create a cohesive network of junior researchers with shared interests.
Our graduate students’ collaboration and camaraderie are evident. The Bioinformatics students organize an annual student-led retreat where they share research and build relationships in settings such as Catalina Island and Big Bear Lake. As for student life at UCLA, we have five (!) outdoor pools, perfect year-round weather, amazing global food culture, and beaches within a few miles of campus. The undergraduate students have caught on to how great a place it is to be. This year, UCLA became the first university to receive more than 100,000 applications for freshman admission!