Jerry Wang and Jae Hoon Sul, two lab alumni, published a paper introducing new software that implements a two-stage model for detecting associations between traits and pairs of SNPs: the threshold-based efficient pairwise association approach (TEPAA). The method is significantly faster than the traditional approach of performing an association test on all pairs of SNPs. In the first stage, the method performs the single marker test on all individual SNPs and selects for further consideration the subset of SNPs that exceed a predetermined, SNP-specific significance threshold. In the second stage, the SNPs selected in the first stage are paired with each other, and the pairwise association test is performed on those pairs.
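To make the two stages concrete, here is a minimal sketch in Python. It is not the TEPAA software itself. The statistics are assumptions chosen for illustration: the single marker test is taken to be sqrt(n) times the genotype-trait correlation, the pairwise test is taken to be the same statistic computed on the product of the two standardized genotypes, and one threshold is shared by all SNPs instead of the SNP-specific thresholds the paper uses. The function names are hypothetical.

```python
import math
from itertools import combinations


def standardize(values):
    """Shift to mean 0 and scale to unit variance (assumes values are not constant)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def z_statistic(x, y):
    """Illustrative association statistic: sqrt(n) times the correlation of x and y."""
    xs, ys = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(xs, ys)) / math.sqrt(len(xs))


def two_stage_scan(genotypes, trait, first_stage_threshold, pair_threshold):
    """genotypes maps a SNP id to its list of allele counts (0, 1 or 2)."""
    # Stage 1: single marker test on every SNP; keep those above the threshold.
    survivors = [
        snp for snp, g in genotypes.items()
        if abs(z_statistic(g, trait)) > first_stage_threshold
    ]
    # Stage 2: pairwise test, but only on pairs of surviving SNPs.
    hits = []
    for a, b in combinations(survivors, 2):
        product = [
            x * y
            for x, y in zip(standardize(genotypes[a]), standardize(genotypes[b]))
        ]
        z = z_statistic(product, trait)
        if abs(z) > pair_threshold:
            hits.append((a, b, z))
    return survivors, hits
```

With m SNPs the brute force approach tests m(m-1)/2 pairs, while the second stage here tests only k(k-1)/2 pairs for the k survivors, which is where the savings come from.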
The key insight of the approach is that we derive the joint distribution between the association statistics of single SNPs and the association statistics of pairs of SNPs. This joint distribution lets us accurately compute the analytical power of the two-stage model and compare it to the power of the brute force approach (see the figure), which guarantees that the statistical power of our approach closely approximates that of the brute force approach. Hence, the method chooses as few SNPs as possible in the first stage while achieving almost the same power as the brute force approach.
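The power comparison can be illustrated with a small simulation. The sketch below is a simplification rather than the paper's analytical calculation: it treats one single-SNP statistic and one pairwise statistic as bivariate normal with unit variances, and the means, the correlation `rho` and the thresholds `t1` (first stage) and `t2` (pairwise test) are all hypothetical inputs. The power loss is the probability that the pair would be significant under brute force but the SNP fails the first stage.

```python
import math
import random


def estimate_power(mean_single, mean_pair, rho, t1, t2, draws=200_000, seed=0):
    """Monte Carlo estimate of (brute force power, two-stage power, power loss).

    Assumes the single-SNP and pairwise statistics are bivariate normal with
    unit variances and correlation rho; this is an illustrative model only.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    brute = two_stage = 0
    for _ in range(draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        s_single = mean_single + z1
        s_pair = mean_pair + rho * z1 + scale * z2  # correlated with s_single
        if abs(s_pair) > t2:
            brute += 1  # significant under the brute force approach
            if abs(s_single) > t1:
                two_stage += 1  # also survives the first stage
    brute_power = brute / draws
    two_stage_power = two_stage / draws
    return brute_power, two_stage_power, brute_power - two_stage_power
```

Lowering `t1` shrinks the power loss but lets more SNPs through to the second stage, which is the trade-off the method balances.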
The power loss region of the threshold-based efficient pairwise association approach (TEPAA). The contour lines represent the probability density function of the multivariate normal distribution (MVN). T1 is the threshold for the first stage. Any SNP with a higher significance than T1 will be passed on to the second stage. T2 is the threshold for significance of the pairwise test. The area surrounded by the red rectangle corresponds to the power loss region.


Jerry and Jae Hoon demonstrate the utility of TEPAA by applying it to the Northern Finland Birth Cohort (Rantakallio, 1969; Jarvelin et al., 2004). From their analysis, they observe that the thresholds that control the power loss of the two-stage approach depend on the minor allele frequency (MAF) of the SNPs. In particular, more common SNPs can be filtered out with less significant thresholds than rare SNPs. To implement TEPAA efficiently with MAF-dependent thresholds for each pair, they group the SNPs into bins based on their MAFs so that the correct thresholds can be applied to each possible pair. After disregarding rare variants with MAF < 0.05, they categorize all common SNPs into nine bins according to their MAF, with step size 0.05. Each pair of SNPs has two thresholds, one for each SNP in the first stage. They precompute the first-stage thresholds for each combination of two MAF bins to keep the power loss at 1% while achieving high cost savings. To implement the first stage efficiently, they sort the SNPs within each bin by their association statistics and use binary search to rapidly obtain the set of SNPs above a given threshold.
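The binning and binary search described above can be sketched as follows. This is an illustration, not the released implementation: the function names are hypothetical, the threshold table `thresholds[i][j]` (the first-stage threshold for a SNP in bin i when paired with a SNP in bin j) is assumed to have been precomputed elsewhere, and SNPs are compared on the absolute value of their statistic.

```python
import bisect
from collections import defaultdict

NUM_BINS = 9  # MAF bins [0.05, 0.10), ..., [0.45, 0.50]


def maf_bin(maf):
    """Map a MAF in [0.05, 0.5] to a bin index 0..8 (step size 0.05)."""
    if not 0.05 <= maf <= 0.5:
        raise ValueError("expected a common SNP with 0.05 <= MAF <= 0.5")
    # Integer arithmetic avoids floating-point trouble at bin edges.
    return min(round(maf * 1_000_000) // 50_000 - 1, NUM_BINS - 1)


def build_index(snps):
    """snps: iterable of (snp_id, maf, statistic). Returns sorted lists per bin."""
    bins = defaultdict(list)
    for snp_id, maf, stat in snps:
        bins[maf_bin(maf)].append((abs(stat), snp_id))
    index = {}
    for b, entries in bins.items():
        entries.sort()  # ascending by |statistic|
        index[b] = ([s for s, _ in entries], [i for _, i in entries])
    return index


def snps_above(index, b, threshold):
    """Binary search for the SNPs in bin b whose |statistic| exceeds threshold."""
    stats, ids = index.get(b, ([], []))
    return ids[bisect.bisect_right(stats, threshold):]


def candidate_pairs(index, thresholds):
    """Yield the SNP pairs that pass the first stage for every pair of bins."""
    for i in index:
        for j in index:
            if i > j:
                continue
            left = snps_above(index, i, thresholds[i][j])
            right = snps_above(index, j, thresholds[j][i])
            for a in left:
                for c in right:
                    if i != j or a < c:  # no self-pairs or duplicates within a bin
                        yield (a, c)
```

Because each bin is sorted once, every threshold lookup costs a binary search instead of a scan over the whole bin.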

Read our full paper here:

Wang, Zhanyong; Sul, Jae Hoon; Snir, Sagi; Lozano, Jose; Eskin, Eleazar (2015): Gene-Gene Interactions Detection Using a Two-stage Model. In: J Comput Biol, 2015, ISSN: 1557-8666. (Type: Journal Article)
What are the interesting computational ideas underlying a new computational method?  What are the intuitions behind the method?  How is the method related to other methods?  These are the key questions that papers describing new computational methods should answer.
Unfortunately, most papers describing new computational methods don’t explicitly address these questions because of the constraints of journal styles.  The Introduction of a methods paper often has only a few sentences about the method.  The Methods section typically has many more details but very little discussion of the underlying ideas.  Understanding what is interesting about a method is left completely to the reader’s imagination.  Often, the journals request that the Results section precede the Methods section, which makes the results very difficult to understand unless the reader reads the sections of the paper out of order.  Authors can appeal to the journal to have the Methods section first, but this is also not a good solution, since the Methods contain many details, such as descriptions of the datasets, that take away from the flow of the paper.
In order to avoid these problems, in our papers, we make the first subsection of the Results section a “Methods Overview.”  In this section, we describe the method in terms of the high-level ideas and typically include as a figure a small example which we use to help the reader understand the method.  The goal of this section is to give enough detail that readers can follow the rest of the Results section without having to look at the Methods section.  A well-written Methods Overview will also make it much easier for the reader to follow the actual Methods section.
These sections and examples are designed to be self-contained and should be in a language appropriate for a general audience.  In fact, some of the blog posts are almost verbatim copies of the Methods Overview sections of some of our recent papers.  For example, see these blog posts on GRAT and Genome Reassembly.
Another way to think of what to put in the Methods Overview section is what you would explain in a talk about the method.  Often presentations on computational methods have excellent slides showing intuitions and very clear examples.  The place to put that kind of material is in the Methods Overview.  Remember, in your paper you must give a compelling argument as to WHY your method is interesting. If your readers don’t understand the intuitions underlying your work, they will never appreciate it.
I’m sure you may be asking, “Isn’t this a little redundant?” What I’m proposing here may be a bit repetitive, with a methods overview section and a methods section later in the paper.  But they serve different purposes.  With a well written Methods Overview section, a reader can stop after the Results section and understand most of your paper.  The Methods section then only becomes important for someone who wants to understand all of the details.

In this blog post, I would like to “introduce” you to our introduction style. Writing the introduction is the most daunting part of the paper writing process, especially for students who are not native English speakers. To help structure the introduction writing process, in our lab we have developed a standard style or template for writing introductions. Since the majority of the papers that we write describe new computational methods, many of our papers naturally fit into this style. We usually publish our papers in genetics journals, which have very high standards of writing and are read by researchers with a wide range of backgrounds. The difference between a paper getting accepted and rejected is often determined by the clarity of the writing.

Our introduction style is a very specific formula that works for us but obviously there are other ways to structure an introduction and each experienced writer will have their own style. However, the truth is, you NEVER start out as a good writer and new writers need to start somewhere. It takes practice, consistency and effort to write well. If you are a new writer apprehensive about writing an introduction, we hope that this structure can help you.

Our introductions are typically four paragraphs long with each paragraph serving a specific role:
1. Context – First, it is important to explain the context of the research topic. Why is the general topic important? What is happening in the field today that makes this a valid topic of research?
2. Problem – Secondly, you present the problem. We typically start this paragraph with a “However,” phrase. Simple example: We have this awesome discovery in XYZ… However, using former methods it will take us 10 years to run the data. Each sentence in this paragraph should have a negative tone.
3. Solution – By this point, your readers should sympathize with how terrible this problem is and how there MUST be a solution (maybe a little dramatic, but you get my point). Paragraph three always starts with “In this paper” and a description of what the paper proposes and how it solves the problem in the second paragraph.
4. Implication – The last paragraph in your introduction is the implication, which describes why your solution is important and how it moves the field forward. Typically, this paragraph is where you summarize the experimental results and how they demonstrate that the solution solves the problem. This paragraph should answer the reader’s question of “so what?”.

An example of the 4 paragraph introduction style is in the following paper:

Mangul, Serghei; Wu, Nicholas; Mancuso, Nicholas; Zelikovsky, Alex; Sun, Ren; Eskin, Eleazar (2014): Accurate viral population assembly from ultra-deep sequencing data. In: Bioinformatics, 30 (12), pp. i329-i337, 2014, ISSN: 1367-4811. (Type: Journal Article)

Most of our other papers in their final form do not follow this format exactly.  But many of them in earlier drafts used this template and then during the revision process, added a paragraph or two expanding one of the paragraphs in the template.  For example, this paper expanded the implication to two paragraphs:

Kang, Eun Yong; Han, Buhm; Furlotte, Nicholas; Joo, Jong Wha; Shih, Diana; Davis, Richard; Lusis, Aldons; Eskin, Eleazar (2014): Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of 4,965 Mice. In: PLoS Genet, 10 (1), pp. e1004022, 2014, ISSN: 1553-7404. (Type: Journal Article)

and this paper expanded both the context and problem to two paragraphs each:

Sul, Jae Hoon; Han, Buhm; Ye, Chun; Choi, Ted; Eskin, Eleazar (2013): Effectively Identifying eQTLs from Multiple Tissues by Combining Mixed Model and Meta-analytic Approaches. In: PLoS Genet, 9 (6), pp. e1003491, 2013, ISSN: 1553-7404. (Type: Journal Article)

For methods papers, sometimes what we are proposing is an incremental improvement over another solution. In this case, moving from the context to the problem is very difficult without explaining the other solution. For this scenario, we suggest the following six-paragraph structure:
Problem 1 (the BIG problem)
Solution 1 (the previous method)
Problem 2 (Why does the previous method fall short?)
Solution 2 (“In this paper” you are going to improve Solution 1)

An example of a six-paragraph introduction in which the third and fourth paragraphs were merged is:

Furlotte, Nicholas; Kang, Eun Yong; Nas, Atila Van; Farber, Charles; Lusis, Aldons; Eskin, Eleazar (2012): Increasing Association Mapping Power and Resolution in Mouse Genetic Studies Through the Use of Meta-analysis for Structured Populations. In: Genetics, 191 (3), pp. 959-67, 2012, ISSN: 1943-2631. (Type: Journal Article)

There it is… the beginning to a great paper (at least we like to think so!). Will this work for you? Have other ideas? Let us know in the comments below!

This is an example of our edits. The red marks are direct edits and the blue marks are high-level comments.


In our last writing post, we talked about how our group of a dozen undergrads, four PhDs and three postdocs (not to mention our many collaborators) stays organized. This week we would like to focus on our paper writing process, and more specifically, how we edit.

Believe it or not, each one of our papers goes through at least 30 rounds of edits before it’s submitted for publication. You read that right… 30 rounds of edits. Each round is very fast, usually a day or two of writing, and we try to give back comments within a few hours of getting the draft. Because we are doing so many iterations, the changes from round to round often affect only a small portion of the paper. The writing process begins in week one of the project. This is because no matter how early we start writing, at the end of the project our bottleneck is that the paper is not finished even though all of the experiments are complete. For that reason, starting to write the paper BEFORE the experiments are finished (or even started) leads to the paper being submitted much earlier. Some people feel that they shouldn’t write the paper until they know how the experiments turn out, so that they know what to say. I completely disagree with this position. I think it is better to start at least with the introduction, the overview of the methods, the methods section, the references, etc. If the experimental results are unexpected, then the paper can be adapted to the results later. Getting an early start on the writing substantially reduces the overall time that it takes to complete the paper.

To jump-start the students’ writing, I sometimes ask them to send me a draft every day. We call these “5 p.m. drafts.” As we mentioned in our very first writing tips post, the best way to overcome writer’s block is to make writing a habit. What I find is that whether a draft from a student represents one day of work or a week of work, it still needs the same amount of editing. This is what motivates our many, many iterations.

This is an early edit where we did a lot of rewording. For this, we use notes or text boxes.


Editing in our lab is certainly not done in red ink on paper. The logistics would be WAY too difficult to coordinate. Instead, the students email me a PDF. I edit it on my iPad using the GoodReader app, which can make notes, include text in callouts, draw diagrams and highlight directly on the document. GoodReader also lets me email the marked PDF back to the students directly. It typically takes 30 minutes to an hour to make a round of edits. This inexpensive iPad app has improved our workflow and decreased our edit turnaround significantly. Keep in mind that I don’t always need to make a full pass on the paper; I just need to give enough comments to keep the student busy during the next writing period (which can be one day).

Since my edits are marked on the PDF, the students need to enter the edits into the paper. This is great for them, as they get to see the edits, which improves their writing. Previously, when I made edits on the paper directly, they weren’t able to see them. When I edit, I make direct changes in red and general comments in blue.

Like our method? Let us know!

Most methods that try to understand the relationship between an individual’s genetics and traits analyze one trait at a time. Our lab recently published a paper focusing on analyzing multiple traits together. This subject is significant because analyzing multiple traits can discover more genetic variants that affect traits, but the analysis methods are challenging and often very computationally inefficient. This is especially the case for mixed-model methods which take into account the relatedness among individuals in the study. These approaches both increase power and provide insights into the genetic architecture of multiple traits. In particular, it is possible to estimate the genetic correlation that is a measure of the portion of the total correlation between traits that is due to additive genetic effects.

In our recent paper, we aim to solve this problem by introducing a technique that can be used to assess genome-wide association quickly, reducing analysis time from hours to seconds. Our method is called the Matrix Variate Linear Mixed Model (mvLMM) and is similar to the method recently developed by Matthew Stephens’ group ((22706312)). An implementation of our method is available as software that works together with pylmm, the mixed-model software that we are developing.

We demonstrate the efficacy of our method by analyzing correlated traits in the Northern Finland Birth Cohort ((19060910)). Compared to a standard approach ((22843982); (22902788)), we show that our method results in more than a 10-fold time reduction for a pair of correlated traits, taking the analysis time from about 35 minutes to about 2.5 minutes for the cubic operations plus another 12 seconds for the iterative part of the algorithm. In addition, the result of the cubic operation can be saved so that it does not have to be recalculated when analyzing other traits in the same cohort. Finally, we demonstrate how this method can be used to analyze gene expression data. Using a well-studied yeast dataset ((18416601)), we show how estimating the genetic and environmental components of correlation between pairs of genes allows us to understand the relative contribution of genetics and environment to coexpression.

One of the key ideas of our approach is to represent the multiple phenotypes as a matrix where the rows are individuals and the columns are traits. We then assume the data follow a “matrix variate normal” distribution, in which we define a covariance structure among both the rows (individuals) and the columns (traits). The use of the matrix variate normal is the key to making our algorithm efficient.
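For readers who have not met the matrix variate normal before, its standard textbook definition may help. The symbols below are introduced here for illustration and are not taken from the paper: Y is the n x d phenotype matrix, M its mean, U the n x n covariance among rows (individuals) and V the d x d covariance among columns (traits).

```latex
% Matrix variate normal and its equivalent multivariate normal form.
% vec(Y) stacks the columns of Y; \otimes is the Kronecker product.
Y \sim \mathcal{MN}_{n \times d}(M, U, V)
\quad\Longleftrightarrow\quad
\operatorname{vec}(Y) \sim \mathcal{N}_{nd}\bigl(\operatorname{vec}(M),\; V \otimes U\bigr)
```

The Kronecker structure is what makes this form attractive computationally: the eigenvalues of V ⊗ U are products of the eigenvalues of V and U, so the full nd x nd covariance never needs to be formed or decomposed directly.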

The full paper about mvLMM is below:

Furlotte, Nicholas; Eskin, Eleazar (2015): Efficient Multiple Trait Association and Estimation of Genetic Correlation Using the Matrix-Variate Linear Mixed-Model. In: Genetics, 2015. (Type: Journal Article)


**Update** Since publishing, it has been brought to our attention that there is related work, published by Karin Meyer in 1985 (which cited earlier work by Robin Thompson from 1976), that we did not cite. If our method interests you, please also take a moment to review the following paper:

Meyer, Karin (1985): Maximum Likelihood Estimation of Variance Components for a Multivariate Mixed Model with Equal Design Matrices. In: Biometrics, 41 (1), pp. 153-165, 1985, ISSN: 0006341X. (Type: Journal Article)


We are very happy to announce that the US-Israel Binational Science Foundation (BSF), in partnership with the Gilbert Foundation, is renewing support of our collaboration with Eran Halperin’s group at Tel Aviv University. This is our lab’s oldest active collaboration, which began in 2001 when Professor Eskin met Eran Halperin at the RECOMB conference.

Our first joint project in genetics was a collaboration with Eran Halperin in 2003 (who was in Berkeley, CA at the time) on a problem called haplotype phasing, which led to the software HAP ((14988101)). That led us to become involved in the first whole-genome map of human variation, which was published on the cover of Science in 2005 ((15718463)). We have continued to work closely and publish together because we have very complementary backgrounds. We come from machine learning and Eran comes from theory. We have many joint projects, regular conference calls and visits, and collaborations between our students. One of my Ph.D. students was a postdoc in Professor Halperin’s group and one of his postdocs was recruited to UCLA as a faculty member.

Many of our most important research contributions have been jointly authored papers. This includes our work on characterizing genetic diversity using spatial ancestry analysis (SPA-(22610118)) and genotyping common and rare variants in very large population studies using overlapping pool sequencing, which can be used for the detection of cancer fusion genes from RNA sequences ((21989232)).

Thanks to the additional funding from BSF, we are expanding our current goals to address the problem of analyzing genetic data in conjunction with other data types such as epigenetic data (changes to the DNA along one’s lifetime) and RNA expression. There is strong evidence that these additional signals can provide more insight into the mechanisms of disease. For example, epigenetic changes have been shown to be strongly related to certain diseases and environmental effects.

Further, the project enables an exchange of ideas and collaborations between not only myself and Eran but also between our students. Everyone involved benefits from this collaboration of Israeli and American scientists. This is our first BSF project and we are very grateful for the support of our collaboration.

To read the full article on our collaboration and the BSF, please click here.


Up to this point, I’m sure most of you are saying, “That’s great, but what about YOUR lab? What do YOU do?”

Following the advice in the book How to Write a Lot by Paul Silvia (see our blog entry about the book here), I (and everyone else in the lab) set aside time exclusively for writing.  Given that at any time there are over a dozen papers in various states being written in the lab, how to allocate that time across the different projects is not obvious.  This piece of advice is probably more appropriate for someone running their own lab than for a student.
What I do is inspired by the book’s advice to create a priority list of our writing projects based on how close each paper is to being completed.  Our lab has been tracking our papers and projects in Evernote monthly since October 2012.  Overall, this approach, as well as setting aside dedicated time for writing, has significantly increased our lab’s overall productivity. We finish our papers much faster and spend less time being “stuck” without making much progress for long periods of time.
Here is exactly how we organize our Evernote notebook.  Each month I create a new note (this month’s note was called “Paper Organization February 2015”).  It starts as a copy of the previous month’s note and is updated as things change throughout the month.
The Evernote document has several lists of papers in order of how close they are to being completed.  Each paper entry in the list has a short title as well as the key student authors working on the paper.
Submitted Papers:
These papers are currently under review.  They are in this list because we don’t need to do any actual writing work, but periodically we should check with the editors to see what is going on with the review process.  In the note, I keep track of where the paper is submitted.  Even when a paper is accepted, I still keep it on this list until it appears in print and in PubMed.  This way we can keep track of the paper through the proof editing process, uploading copyright forms, etc.  The reason these papers are listed first is that it only takes a few minutes to check whether anything needs to be done with any of them, but if something needs to be done, it is usually urgent.
Revise and Resubmit Papers:
This category tracks the papers that have come back from review.  Regardless of whether the paper was accepted or rejected, or whether or not the journal is willing to review another version, what we need to do is revise the paper taking into account the feedback and get it resubmitted as quickly as possible.  If the journal is willing to take a revision, then we also need to write the response to reviewers.  Since these papers are so close to being completed and published, any paper in this category takes priority over the remaining categories. During my allocated writing time, I usually spend the time writing the response to the reviewers and helping the students organize what edits need to be made to address the reviews.
Active Papers:
This category keeps track of any paper that is currently being written by someone in the lab as their primary project.  I check in on these papers regularly, and hopefully whenever my scheduled writing time comes around, I have a draft of one of these papers from a student who works on it, so I can make a pass on the paper and send back the edits.  If I don’t have any drafts to edit, I have the list of students to whom I can send a reminder.
Future Papers:
This category keeps track of the papers in the lab that we plan to work on, or were working on before but the student who was working on the paper is no longer pushing it forward.  The reason we keep them separate from the Active Papers category is to keep them from distracting us when we are setting our writing priorities.  Anything in this category isn’t being actively pursued.
A few other categories that we have experimented with over the years are “Collaborator’s Papers,” where we are involved in the analysis; “Grants” that we are writing; and “Collaborator’s Grants,” where we are responsible for contributing sections.
Our lab is pretty big right now and currently, we have eight submitted papers, seven papers we are revising after reviews and 14 papers which are currently being actively written by a student. Many of these papers will be completed and published in the next six months, but for a select few, we may be working on them for the next two years. Unfortunately, this is typical, as a paper which was just published from our lab was originally submitted for the first time in December 2012.  Keeping track of these papers in this way helps us stay organized and prioritize our efforts.
Have any methods that work for you? Would you like to comment on what you’ve read so far? We’d love to hear from you!


In our last post we wrote about how to overcome writer’s block and the fear of writing. So now you’re on a schedule, and you’re ready to tackle this “writing thing.” You wake up, coffee and computer in tow, but there’s just one problem: You still can’t write! What gives?!

In his book, How to Write a Lot, Paul Silvia, PhD, acknowledges that academic writing doesn’t get easy the moment you get on a schedule (Silvia, 2007). Before, you were full of adrenaline and motivated by impending deadlines. Now that you are writing a few times a week, you aren’t in that anxiety-laden “write or be written off” state anymore. According to Silvia, there are three steps to getting your writing juices flowing:

1. Set goals.
2. Determine priorities.
3. Track your progress.

Let’s start with goals. Clear and concise goals in themselves should be motivating. Goals give you a plan of action, a sense of direction and a deadline. What do you want to write about? What projects are you working on? Are there some papers that need revising? First, make an exhaustive list of everything you would like to accomplish. Second, organize it into a list you can conquer. Break this plan of action into monthly, weekly and daily goals.

This takes us to Silvia’s second step of finding your motivation: determine priorities. With some writing projects, there are no set deadlines. Our lab is constantly developing new software and performing research projects. Some projects take weeks or months, or even get revised over a period of a few years! The research most often comes before the writing. There are, however, those moments when we write grant proposals. If we miss the deadline, we get none of the funding. Writing assignments like these definitely take higher priority the closer we get to the due date.

The third and final step to finding your motivation is tracking your progress. What better way to see how far you’ve come and the work ahead than to keep an inventory of your writing? Behavioral research shows that self-observation alone can cause the desired behaviors (Korotitsch & Nelson-Gray, 1999), in this instance writing. If you keep yourself accountable, whether that means in your planner, on your phone or with one of those wonderful spreadsheets we all love so much (only slightly kidding; every plan deserves a good spreadsheet), you are more likely to stick to your schedule and meet your goals.

In our lab, we do this through a systematic process which we will reveal in our next blog post. We have records that date back nearly three years of every project we have ever started, finished and everything in between. We have sections for published works, active papers, grants, collaborations and future research projects.

Check us out next week for an outline of how our lab has reached writing success.

Hope this helps! Give it a shot and let us know what you think in the comments below.

Cited publications:

Korotitsch, W.J., & Nelson-Gray, R.O. (1999). An overview of self-monitoring research in assessment and treatment. Psychological Assessment, 11, 415-425.

Silvia, P.J. (2007). How to Write a Lot. 29-40.

Interested in obtaining a copy? Here’s a link to Amazon.


Many who write regularly know what it’s like to be at a loss for words. Some days we can churn out ten pages and others we struggle to write ten sentences. Writing is hard, which is why it is intimidating to a lot of people, whether you’re a student or you’ve been publishing papers for years. There can be a dozen reasons why we can’t find the right words: can’t find the time, don’t feel inspired, too many distractions…

The key to developing great writing is all in the habit of writing frequently. Writing must be intentional. If you wait for the world to provide you with the perfect conditions to write (Spring Break, perhaps?), you won’t be doing much writing at all. Instead of finding time to write, you must MAKE time to write. Create a schedule and make writing a productive part of your day. A draft is never perfect the first, second or even eighth time it is written, but I can assure you it gets better every time.

This may sound like a stretch, I know. The rebuttals are already coming to mind: I really don’t have time. I have a busy schedule. I need to escape my routine to write. My life is sooo unpredictable.

What’s the worst thing that can happen if you give this a shot for the next three weeks? Set aside a time, at least a few times a week, to focus and write. Making writing intentional has to be a better option than writing your paper at the last minute on a Saturday, skipping meals on no sleep, right?

So here is my challenge to you: get off the Internet, silence your phone and start writing!


Jerry Wang defended his thesis on September 8, 2014 in 4760 Boelter Hall.

His thesis topic was Efficient Statistical Models For Detection And Analysis Of Human Genetic Variations. The video of his full defense can be viewed on the ZarlabUCLA YouTube page here.


In recent years, the advent of genotyping and sequencing technologies has enabled human genetics to discover numerous genetic variants. Genetic variations between individuals can range from Single Nucleotide Polymorphisms (SNPs) to differences in large segments of DNA, which are referred to as Structural Variations (SVs) and include insertions, deletions, and copy number variations (CNVs).

Jerry first proposed a probabilistic model, CNVeM, to detect CNVs from High-Throughput Sequencing (HTS) data. The experiments showed that CNVeM can estimate the copy numbers and boundaries of copied regions more precisely than previous methods.

Genome-wide association studies (GWAS) have discovered numerous individual SNPs involved in genetic traits. However, it is likely that complex traits are influenced by the interaction of multiple SNPs. In his thesis, Jerry proposed a two-stage statistical model, TEPAA, that greatly reduces the computational time while maintaining almost identical power to the brute force approach, which considers all combinations of SNP interactions. The experiment on the Northern Finland Birth Cohort data showed that TEPAA achieved a 63-fold speedup.

Another drawback of GWAS is that rare causal variants will not be identified. Rare causal variants are likely to have been introduced into a population recently and are likely to be in shared Identity-By-Descent (IBD) segments. Jerry proposed a new test statistic to detect IBD segments associated with quantitative traits and made a connection between the proposed statistic and linear models, so that it does not require permutations to assess the significance of an association. In addition, the method can control for population structure by utilizing linear mixed models.


The full paper on topics covered in Jerry’s thesis defense can be found below:

Wang, Zhanyong; Sul, Jae Hoon; Snir, Sagi; Lozano, Jose; Eskin, Eleazar (2014): Gene-Gene Interactions Detection Using a Two-Stage Model. In: Research in Computational Molecular Biology, pp. 340-355, Springer International Publishing, 2014. (Type: Book Chapter)
