Jerry Wang defended his thesis on September 8, 2014, in 4760 Boelter Hall.
His thesis topic was "Efficient Statistical Models For Detection And Analysis Of Human Genetic Variations." The video of his full defense can be viewed on the ZarlabUCLA YouTube page.
In recent years, the advent of genotyping and sequencing technologies has enabled human genetics to discover numerous genetic variants. Genetic variations between individuals can range from Single Nucleotide Polymorphisms (SNPs) to differences in large segments of DNA, which are referred to as Structural Variations (SVs), including insertions, deletions, and copy number variations (CNVs).
Jerry first proposed a probabilistic model, CNVeM, to detect CNVs from High-Throughput Sequencing (HTS) data. The experiment showed that CNVeM can estimate the copy numbers and boundaries of copied regions more precisely than previous methods.
Genome-wide association studies (GWAS) have discovered numerous individual SNPs involved in genetic traits. However, complex traits are likely influenced by interactions among multiple SNPs. In his thesis, Jerry proposed a two-stage statistical model, TEPAA, that greatly reduces computation time while maintaining almost the same power as the brute-force approach, which considers all combinations of SNP interactions. The experiment on the Northern Finland Birth Cohort data showed that TEPAA achieved a 63-fold speedup.
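To give a sense of why a two-stage design saves time, here is a minimal sketch of generic two-stage screening. It is not TEPAA itself, whose details are not described in this post. The function names, the marginal statistics, and the threshold are all hypothetical, chosen only for illustration. Stage 1 keeps SNPs whose marginal statistic passes a threshold, and stage 2 enumerates pairs only among the survivors, so far fewer pairs are tested than in the brute-force approach.

```python
from itertools import combinations
from math import comb


def brute_force_pair_count(n_snps):
    """Number of SNP pairs a brute-force search would test."""
    return comb(n_snps, 2)


def two_stage_pairs(marginal_stats, threshold):
    """Generic two-stage screen (illustrative assumption, not TEPAA).

    Stage 1: keep SNPs whose absolute marginal statistic >= threshold.
    Stage 2: return all pairs among the retained SNPs.
    """
    kept = [i for i, s in enumerate(marginal_stats) if abs(s) >= threshold]
    return list(combinations(kept, 2))


# Hypothetical marginal statistics for six SNPs (made-up values).
stats = [0.2, 3.1, -2.8, 0.5, 4.0, 1.0]
pairs = two_stage_pairs(stats, threshold=2.5)

print(brute_force_pair_count(len(stats)))  # 15 pairs in the brute-force search
print(pairs)                               # [(1, 2), (1, 4), (2, 4)]
```

In this toy setting the screen tests 3 pairs instead of 15. The trade-off in any such scheme is that a pair whose SNPs both have weak marginal signals is never tested, which is why the choice of threshold determines how much power is retained.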
Another drawback of GWAS is that rare causal variants will not be identified. Rare causal variants are likely to have been introduced into a population recently and are likely to lie in shared Identity-By-Descent (IBD) segments. Jerry proposed a new test statistic to detect IBD segments associated with quantitative traits and connected the proposed statistic to linear models, so that assessing the significance of an association does not require permutations. In addition, the method can control for population structure by utilizing linear mixed models.
The full paper on topics covered in Jerry’s thesis defense can be found below:
In: Research in Computational Molecular Biology, pp. 340-355, Springer International Publishing, 2014.