Human Genomics

Large-scale human genome sequencing has made a wealth of information available for genetic analyses such as genome-wide association studies (GWAS) and for inferring the demographic histories of populations. However, the quality of this information varies across reference databases, and population-specific databases are increasingly being generated to provide more targeted information for these studies. Generating such reference databases is a prerequisite for these analyses, and assessing their quality is important for understanding which applications a particular dataset is suited to and how reliable the resulting analyses are. The lab is interested in the generation and analysis of such databases, in collaboration with various experimental groups in academia and industry. In particular, the lab is interested in the GenomeAsia 100K Consortium, a project to sequence 100,000 Asian individuals in order to support the study of medical questions specific to Asian populations and to address demographic questions about those populations.


Baboon Genetics

Baboons are ground-living monkeys native to Africa and the Arabian Peninsula.  Due to their relatively large size, abundance and omnivorous diet, baboons have increasingly become a major biomedical model system.  Baboon research has been facilitated by the creation (in 1960) and maintenance of a large, pedigreed, well-phenotyped baboon colony at the Southwest National Primate Research Center (SNPRC), and by the ability to control the environment of subjects in ways that are not possible in human biomedical studies.  For example, baboons have been used to study the effect of diet on cholesterol and triglyceride levels in experiments in which all food consumed is controlled (McGill et al. 1981; Kushwaha et al. 1994; Singh et al. 1996).  In recent years, linkage studies in baboons have helped identify genetic regions affecting a wide range of phenotypes, such as cholesterol levels (Mahaney 1999; Kammerer et al. 2002), estrogen levels (Martin et al. 2001), craniofacial measurements (Sherwood et al. 2008), bone density (Havill et al. 2005a, 2005b) and lipoprotein metabolism (Rainwater 2009).  Studies have also documented that the genetic architecture of complex traits in baboons can be directly informative about analogous traits in humans (Havill et al. 2005a; Cox et al. 2009).

The success of these and other studies has been mediated in part by recent advances in molecular genetics technologies.  In particular, the ability to cheaply genotype and/or sequence samples of interest has led to a revolution in studies of the association between genotype and phenotype.  While human genetic studies now routinely include the analysis of whole-genome data from tens of thousands of samples, comparable studies in model organisms have lagged far behind.  Part of the reason for this is the lack of genetic resources in non-human species.  Large, international projects such as the Human Genome Project, the International HapMap Project and the 1000 Genomes Project have provided baseline information on reference sequences and genetic variation, and subsequent human genetic studies have built on this background information.  Our project focuses on providing similar genomic resources for baboons (Papio anubis), with the hope that these resources will enable future high-resolution genotype-phenotype studies.  Specifically, we are generating the following genetic resources using samples from the SNPRC pedigreed baboon colony:

1)  A de novo genome assembly of P. anubis – This assembly uses a combination of 10X, BioNano, Oxford Nanopore and Phase Genomics data to produce single scaffolds covering all 20 autosomes and the X chromosome.  This assembly is of substantially higher quality than the public (but embargoed) baboon assembly Panu_3.0, and will be freely available for download here in Fall 2018.

2)  Whole-genome sequence data from >850 individuals from the SNPRC baboon colony.  These data contain a mix of high-coverage and low-coverage whole-genome sequences, available for download from NCBI BioProject PRJNA433868.

3)  A high-resolution genetic map for P. anubis – We have used standard LD-based methods to generate a baboon genetic map.  Output files will be available here in Fall 2018.

4)  A list of baboon SNPs and their frequencies – Using the data from (1) and (2), we are compiling a list of common baboon SNPs and their frequencies.  Mapping and variant calling relative to the new baboon genome assembly are ongoing and are expected to be finished in early 2019.
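
As a rough illustration of the kind of summary described in item (4), the sketch below computes alternate-allele frequencies from diploid genotype calls and keeps only common sites.  It is not the pipeline used for the baboon data: the function names, the site labels, the genotype encoding (0, 1 or 2 copies of the alternate allele, None for a missing call) and the 5% minor-allele-frequency cutoff are all assumptions made for the example.

```python
def alt_allele_frequency(genotypes):
    """Return the alternate-allele frequency at one biallelic site.

    genotypes: list of diploid calls encoded as the number of alternate
    alleles (0, 1 or 2), with None marking a missing call (assumed encoding).
    Returns None if no individual has a called genotype.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # Each called diploid individual contributes two alleles.
    return sum(called) / (2 * len(called))


def common_snps(site_genotypes, min_maf=0.05):
    """Keep sites whose minor-allele frequency is at least min_maf.

    site_genotypes: dict mapping a site label to its list of genotype calls.
    min_maf: hypothetical cutoff for calling a SNP "common".
    Returns a dict mapping each retained site to its alternate-allele frequency.
    """
    kept = {}
    for site, genotypes in site_genotypes.items():
        freq = alt_allele_frequency(genotypes)
        if freq is None:
            continue
        minor = min(freq, 1 - freq)
        if minor >= min_maf:
            kept[site] = freq
    return kept


# Toy input with made-up site labels; not real baboon data.
example = {
    "chr1:1000": [0, 1, 2, None],  # 3 alternate alleles among 6 called alleles
    "chr1:2000": [0, 0, 0, 0],     # monomorphic, so it is filtered out
}
print(common_snps(example))
```

Simple allele counting of this kind treats all individuals as independent, which does not hold in a pedigreed colony; a real analysis would need to account for relatedness among samples.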


Macaque Genetics

Several species of macaques (genus Macaca) are widely used in biomedical research.  Macaques are also a good model for studying speciation and hybridization, due to their apparent history of isolation followed by secondary contact (and thus the large number of extant hybrid zones).  Our interests focus on (1) using captive macaque colonies housed at the National Primate Research Centers as natural models of human disease; (2) characterizing the evolution of PRDM9 and fine-scale recombination rates across a range of closely related species; and (3) studying the patterns of extant and historical admixture between closely related macaque species.

Our work on (1) concentrates on the Japanese macaque (M. fuscata) colony housed at the Oregon National Primate Research Center.  We study the genetic basis of Japanese Macaque Encephalomyelitis (JME), a spontaneous demyelinating disease that is a natural analogue of multiple sclerosis, and of an autosomal dominant drusen phenotype that is analogous to Doyne honeycomb retinal dystrophy.  We have generated a de novo M. fuscata genome assembly, using a combination of 10X Genomics and BioNano Genomics data, as well as high-coverage whole-genome sequence data from 50 Japanese macaques.  Additional sequencing of colony members is planned over the coming years.

For (2), we are utilizing publicly available genome sequences from M. mulatta and M. fascicularis, in combination with whole-genome sequence data we have generated from M. fuscata (described above) and M. nemestrina (NCBI BioProject PRJNA507022).  We also generated a de novo genome assembly of M. nemestrina (using 10X and BioNano data) for this work.

Finally, we are sequencing single representatives from a wide range of macaque species to look at broader patterns of admixture and hybridization in the genus.  We are also analyzing publicly available rhesus data, in combination with our M. fuscata sequences, to study in greater depth the patterns of admixture between these two species.