Covariate-modulated local false discovery rate for genome-wide association studies.

Title: Covariate-modulated local false discovery rate for genome-wide association studies.
Publication Type: Journal Article
Year of Publication: 2014
Authors: Zablocki RW, Schork AJ, Levine RA, Andreassen OA, Dale AM, Thompson WK
Journal: Bioinformatics
Volume: 30
Issue: 15
Pagination: 2098-104
Date Published: 2014 Aug 1
ISSN: 1367-4811
Keywords: Analysis of Variance, Bayes Theorem, Computational Biology, False Positive Reactions, Genome-Wide Association Study, Humans, Polymorphism, Single Nucleotide
Abstract

MOTIVATION: Genome-wide association studies (GWAS) have largely failed to identify most of the genetic basis of highly heritable diseases and complex traits. Recent work has suggested this could be because many genetic variants, each with individually small effects, compose their genetic architecture, limiting the power of GWAS, given currently obtainable sample sizes. In this scenario, Bonferroni-derived thresholds are severely underpowered to detect the vast majority of associations. Local false discovery rate (fdr) methods provide more power to detect non-null associations, but implicit assumptions about the exchangeability of single nucleotide polymorphisms (SNPs) limit their ability to discover non-null loci.

METHODS: We propose a novel covariate-modulated local false discovery rate (cmfdr) that incorporates prior information about gene element-based functional annotations of SNPs, so that SNPs from categories enriched for non-null associations have a lower fdr for a given value of a test statistic than SNPs in unenriched categories. This readjustment of fdr based on functional annotations is achieved empirically by fitting a covariate-modulated parametric two-group mixture model. The proposed cmfdr methodology is applied to a large Crohn's disease GWAS.
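
As a rough illustration of how a covariate-modulated two-group mixture lowers the local fdr for SNPs in enriched categories, here is a minimal Python sketch. It is not the authors' implementation. The standard normal null density, the zero-mean normal non-null density, the logistic link for the prior, and all parameter values (`intercept`, `slopes`, `sd_nonnull`) are assumptions chosen for illustration, and the parameters are fixed here rather than fitted to data as the abstract describes.

```python
import math


def normal_pdf(z, sd):
    """Density of a zero-mean normal with standard deviation sd, evaluated at z."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def nonnull_prior(x, intercept, slopes):
    """Prior probability that a SNP is non-null given its annotation vector x.

    Assumption: a logistic link on the annotations (hypothetical choice).
    """
    eta = intercept + sum(b * xi for b, xi in zip(slopes, x))
    return 1.0 / (1.0 + math.exp(-eta))


def cmfdr(z, x, intercept=-3.0, slopes=(1.5,), sd_nonnull=3.0):
    """Local fdr for test statistic z under a two-group mixture whose
    mixing proportion depends on the covariates x.

    fdr(z | x) = pi0(x) f0(z) / (pi0(x) f0(z) + pi1(x) f1(z))
    All default parameter values are illustrative, not estimates.
    """
    pi1 = nonnull_prior(x, intercept, slopes)
    null_part = (1.0 - pi1) * normal_pdf(z, 1.0)      # assumed N(0, 1) null
    alt_part = pi1 * normal_pdf(z, sd_nonnull)        # assumed N(0, sd^2) non-null
    return null_part / (null_part + alt_part)


# Same z-score, two hypothetical annotation categories.
z = 3.5
fdr_unenriched = cmfdr(z, x=(0,))  # SNP outside the enriched category
fdr_enriched = cmfdr(z, x=(1,))    # SNP inside the enriched category
print(f"unenriched: {fdr_unenriched:.3f}, enriched: {fdr_enriched:.3f}")
```

With a positive slope on the annotation, the same test statistic yields a smaller fdr for the enriched SNP, which is the qualitative behavior the abstract describes.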

RESULTS: Use of cmfdr dramatically improves power, e.g. increasing the number of loci declared significant at the 0.05 fdr level by a factor of 5.4. We also demonstrate that SNPs declared significant using cmfdr replicate in much higher numbers than those declared significant using the usual fdr, while maintaining similar replication rates for a given fdr cutoff in de novo samples, using the eight Crohn's disease substudies as independent training and test datasets.

AVAILABILITY AND IMPLEMENTATION: https://sites.google.com/site/covmodfdr/

CONTACT: wes.stat@gmail.com

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

DOI: 10.1093/bioinformatics/btu145
Alternate Journal: Bioinformatics
PubMed ID: 24711653
PubMed Central ID: PMC4103587
Grant List: R01 GM104400 / GM / NIGMS NIH HHS / United States
Category: IRG Funded