A loop-counting method for covariate-corrected low-rank biclustering of gene-expression and genome-wide association study data.

Author: Aaditya Rangan V, Caroline McGrouther, John Kelsoe, Nicholas Schork, Eli Stahl, Qian Zhu, Arjun Krishnan, Vicky Yao, Olga Troyanskaya, Seda Bilaloglu, Preeti Raghavan, Sarah Bergen, Anders Jureus, Mikael Landen, Bipolar Consortium
Publication Year: 2018
Type: Journal Article

Abstract: A common goal in data-analysis is to sift through a large data-matrix and detect any significant submatrices (i.e., biclusters) that have a low numerical rank. We present a simple algorithm for tackling this biclustering problem. Our algorithm accumulates information about 2-by-2 submatrices (i.e., 'loops') within the data-matrix, and focuses on rows and columns of the data-matrix that participate in an abundance of low-rank loops. We demonstrate, through analysis and numerical-experiments, that this loop-counting method performs well in a variety of scenarios, outperforming simple spectral methods in many situations of interest. Another important feature of our method is that it can easily be modified to account for aspects of experimental design which commonly arise in practice. For example, our algorithm can be modified to correct for controls, categorical- and continuous-covariates, as well as sparsity within the data. We demonstrate these practical features with two examples; the first drawn from gene-expression analysis and the second drawn from a much larger genome-wide-association-study (GWAS).

Keywords: Female; Male; Humans; Gene Expression Profiling; Cluster Analysis; Algorithms; Breast Neoplasms; Databases, Genetic; Genome-Wide Association Study; Bipolar Disorder

Journal: PLoS Comput Biol
Volume: 14
Issue: 5
Pages: e1006105
Date Published: 05/2018
ISSN Number: 1553-7358
DOI: 10.1371/journal.pcbi.1006105
Alternate Journal: PLoS Comput. Biol.
PMCID: PMC5997363
PMID: 29758032
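
The loop-counting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the data are first binarized to +1/-1 by sign (an assumption of this sketch), so that a 2-by-2 loop has rank 1 exactly when the product of its four entries is +1. It only ranks rows (or columns) once by their net count of rank-1 loops, and it omits the covariate, control, and sparsity corrections the abstract mentions. All function names and the planted test matrix are hypothetical.

```python
import random


def binarize(matrix):
    """Replace every entry by its sign (+1 or -1); zero is mapped to +1."""
    return [[1 if x >= 0 else -1 for x in row] for row in matrix]


def transpose(matrix):
    """Swap rows and columns so column scores can reuse the row routine."""
    return [list(col) for col in zip(*matrix)]


def row_loop_scores(b):
    """Score each row by (rank-1 loops) minus (rank-2 loops) it takes part in.

    A loop is the 2-by-2 submatrix on rows (j, j2) and columns (k, k2).
    For +/-1 entries the product of its four entries is +1 when the loop
    has rank 1 and -1 when it has rank 2.
    """
    n_rows, n_cols = len(b), len(b[0])
    scores = [0] * n_rows
    for j in range(n_rows):
        for j2 in range(j + 1, n_rows):
            g = sum(x * y for x, y in zip(b[j], b[j2]))
            # g * g sums the loop product over all ordered column pairs
            # (k, k2); the n_cols degenerate terms with k == k2 each
            # contribute 1 and are subtracted.
            loops = g * g - n_cols
            scores[j] += loops
            scores[j2] += loops
    return scores


def make_planted_demo(n=80, m=30, seed=0):
    """Hypothetical test case: Gaussian noise with a rank-1 block planted
    in the first m rows and first m columns."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    u = [rng.gauss(0.0, 1.0) for _ in range(m)]
    v = [rng.gauss(0.0, 1.0) for _ in range(m)]
    for j in range(m):
        for k in range(m):
            data[j][k] = u[j] * v[k]
    return data, list(range(m))


if __name__ == "__main__":
    data, planted = make_planted_demo()
    scores = row_loop_scores(binarize(data))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    print("highest-scoring rows:", sorted(ranked[:len(planted)]))
```

Column scores follow from the same routine applied to `transpose(binarize(data))`. The pairwise inner product `g` keeps the cost at one pass over row pairs rather than an explicit enumeration of every loop.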