Integrating Genomics and Structural Biology Reveals Mechanisms of Gene Regulation
Our current knowledge of genome function is the result of sequence-based data in the form of one-dimensional strings of letters. However, DNA-binding proteins recognize the double helix as a three-dimensional object. Therefore, an understanding of transcription factor (TF) binding specificity must ultimately include DNA shape. The sequence-structure relationship in DNA is highly degenerate, and different nucleotide sequences can give rise to the same structure, while single nucleotide sequence variants sometimes change DNA shape over a region of several base pairs. DNA methylation is another way of altering DNA shape. To explore these effects on a genomic scale, we developed methods for the high-throughput prediction of DNA shape features. We used these structural features to augment nucleotide sequence in binding specificity models derived from statistical machine learning approaches and learned in vitro DNA binding specificity models from high-throughput binding assays. Based on data for many TFs from diverse protein families, we demonstrated that shape-augmented models are generally more efficient than existing sequence-based models in terms of accuracy, number of features, and computation time. Our models provide information on the importance of specific DNA sequence and shape features and thus reveal TF family-specific readout mechanisms and better explain why a given TF binds in vivo to a specific genomic target site.
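As a rough illustration of what "augmenting nucleotide sequence with shape features" can mean in practice, the sketch below concatenates a one-hot sequence encoding with one shape value per position and fits a regularized linear model. It is a minimal sketch under stated assumptions, not the models described in the talk: the function names (`one_hot`, `augment`, `fit_ridge`), the choice of ridge regression, and every numeric value are hypothetical placeholders, not predicted shape features or measured binding data.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA sequence as a flat binary vector of length 4 * len(seq)."""
    vec = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        vec[i, BASES.index(base)] = 1.0
    return vec.ravel()

def augment(seq, shape_values):
    """Append per-position shape values to the one-hot sequence encoding."""
    shape_vec = np.asarray(shape_values, dtype=float).ravel()
    return np.concatenate([one_hot(seq), shape_vec])

def fit_ridge(X, y, lam=1.0):
    """Closed-form L2-regularized least squares: w = (X^T X + lam*I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Toy inputs: all numbers below are made-up placeholders for illustration only.
seqs = ["ACGT", "TTGA", "GGCA"]
toy_shape = [
    [5.1, 4.8, 5.0, 5.3],
    [4.2, 4.0, 4.9, 5.5],
    [5.6, 5.4, 5.0, 4.7],
]
toy_affinity = np.array([0.9, 0.2, 0.5])

# Each row holds 16 sequence features followed by 4 shape features.
X = np.stack([augment(s, f) for s, f in zip(seqs, toy_shape)])
w = fit_ridge(X, toy_affinity)
print(X.shape, w.shape)
```

In a model of this form, the magnitudes of the learned weights on the sequence and shape columns give one simple way to ask which features matter most for binding.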
About Dr. Rohs:
The main goal of my current research is to integrate two areas of study that have developed along parallel lines, largely disconnected from each other: DNA sequence and structure. While sequence research analyzes DNA in a high-throughput manner and on a genome-wide basis, structure research provides three-dimensional information on DNA and proteins at atomic resolution. My postdoctoral research provided me with extensive training in DNA and protein-DNA structure analysis and prediction. In my own laboratory, we are currently developing new methodologies for the high-throughput prediction of DNA shape and its role in transcription factor binding. The ability to predict DNA structure on a genomic scale will change how sequence data are analyzed.
M.S. Physics, Humboldt Universität Berlin, Germany, 1997
Ph.D. Biochemistry, Freie Universität Berlin, Germany, 2003
Other Degree Business, Columbia University, New York, USA, 2009
Rohs Lab Website