Carnegie Mellon University
October 22, 2014

Gordon and Betty Moore Foundation Awards Data Science Grant to Carnegie Mellon Researcher

By Byron Spice

The Gordon and Betty Moore Foundation has announced the selection of Carl Kingsford, associate professor in Carnegie Mellon University’s Lane Center for Computational Biology, as one of 14 recipients of its Moore Investigators in Data-Driven Discovery awards.

The five-year, $1.5 million grant will support Kingsford’s efforts to develop efficient new methods for searching the massive amounts of DNA and RNA sequencing data now available worldwide. Many insights into the most basic and important processes of life are awaiting discovery within that data. The databases are of such scale, however, that existing search methods are inadequate to fully explore them. For instance, the National Institutes of Health alone archive more than 2 quadrillion bases of sequence.

The latest Data-Driven Discovery Awards, totaling $21 million over five years, are unrestricted awards that will enable Kingsford and his fellow recipients to make a profound impact on scientific research by unlocking new types of knowledge and advancing new data science methods across a wide spectrum of disciplines.

The awards are part of a five-year, $60 million Data-Driven Discovery Initiative within the Gordon and Betty Moore Foundation’s Science Program. The initiative — one of the largest privately funded data scientist programs of its kind — is committed to enabling new types of scientific breakthroughs by supporting interdisciplinary, data-driven researchers.

“Science is generating data at unprecedented volume, variety and velocity, but many areas of science don’t reward the kind of expertise needed to capitalize on this explosion of information,” said Chris Mentzel, program director of the Data-Driven Discovery Initiative. “We are proud to recognize these outstanding scientists, and we hope these awards will help cultivate a new type of researcher and accelerate the use of interdisciplinary, data-driven science in academia.”

Kingsford directs a computational biology group that works on understanding protein interactions, gene expression, chromatin structure and viral evolution. His team collaborates across disciplines to create efficient computational methods that can deal with diverse and high-throughput datasets.

Earlier this year, he and a collaborator at the University of Maryland announced Sailfish, a new computational method that dramatically speeds up estimates of gene activity from RNA sequencing (RNA-seq) data. With this method, estimates of gene expression that previously took many hours can be completed in a few minutes, with accuracy that equals or exceeds previous methods.

“To me, Carl’s work represents an outstanding example of the best approach to computational biology: careful framing of a biological problem followed by rigorous development and application of appropriate computer science methods,” said Robert F. Murphy, director of the Lane Center for Computational Biology. “As the volume and complexity of biomedical data increases exponentially, his scalable approaches and commitment to open source software will be critical to enabling new and clinically important discoveries.”

Kingsford is the recipient of an Alfred P. Sloan Research Fellowship in computational and evolutionary molecular biology and a National Science Foundation CAREER Award. He earned a Ph.D. in computer science from Princeton University and also trained at Duke University.