Data-driven design is making headway into a number of application areas, including protein, small-molecule, and materials engineering. The design goal is to construct an object with desired properties, such as a protein that binds to a target more tightly than previously observed. To that end, costly experimental measurements are being replaced with calls to a high-capacity regression model trained on labeled data, which can be leveraged in an in silico search for promising design candidates. The aim then is to discover designs that are better than the best design in the observed data. This goal puts machine-learning based design in a much more difficult spot than traditional applications of predictive modelling, since successful design requires, by definition, some degree of extrapolation---a pushing of the predictive models to its unknown limits, in parts of the design space that are a priori unknown. In this talk, I will anchor this overall problem in protein engineering, and discuss our emerging approaches to tackle it.
Seminar on Theoretical Machine Learning
Machine learning-based design (of proteins, small molecules and beyond)
University of California, Berkeley
We welcome broad participation in our seminar series. To receive login details, interested participants will need to fill out a registration form accessible from the link below. Upcoming seminars in this series can be found here.
Remote Access Only - see link below