Princeton University Machine Learning in Physics
Foundation Models for Science: What happens if we train large (language) models on scientific data?
In recent years, the fields of natural language processing and computer vision have been revolutionized by the success of large models pretrained with task-agnostic objectives on massive, diverse datasets. This has, in part, been driven by the use of self-supervised pretraining methods, which allow models to utilize far more training data than would be accessible with supervised training. These so-called "foundation models" have enabled transfer learning on entirely new scales. Despite their task-agnostic pretraining, the features they extract have been leveraged as a basis for task-specific finetuning, outperforming supervised training alone across numerous problems, especially for transfer to settings that are not data-rich enough to train large models from scratch. In this talk, I will show our preliminary results from applying this approach to a variety of scientific problems and speculate on possible future directions.
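
To make the pretrain-then-finetune paradigm described in the abstract concrete, below is a minimal sketch (not the speaker's actual method): a small Transformer encoder is first pretrained with a task-agnostic, self-supervised masked-token objective on unlabeled sequences, and the same encoder is then finetuned with a small task-specific head on a handful of labeled examples. All names, sizes, and the toy downstream regression task are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SequenceEncoder(nn.Module):
    """Small Transformer encoder producing per-token features."""
    def __init__(self, vocab_size=256, d_model=64, nhead=4, nlayers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)

    def forward(self, tokens):                   # tokens: (batch, seq)
        return self.encoder(self.embed(tokens))  # (batch, seq, d_model)

# --- Stage 1: task-agnostic, self-supervised pretraining (masked-token prediction) ---
encoder = SequenceEncoder()
lm_head = nn.Linear(64, 256)                 # predict identities of the masked tokens
opt = torch.optim.Adam(list(encoder.parameters()) + list(lm_head.parameters()), lr=1e-3)

tokens = torch.randint(1, 256, (8, 32))      # stand-in for a large unlabeled corpus
mask = torch.rand(tokens.shape) < 0.15       # hide ~15% of tokens
corrupted = tokens.masked_fill(mask, 0)      # token 0 acts as the [MASK] symbol

logits = lm_head(encoder(corrupted))         # (batch, seq, vocab)
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
opt.step()

# --- Stage 2: task-specific finetuning on a small labeled dataset ---
reg_head = nn.Linear(64, 1)                  # e.g. predict one physical quantity per sequence
ft_opt = torch.optim.Adam(list(encoder.parameters()) + list(reg_head.parameters()), lr=1e-4)

small_x = torch.randint(1, 256, (4, 32))     # few labeled examples
small_y = torch.randn(4, 1)

ft_opt.zero_grad()
pred = reg_head(encoder(small_x).mean(dim=1))  # pool token features, then regress
ft_loss = nn.functional.mse_loss(pred, small_y)
ft_loss.backward()
ft_opt.step()
```

The key design point the sketch illustrates is that the pretraining objective requires no labels, so the encoder can consume arbitrarily large unlabeled datasets before being adapted, at a much lower data cost, to the downstream task.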