Scaling Deep Learning

Many state-of-the-art results in deep learning are achieved using multiple GPUs. For some of the largest and most data-intensive models, training on a single CPU or GPU can take months or even years. Scaling out to large numbers of GPUs/TPUs speeds training up. ...read more
Posted on Feb 21, 2023
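
As a rough illustration of the idea, below is a minimal sketch of data-parallel training across several GPUs with PyTorch's DistributedDataParallel. The toy model, synthetic data, and hyperparameters are placeholder assumptions, and the script assumes it is launched with `torchrun --nproc_per_node=<num_gpus> train.py`.

```python
# Minimal data-parallel training sketch (assumes launch via torchrun).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    # torchrun sets RANK / LOCAL_RANK / WORLD_SIZE environment variables.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model and synthetic data stand in for a real network and dataset.
    model = torch.nn.Linear(32, 10).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    data = TensorDataset(torch.randn(1024, 32), torch.randint(0, 10, (1024,)))
    sampler = DistributedSampler(data)  # each rank sees a distinct shard
    loader = DataLoader(data, batch_size=64, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()  # gradients are all-reduced across processes
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Each process holds a replica of the model and a distinct shard of the data; gradients are averaged across processes during the backward pass, so the effective batch size grows with the number of GPUs.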

Knowledge Distillation as Self-Supervised Learning

Self-supervised learning (SSL) methods have been shown to effectively train large neural networks with unlabeled data. These networks can produce useful image representations that exceed the performance of supervised pretraining on downstream tasks. However, SSL is not effective with smaller models. This limits applications where computational power ...read more
Posted on Jan 11, 2022
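
To make the idea concrete, here is a minimal sketch of distilling a large, frozen teacher network into a much smaller student without labels, by having the student match the teacher's temperature-softened outputs. The networks, temperature, and synthetic data are illustrative assumptions, not the exact method discussed in the post.

```python
# Label-free knowledge distillation sketch: small student mimics a frozen teacher.
import torch
import torch.nn.functional as F

temperature = 4.0
teacher = torch.nn.Sequential(
    torch.nn.Linear(32, 256), torch.nn.ReLU(), torch.nn.Linear(256, 64)
)
student = torch.nn.Linear(32, 64)   # far fewer parameters than the teacher
teacher.eval()                      # teacher is pretrained and frozen

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(128, 32)        # unlabeled batch
    with torch.no_grad():
        t_out = teacher(x)
    s_out = student(x)
    # KL divergence between softened teacher and student distributions.
    loss = F.kl_div(
        F.log_softmax(s_out / temperature, dim=1),
        F.softmax(t_out / temperature, dim=1),
        reduction="batchmean",
    ) * temperature**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because the training signal comes entirely from the teacher's outputs, no labels are needed, which is what makes distillation a natural companion to self-supervised pretraining.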

Domain Adaptation

A machine learning model's performance depends on the dataset it is trained on. Datasets are imperfect, so problems in the data affect the models. One type of problem is domain shift: a model trained to learn a task on one dataset may not be able to ...read more
Posted on Aug 09, 2021
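
A toy illustration of domain shift, under assumed synthetic Gaussian data and a simple linear classifier: a model fit on a "source" domain loses accuracy when evaluated on a shifted "target" domain.

```python
# Toy domain-shift demo: train on source data, evaluate on shifted target data.
import torch

torch.manual_seed(0)

def make_domain(n, shift):
    # Two Gaussian classes; `shift` moves the whole domain's inputs.
    x0 = torch.randn(n, 2) + torch.tensor([shift, 0.0])
    x1 = torch.randn(n, 2) + torch.tensor([shift + 2.0, 2.0])
    x = torch.cat([x0, x1])
    y = torch.cat([torch.zeros(n, dtype=torch.long), torch.ones(n, dtype=torch.long)])
    return x, y

src_x, src_y = make_domain(500, shift=0.0)   # training (source) domain
tgt_x, tgt_y = make_domain(500, shift=3.0)   # shifted test (target) domain

model = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    torch.nn.functional.cross_entropy(model(src_x), src_y).backward()
    opt.step()

def accuracy(x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

print(f"source accuracy: {accuracy(src_x, src_y):.2f}")
print(f"target accuracy: {accuracy(tgt_x, tgt_y):.2f}")  # typically much lower
```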

Pruning Neural Networks

Much of the success of deep learning has come from building larger and larger neural networks. This allows these models to perform better on various tasks, but it also makes them more expensive to use. Larger models take up more storage space, which makes them harder to distribute. Larger models ...read more
Posted on Sep 01, 2020
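
As one concrete example of the technique, here is a minimal sketch of global magnitude pruning using PyTorch's torch.nn.utils.prune utilities; the toy model and the 50% sparsity level are illustrative assumptions.

```python
# Global magnitude pruning sketch: zero out the smallest weights across layers.
import torch
import torch.nn.utils.prune as prune

model = torch.nn.Sequential(
    torch.nn.Linear(784, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

# Prune the 50% of weights with the smallest L1 magnitude, pooled over both layers.
parameters_to_prune = [
    (model[0], "weight"),
    (model[2], "weight"),
]
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Make the pruning permanent (folds the mask into the weight tensor).
for module, name in parameters_to_prune:
    prune.remove(module, name)

sparsity = (model[0].weight == 0).float().mean().item()
print(f"first layer sparsity: {sparsity:.2f}")
```

After prune.remove, the zeros are baked directly into the weight tensors, so the pruned model can be saved and deployed without carrying the pruning masks around.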