New Blog series – Memoirs of a TorchVision developer


I’m starting a new blog post series about the development of PyTorch’s computer vision library. I plan to discuss interesting upcoming features, primarily from TorchVision and secondarily from the wider PyTorch ecosystem. My aim is to highlight new and in-development features and to provide clarity on what is happening between releases. Though the format is likely to change over time, I initially plan to keep it bite-sized and offer references for those who want to dig deeper. Finally, instead of publishing articles at fixed intervals, I’ll be posting when I have enough interesting topics to cover.

Disclaimer: The features covered will be biased towards topics I’m personally interested in. The PyTorch ecosystem is massive and I only have visibility over a tiny part of it. Covering (or not covering) a feature says nothing about its importance. Opinions expressed are solely my own.

With that out of the way, let’s see what’s cooking:

Label Smoothing for CrossEntropy Loss

A highly requested feature in PyTorch is support for soft targets and a label smoothing option in the Cross Entropy loss. Both features aim to make Label Smoothing easy to do: the first offers more flexibility when Data Augmentation techniques such as mixup/cutmix are used, while the second is more performant for the simple cases. The soft targets option has already been merged into master by Joel Schlosser, while the label_smoothing option is being developed by Thomas J. Fan and is currently under review.
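To make the relationship between the two options concrete, here is a minimal framework-free sketch in plain Python. It is not the PyTorch implementation: the helper names `smooth_labels` and `soft_cross_entropy` are hypothetical, and the smoothing rule used (keep `1 - smoothing` on the true class and spread `smoothing` uniformly over all classes) is the common formulation, assumed here for illustration.

```python
import math


def smooth_labels(target, num_classes, smoothing=0.1):
    """Turn a class index into a smoothed probability distribution.

    Assumed rule: every class gets smoothing / num_classes, and the
    true class additionally gets 1 - smoothing.
    """
    off_value = smoothing / num_classes
    probs = [off_value] * num_classes
    probs[target] += 1.0 - smoothing
    return probs


def soft_cross_entropy(logits, target_probs):
    """Cross entropy between raw logits and a target distribution."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(
        sum(math.exp(z - max_logit) for z in logits)
    )
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(q * lp for q, lp in zip(target_probs, log_probs))


logits = [2.0, 0.5, -1.0]

# smoothing=0.0 reproduces the usual one-hot ("hard") target.
hard = smooth_labels(0, num_classes=3, smoothing=0.0)
soft = smooth_labels(0, num_classes=3, smoothing=0.1)

loss_hard = soft_cross_entropy(logits, hard)
loss_soft = soft_cross_entropy(logits, soft)
```

A loss that accepts soft targets covers label smoothing as a special case, and it also accepts the mixed targets produced by mixup/cutmix. A dedicated smoothing option only needs the class index and a single float, which is why it can be cheaper for the simple case.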

New Warm-up Scheduler

Learning rate warm-up is a common technique used when training models, but until now PyTorch didn’t offer an off-the-shelf solution. Recently, Ilqar Ramazanli introduced a new Scheduler supporting linear and constant warm-up. Work on improving the chainability and composability of the existing schedulers is currently in progress.
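As a rough illustration of the two warm-up modes, the sketch below computes the multiplier applied to the base learning rate at each step. It is plain Python rather than the new Scheduler itself; the function name `warmup_factor` and the parameters `warmup_iters` and `start_factor` are hypothetical names chosen for this example.

```python
def warmup_factor(step, warmup_iters=5, start_factor=0.1, method="linear"):
    """Multiplier for the base learning rate at a given step.

    Hypothetical helper: after `warmup_iters` steps the factor is 1.0,
    i.e. the base learning rate is used unchanged.
    """
    if method not in ("linear", "constant"):
        raise ValueError(f"unknown warm-up method: {method!r}")
    if step >= warmup_iters:
        return 1.0
    if method == "constant":
        # Hold a reduced learning rate for the whole warm-up period.
        return start_factor
    # Linear: interpolate from start_factor up to 1.0.
    alpha = step / warmup_iters
    return start_factor + (1.0 - start_factor) * alpha


base_lr = 0.1
linear_lrs = [base_lr * warmup_factor(s, method="linear") for s in range(7)]
constant_lrs = [base_lr * warmup_factor(s, method="constant") for s in range(7)]
```

Writing the warm-up as a multiplicative factor is what makes chaining natural: the factor can be multiplied with whatever a second schedule (step decay, cosine, and so on) produces for the same step.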

TorchVision with “Batteries included”

This half, we are working on adding to TorchVision the popular Models, Losses, Schedulers, Data Augmentations and other utilities used to achieve state-of-the-art results. The project is aptly named “Batteries included” and is currently in progress.

Earlier this week, I added a new layer called StochasticDepth, which can be used to randomly drop residual branches in residual architectures. I’m currently working on an implementation of the popular EfficientNet architecture. Finally, Allen Goodman is adding a new operator that will convert Segmentation Masks to Bounding Boxes.
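The idea behind stochastic depth can be sketched without any framework. The snippet below is not the TorchVision layer: it is a plain-Python illustration that assumes per-sample dropping (each row of the batch keeps or loses its residual branch independently) and rescales surviving branches by `1 / (1 - p)` so the expected output is unchanged. The helper names `stochastic_depth` and `residual_block` are used here for illustration only.

```python
import random


def stochastic_depth(rows, p, training=True, rng=random):
    """Randomly zero whole rows (samples) of a residual branch output.

    Assumptions for this sketch: `rows` is a list of per-sample lists,
    0 <= p < 1 is the drop probability, and kept rows are scaled by
    1 / (1 - p).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # At evaluation time the branch is always kept as-is.
        return [list(row) for row in rows]
    survival = 1.0 - p
    out = []
    for row in rows:
        keep = rng.random() < survival
        scale = 1.0 / survival if keep else 0.0
        out.append([value * scale for value in row])
    return out


def residual_block(x_rows, branch, p, training=True, rng=random):
    """Compute x + stochastic_depth(branch(x)) for every sample."""
    branch_out = stochastic_depth(
        [branch(row) for row in x_rows], p, training, rng
    )
    return [
        [x + b for x, b in zip(x_row, b_row)]
        for x_row, b_row in zip(x_rows, branch_out)
    ]


def double(row):
    """Toy residual branch used only for this example."""
    return [2.0 * value for value in row]


batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
rng = random.Random(0)  # seeded so repeated runs drop the same rows

train_out = residual_block(batch, double, p=0.5, training=True, rng=rng)
eval_out = residual_block(batch, double, p=0.5, training=False)
```

During training, each sample either skips the branch entirely (the output equals the input) or receives the rescaled branch. At evaluation time every branch is active and no rescaling is applied.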

Other features in-development

Though we constantly make incremental improvements to the documentation, CI infrastructure and overall code quality, below I highlight some of the “user-facing” roadmap items that are in development:

Francisco Massa is developing a prototype that uses FX to easily extract intermediate features from models. This is particularly useful for Object Detection, Segmentation and other vision tasks.
Philip Meier is investigating ways to revamp the Dataset API by supporting DataPipes.
Nicholas Hug started the preliminary work to add support for GPU JPEG Decoding on NVIDIA A100 devices.
Aditya Oke wrote a utility which allows plotting the results of Keypoint models on the original images.

That’s it! I hope you found it interesting. Any ideas on how to adapt the format or what topics to cover are very welcome. Hit me up on LinkedIn or Twitter.