The last couple of weeks were super busy in “PyTorch Land” as we are frantically preparing the release of PyTorch v1.10 and TorchVision v0.11. In this 2nd instalment of the series, I’ll cover some of the upcoming features that are currently included in the release branch of TorchVision.
Disclaimer: Though the upcoming release is packed with numerous enhancements and bug/test/documentation improvements, here I’m highlighting new “user-facing” features in domains I’m personally interested in. After writing the blog post, I also noticed a bias towards features I reviewed, wrote or whose development I followed closely. Covering (or not covering) a feature says nothing about its importance. Opinions expressed are solely my own.
The new release is packed with new models:
Kai Zhang has added an implementation of the RegNet architecture along with pre-trained weights for 14 variants, which closely reproduce the results of the original paper.
I’ve recently added an implementation of the EfficientNet architecture along with pre-trained weights for variants B0-B7 provided by Luke Melas-Kyriazi and Ross Wightman.
New Data Augmentations
A few new Data Augmentation techniques have been added to the latest version:
Samuel Gabriel has contributed TrivialAugment, a new simple but highly effective strategy that seems to provide superior results to AutoAugment.
I’ve added the RandAugment method in auto-augmentations.
I’ve provided an implementation of Mixup and CutMix transforms in references. These will be moved to transforms in the next release, once their API is finalized.
New Operators and Layers
A number of new operators and layers have been included:
I’ve updated our references to support Label Smoothing, which was recently introduced by Joel Schlosser and Thomas J. Fan in PyTorch core.
I’ve included the option to perform Learning Rate Warmup, using the latest LR schedulers developed by Ilqar Ramazanli.
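The two training options above can be sketched with plain PyTorch as follows. This is not the code from the references: the smoothing value, the toy model, the epoch counts and the choice of a cosine schedule after the warmup are all assumptions picked for illustration.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

# Label Smoothing: built into the cross-entropy loss of PyTorch core.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
logits = torch.tensor([[2.0, 0.5, -1.0]])
target = torch.tensor([0])
loss = criterion(logits, target)

# The same value computed by hand: the target keeps 0.9 of its mass and
# the remaining 0.1 is spread uniformly over all classes.
log_probs = torch.log_softmax(logits, dim=1)
manual = -(0.9 * log_probs[0, 0] + 0.1 * log_probs[0].mean())

# Learning Rate Warmup: a linear ramp chained with a main schedule.
model = nn.Linear(10, 3)  # toy model, assumption for the example
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
warmup_epochs, total_epochs = 5, 20
warmup = LinearLR(optimizer, start_factor=0.01, total_iters=warmup_epochs)
cosine = CosineAnnealingLR(optimizer, T_max=total_epochs - warmup_epochs)
scheduler = SequentialLR(
    optimizer, schedulers=[warmup, cosine], milestones=[warmup_epochs]
)

lrs = []
for epoch in range(total_epochs):
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()  # a real loop would run forward/backward passes first
    scheduler.step()
```

Recording the learning rate per epoch, as `lrs` does here, is a cheap way to confirm that the ramp and the hand-over to the main schedule behave as intended.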
Here are some other notable improvements added in the release:
Alexander Soare and Francisco Massa have developed an FX-based utility which allows extracting arbitrary intermediate features from model architectures.
Nikita Shulga has added support for CUDA 11.3 to TorchVision.
Zhongkai Zhu has fixed the dependency issues of JPEG lib (this issue has caused major headaches to many of our users).
In-progress & Next-up
There are lots of exciting new features under development that didn’t make it into this release. Here are a few:
Moto Hira, Parmeet Singh Bhatia and I have drafted an RFC that proposes a new mechanism for Model Versioning and for handling metadata associated with pre-trained weights. This will enable us to support multiple pre-trained weights for each model and to attach associated information, such as labels and preprocessing transforms, to the models.
I’m currently working on using the primitives added by the “Batteries Included” project in order to improve the accuracy of our pre-trained models. The target is to achieve best-in-class results for the most popular pre-trained models provided by TorchVision.
Philip Meier and Francisco Massa are working on an exciting prototype for TorchVision’s new Dataset and Transforms API.