5 reasons to choose PyTorch for deep learning

TensorFlow still has certain advantages, but a stronger case can be made for PyTorch every day

PyTorch is definitely the flavour of the moment, especially with the recent 1.3 and 1.4 releases bringing a host of performance improvements and more developer-friendly support for mobile platforms.

But why should you choose to use PyTorch instead of other frameworks like MXNet, Chainer, or TensorFlow? Let’s look into five reasons that add up to a strong case for PyTorch.

Before we get started, a plea to TensorFlow users who are already typing furious tweets and emails even before I begin: Yes, there are also plenty of reasons to choose TensorFlow over PyTorch, especially if you’re targeting mobile or web platforms.

This isn’t intended to be a list of reasons that “TensorFlow sucks” and “PyTorch is brilliant,” but a set of reasons that together make PyTorch the framework I turn to first. TensorFlow is great in its own ways, I admit, so please hold off on the flames.

PyTorch is Python

One of the primary reasons that people choose PyTorch is that the code they look at is fairly simple to understand; the framework is designed to work with Python rather than pushing up against it. Your models and layers are simply Python classes, and so is everything else: optimisers, data loaders, loss functions, transformations, and so on.
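
To make that concrete, here is a minimal sketch of a model written as an ordinary Python class and trained for one step. The class name `TinyClassifier`, the layer sizes, and the random data are all made up for illustration; only the `torch` and `torch.nn` calls are real library API.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """A hypothetical two-layer network, written as an ordinary Python class."""

    def __init__(self, in_features: int, hidden: int, n_classes: int):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The forward pass is plain Python calling tensor operations.
        return self.fc2(torch.relu(self.fc1(x)))


model = TinyClassifier(in_features=8, hidden=16, n_classes=3)
optimiser = torch.optim.SGD(model.parameters(), lr=0.1)  # the optimiser is just an object
loss_fn = nn.CrossEntropyLoss()                          # and so is the loss function

x = torch.randn(4, 8)            # a batch of 4 random examples (illustrative data)
y = torch.tensor([0, 1, 2, 0])   # made-up class labels

# One training step: forward pass, backward pass, parameter update.
loss = loss_fn(model(x), y)
optimiser.zero_grad()
loss.backward()
optimiser.step()
```

Nothing here is specific to this toy network; a larger model would typically follow the same pattern of `__init__` plus `forward`.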

Because PyTorch executes eagerly rather than building the static execution graph of traditional TensorFlow (yes, TensorFlow 2.0 does offer eager execution, but it’s a touch clunky at times), it’s very easy to reason about your custom PyTorch classes. You can debug with TensorBoard or with standard Python techniques, all the way from print() statements to generating flame graphs from stack trace samples.
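
As a small illustration of what eager execution means for debugging, the sketch below inspects intermediate values with `print()` and an ordinary `if` statement inside the forward pass. The module name `DebuggableBlock` and the NaN check are hypothetical choices, not something the framework prescribes.

```python
import torch
from torch import nn


class DebuggableBlock(nn.Module):
    """Hypothetical block that inspects its own activations as it runs."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        h = self.linear(x)
        # The values already exist at this point, so ordinary inspection works.
        print("hidden shape:", tuple(h.shape), "mean:", h.mean().item())
        # Data-dependent control flow is ordinary Python as well.
        if torch.isnan(h).any():
            raise ValueError("NaN in hidden activations")
        return torch.relu(h)


block = DebuggableBlock()
out = block(torch.ones(3, 4))
```

The same spot in `forward` is also where a breakpoint from a standard Python debugger could be placed.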

This all adds up to a very friendly welcome to those coming into deep learning from other data science frameworks such as Pandas or Scikit-learn.

PyTorch also has the plus of a stable API that has only had one major change from the early releases to version 1.3 (that being the change of Variables to Tensors). While this is undoubtedly due to its young age, it does mean that the vast majority of PyTorch code you’ll see in the wild is recognisable and understandable no matter what version it was written for.

PyTorch comes ready to use

While the “batteries included” philosophy is definitely not exclusive to PyTorch, it’s remarkably easy to get up and running with PyTorch. Using PyTorch Hub, you can get a pre-trained ResNet-50 model with just one line of code:

import torch
model = torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)

And PyTorch Hub is unified across domains, making it a one-stop shop for architectures for working with text and audio as well as vision.

As well as models, PyTorch comes with the long list of loss functions and optimisers you’d expect, but also easy-to-use ways of loading data and chaining built-in transformations. It’s also rather straightforward to build your own loaders or transforms: because everything is Python, it’s simply a matter of implementing a standard class interface.
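
A rough sketch of that class interface follows: a dataset implements `__len__` and `__getitem__`, and a transform is any callable. `SquaresDataset` and `Scale` are invented names with toy data; `Dataset` and `DataLoader` come from `torch.utils.data`.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class Scale:
    """A hypothetical transform: any callable object will do."""

    def __init__(self, factor: float):
        self.factor = factor

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.factor


class SquaresDataset(Dataset):
    """Toy dataset that pairs each index n with n squared."""

    def __init__(self, size: int, transform=None):
        self.size = size
        self.transform = transform

    def __len__(self) -> int:
        return self.size

    def __getitem__(self, idx: int):
        x = torch.tensor([float(idx)])
        y = torch.tensor([float(idx ** 2)])
        if self.transform is not None:
            x = self.transform(x)  # the transform is applied to the input only
        return x, y


# The built-in DataLoader handles batching for any object with this interface.
loader = DataLoader(SquaresDataset(10, transform=Scale(0.1)), batch_size=4)
batches = [(xb.shape, yb.shape) for xb, yb in loader]
```

With ten items and a batch size of four, the loader yields two full batches and a final batch of two.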

One little note of caution is that a lot of the batteries that are included with PyTorch have been very biased towards vision problems (found in the torchvision package), with some of the text and audio support being more rudimentary. I’m happy to report that in the post-1.0 era, the torchtext and torchaudio packages are being improved upon considerably.

PyTorch rules research

PyTorch is heaven for researchers, and you can see this in its use in papers at all major deep learning conferences. PyTorch was already growing fast in 2018, and in 2019 it became the framework of choice at CVPR, ICLR, and ICML, among others. The reason for this wholehearted embrace is definitely linked to our first reason above: PyTorch is Python.

Experimenting with new concepts is much easier when creating a new custom component means writing a simple, stable subclass of a standard Python class.

And the flexibility offered means that if you want to write a layer that sends parameter information to TensorBoard, ElasticSearch, or an Amazon S3 bucket... you can just do it. Want to pull in esoteric libraries and use them inline with network training or an odd new attempt at a training loop? PyTorch is not going to stand in your way.
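
One way such a layer might look is sketched below. The `sink` argument is a hypothetical stand-in for whatever destination you prefer; here it is simply a list's `append` method, whereas a real TensorBoard, ElasticSearch, or S3 client would need its own code that is not shown.

```python
import torch
from torch import nn


class LoggedLinear(nn.Module):
    """Hypothetical layer that reports weight statistics on every forward pass."""

    def __init__(self, in_features: int, out_features: int, sink):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.sink = sink  # any callable that accepts a dict

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compute the statistics without tracking gradients for them.
        with torch.no_grad():
            w = self.linear.weight
            self.sink({"weight_mean": w.mean().item(),
                       "weight_norm": w.norm().item()})
        return self.linear(x)


records = []  # stand-in for an external logging service
layer = LoggedLinear(4, 2, sink=records.append)
out = layer(torch.randn(5, 4))
```

Because the sink is just a callable, swapping the list for a different destination would not require touching the layer itself.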

One thing holding PyTorch back a little has been the lack of a clear path from research to production. Indeed, TensorFlow still rules the roost for production usage, no matter how much PyTorch has taken over research.

But with PyTorch 1.3 and the expansion of TorchScript, it has become easy to use Python annotations that use the JIT engine to compile research code into a graph representation, with resulting speedups and easy export to a C++ runtime.
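
As a minimal sketch of that workflow, the example below compiles a small module with `torch.jit.script` and round-trips it through an in-memory buffer. The `Gate` module is a made-up example, and the buffer stands in for a file on disk.

```python
import io

import torch
from torch import nn


class Gate(nn.Module):
    """Hypothetical module with a data-dependent branch."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scripting (unlike tracing) keeps this branch in the compiled graph.
        if x.sum() > 0:
            out = self.linear(x)
        else:
            out = -self.linear(x)
        return out


eager = Gate()
scripted = torch.jit.script(eager)  # compile the module to TorchScript

# Serialise to an in-memory buffer and load it back.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
restored = torch.jit.load(buffer)

x = torch.randn(2, 4)
```

Saving to a file path instead of a buffer produces an archive that the C++ runtime is designed to load, although that side is not shown here.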

And these days, integrating PyTorch with Seldon Core and Kubeflow is supported, allowing for production deployments on Kubernetes that are almost (not quite) as simple as with TensorFlow.

PyTorch makes learning deep learning easy

There are dozens of deep learning courses out there, but for my money the fast.ai course is the best — and it’s free! While the first year of the course leaned heavily on Keras, the fast.ai team — Jeremy Howard, Rachel Thomas, and Sylvain Gugger — switched to PyTorch in the second iteration of the course and haven’t looked back. Though to be fair, they are bullish on Swift for TensorFlow.

In the most recent version of the course, you’ll discover how to achieve state-of-the-art results on tasks such as classification, segmentation, and predictions in text and vision domains, along with learning all about GANs and a host of tricks and insights that even hardened experts will find illuminating.

While the fast.ai course uses fast.ai’s own library that provides further abstractions on top of PyTorch (making it even easier to get to grips with deep learning), the course also delves deep into the fundamentals, building a PyTorch-like library from scratch, which will give you a thorough understanding of how the internals of PyTorch actually work. The fast.ai team even manages to fix some bugs in mainline PyTorch along the way.

PyTorch has a great community

Finally, the PyTorch community is a wonderful thing. The main website at pytorch.org has both great documentation that is kept in good sync with the PyTorch releases and an excellent set of tutorials that cover everything from an hour blitz of PyTorch’s main features to deeper dives on how to extend the library with custom C++ operators.

While the tutorials could use a little more standardisation around things like training/validation/test splits and training loops, they are an invaluable resource, especially when a new feature is introduced.

Beyond the official documentation, the Discourse-based forum at discuss.pytorch.org is an amazing resource where you can easily find yourself talking to and being helped out by core PyTorch developers. With over fifteen hundred posts a week, it’s a friendly and active community.

And while discussion there is more focused on fast.ai’s own library, the similar forums over at forums.fast.ai are another great community (with lots of crossover) that is eager to help newcomers without gatekeeping, which sadly is a problem in many arenas of deep learning discussion.

PyTorch today and tomorrow

There you have it: five reasons to use PyTorch. As I said at the beginning, not all of these are exclusive to PyTorch versus competitors, but the combination of all of these reasons makes PyTorch my deep learning framework of choice.

There are definitely areas where PyTorch is currently deficient: mobile, sparse networks, and easy quantisation of models, just to pick three out of the hat. But given the high speed of development, PyTorch will be a much stronger performer in these areas by year’s end.

A couple of further examples just to finish us out. First, PyTorch Elastic — introduced as an experimental feature in December — extends PyTorch’s existing distributed training packages to provide for more robust training of large-scale models.

As the name suggests, it does so by running on multiple machines with elasticity, allowing nodes to drop in and out of the training job at any time without causing the entire job to come crashing to a halt.

Second, OpenAI has announced it is adopting PyTorch as its primary development framework. This is a major win for PyTorch, as it indicates that the creators of GPT-2 — a state-of-the-art language model for question answering, machine translation, reading comprehension, and summarisation — believe that PyTorch offers them a more productive environment than TensorFlow for iterating over their ideas.

Coming in the wake of Preferred Networks putting its deep learning framework Chainer into maintenance mode and moving to PyTorch, OpenAI’s decision highlights how far PyTorch has come in the past two years and strongly suggests that PyTorch will continue to improve and gain users in the years to come.

If these big players in the AI world prefer to use PyTorch, then it’s probably good for the rest of us too.