Benefits of GPUs in Data Science and How I Taught Myself About Them

By: Eugene Olkhov

As a Data Scientist who has done his fair share of model training, I’m very familiar with spending long stretches of time waiting for training to finish, especially as data size and/or model complexity grow.

When training on a local machine, these long wait times become inconvenient not just because of the wait itself, but also because you will often be forced to keep your computer running throughout the night, or even multiple nights in a row. Hopefully, your computer doesn’t randomly restart due to an update! Besides the issues of running overnight, you also face the problem of your machine slowing down during the day while you attempt to work on other things.

It was not until I started working on a computer vision problem that I finally decided I needed another method for model training. Given the prevalence of face masks, I was working on training a model to detect when a face mask was worn improperly. Using my standard method of model training, the model would have taken over 24 hours to train. That is not a feasible training time, especially when you consider the need to make tweaks to the model. Training times need to be relatively fast in order to iterate on the model properly.

So, what’s the solution? Clearly, as problems become more difficult and models become more advanced, we can’t keep running things on our local machines, right? Sort of.

Cloud Computing


The easiest solution is to switch from your local machine to a virtual one. You can create an account with Google Cloud Platform (GCP), Microsoft Azure, Amazon Web Services (AWS), DigitalOcean, etc. Once you have an account, you can set up either a free virtual machine instance or a paid one. The free instance usually doesn’t have enough power for any serious model training, but it’s good for getting familiar with the service. Keep in mind that if this is your first time setting up an account, many providers give you some free credit to use on the paid instances.

This not only solves the issue of needing to keep your local machine running, but also lets you make your virtual instance far more powerful than your local machine, which will speed up your training. You can also use a virtual GPU instance, which, as I will cover next, will speed up your training immensely. Of course, with great power comes a big price tag: the more powerful your virtual instances are (especially vGPUs), the more you’re going to pay. If the model is going to be part of your core business, then this is simply a business expense. But what if it’s a personal project, or a proof-of-concept where you don’t want to spend that much money on resources (yet)? In that case, setting up training on a local GPU might be worth your while.

Setting up a local GPU

The main caveat is that you need your own NVIDIA GPU. You could buy one, but depending on the GPU, that can get quite pricey; of course, if you envision yourself using it a lot, it would be worth it. If you’re not sure whether your local machine has an NVIDIA GPU, you can check Device Manager on Windows, or “About This Mac” > Displays on a Mac.
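If you would rather confirm this from code once the drivers are installed, a quick check from Python works too. This is just a sketch and assumes a CUDA-enabled build of PyTorch (my choice of framework for the examples in this post, not a requirement of the setup):

    import torch  # assumes a CUDA-enabled PyTorch build is installed

    # is_available() is True only if an NVIDIA GPU and a working CUDA driver are visible
    if torch.cuda.is_available():
        print("GPU found:", torch.cuda.get_device_name(0))
    else:
        print("No CUDA-capable GPU detected")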


The nice thing is that you don’t actually need to have a high-end GPU. The GPU in my laptop is on the lower end, and I was still able to see large improvements in training speed.

Ideally, you would also want to set this all up on a Linux machine. That process appears to be relatively straightforward and involves installing the relevant drivers from NVIDIA. However, I am working on a Windows machine, so my process was a bit different.

To make things even more complicated, my workflow consists of working out of a Linux-based Docker container via Docker Desktop, which adds some complexity to the setup process.

Luckily, I was able to find this helpful post which explained the steps I needed to take in order for Docker to be able to communicate with my GPU. The key steps are as follows:

[Image: the key setup steps, as listed in the Docker blog post]

While the steps initially seemed quite simple, there was one in particular that I overlooked at first: Windows Insider. As I quickly learned, this actually involved installing a developer preview version of Windows on my machine. I was a little reluctant to do so, given that it could affect the stability and compatibility of my machine, but since all the other methods I tried had failed, I decided to go for it.

At first, the update actually failed! My machine then reverted to the original version of Windows I was on. However, since my settings were still set to the dev channel of Windows, the next time I restarted my machine, it successfully installed the preview build.

The Windows version should look something like this if successful

Once this happened, I spun up a Docker instance, and ran some simple benchmarks which revealed that Docker was communicating with my GPU!

Results of the benchmark listed in the Docker blog post linked above
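For reference, the piece that actually exposes the GPU to a container is Docker’s --gpus flag (e.g. docker run --gpus all ...). If you would rather drive this from Python, here is a rough sketch using the Docker SDK for Python; the image tag is only an example, and this is my own illustration rather than the exact benchmark from the Docker post:

    import docker  # pip install docker; assumes Docker Desktop is already set up for GPU access
    from docker.types import DeviceRequest

    client = docker.from_env()

    # Roughly equivalent to: docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
    logs = client.containers.run(
        "nvidia/cuda:11.0-base",  # example CUDA base image; swap in whatever image you use
        "nvidia-smi",
        device_requests=[DeviceRequest(count=-1, capabilities=[["gpu"]])],  # count=-1 -> all GPUs
        remove=True,
    )
    print(logs.decode())

If nvidia-smi prints your GPU from inside the container, Docker and the GPU are talking to each other.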

One last step was to make sure everything worked with the Jupyter Notebook instance inside the container. For this, my Dockerfile had to contain all the proper NVIDIA drivers, so that when I built my image, everything could communicate with each other.

Benefits

Now that I was able to set this all up, does it actually improve my training speed?

Yes, definitely!

My training was about 3x faster, even with my low-end GPU. That is a meaningful improvement, especially given that the CPU I was previously training on is a fairly powerful 8th-gen Intel Core i9. With a high-end GPU, you can expect an even greater improvement of 10x or more!

So, when should you train on a GPU versus a CPU? At a very high level, GPUs are very good at doing a large number of small tasks quickly, whereas CPUs are good at doing a small number of hard tasks. There is obviously a lot more nuance to it, but the key is that GPUs are ideal for deep learning, where training a model requires a huge number of small calculations (mostly matrix operations) that can run in parallel.
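To make that concrete, here is a small sketch (again PyTorch, purely my own illustration) of the kind of operation deep learning training is built on: a large matrix multiplication, timed on the CPU and then on the GPU:

    import time
    import torch

    def time_matmul(device, size=4000):
        # Time one large matrix multiplication on the given device
        a = torch.randn(size, size, device=device)
        b = torch.randn(size, size, device=device)
        if device == "cuda":
            torch.cuda.synchronize()  # finish setup before starting the clock
        start = time.perf_counter()
        _ = a @ b  # a huge number of small multiply-adds, run in parallel on a GPU
        if device == "cuda":
            torch.cuda.synchronize()  # wait for the GPU kernel to actually finish
        return time.perf_counter() - start

    print(f"CPU: {time_matmul('cpu'):.3f} s")
    if torch.cuda.is_available():
        print(f"GPU: {time_matmul('cuda'):.3f} s")

The same idea carries over to real training: in PyTorch, for example, moving the model and each batch of data to the GPU with .to('cuda') is usually all it takes to shift that work onto the GPU.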

Disclaimer

There is a lot of variation in the workflows people use, and the process I outlined here has worked for me, as someone with no previous experience using GPUs in my work. It’s quite possible that a simpler solution will work for you, but hopefully this has provided enough general knowledge to let you either use this method or find your own.