GPUs have far more cores than CPUs, so when it comes to parallel computation over data, GPUs perform exceptionally well compared to CPUs, even though a GPU's clock speed is lower and it lacks many of the basic management features of a CPU. Running a suitable Python script on a GPU can therefore be much faster than running it on a CPU.
In this article, I will show you how to run Python code on a GPU. However, keep in mind that before a data set can be processed on a GPU, it must first be transferred to the GPU's memory, which takes extra time; for small data sets, the CPU may therefore outperform the GPU.
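The break-even point can be sketched with a toy cost model (all numbers below are illustrative assumptions, not measurements): GPU time is a fixed transfer overhead plus a small per-element cost, while CPU time is purely per-element.

```python
# Toy cost model for CPU vs GPU run time (illustrative numbers only).
# The GPU pays a fixed host-to-device transfer overhead; the CPU does not.

TRANSFER_OVERHEAD_S = 0.010   # assumed fixed transfer cost (seconds)
GPU_PER_ELEM_S = 1e-9         # assumed GPU time per element
CPU_PER_ELEM_S = 1e-7         # assumed CPU time per element

def cpu_time(n):
    return n * CPU_PER_ELEM_S

def gpu_time(n):
    return TRANSFER_OVERHEAD_S + n * GPU_PER_ELEM_S

# Break-even array size: overhead / (per-element saving)
break_even = TRANSFER_OVERHEAD_S / (CPU_PER_ELEM_S - GPU_PER_ELEM_S)
print(f"CPU wins below ~{break_even:,.0f} elements")

for n in (10_000, 10_000_000):
    winner = "CPU" if cpu_time(n) < gpu_time(n) else "GPU"
    print(f"n={n:>10,}: {winner} faster")
```

Under these assumed numbers the CPU wins for small arrays and the GPU wins for large ones; the real crossover depends on your hardware.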
Getting started:
For now, only NVIDIA GPUs are supported. If your graphics card is CUDA-capable, you can move on with the steps below.
- Installation
- CODE
1. Installation:
First, make sure that the NVIDIA drivers are up to date. Then install Anaconda, adding it to your environment (PATH) during installation.
After completing all the installations, run the following commands at the command prompt.
conda install numba & conda install cudatoolkit
If Anaconda is not added to your environment, go to the Anaconda installation directory, find the Scripts directory, and open a command prompt there.
2. CODE
We will use Numba's @jit decorator on the function we want to run on the GPU. The decorator has many parameters, but we will only work with the target parameter. target tells jit which backend to generate code for ("cpu" or "cuda"); "cuda" corresponds to the GPU. If "cpu" is passed instead, jit optimizes the code to run faster on the CPU.
from numba import jit, cuda
import numpy as np
# to measure exec time
from timeit import default_timer as timer

# normal function to run on cpu
def func(a):
    for i in range(10000000):
        a[i] += 1

# function optimized to run on gpu
# note: newer Numba versions removed target="cuda"; use numba.cuda.jit instead
@jit(target="cuda")
def func2(a):
    for i in range(10000000):
        a[i] += 1

if __name__ == "__main__":
    n = 10000000
    a = np.ones(n, dtype=np.float64)

    start = timer()
    func(a)
    print("without GPU:", timer() - start)

    start = timer()
    func2(a)
    print("with GPU:", timer() - start)
Output (CPU = i3 6006U, GPU = 920M):
without GPU: 8.985259440999926
with GPU: 1.4247172560001218
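The speedup is simply the ratio of the two timings reported above:

```python
# Speedup ratio from the reported timings
cpu_s = 8.985259440999926   # time without GPU
gpu_s = 1.4247172560001218  # time with GPU

speedup = cpu_s / gpu_s
print(f"GPU speedup: {speedup:.1f}x")  # roughly a 6.3x improvement
```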
However, it is important to note that the array is first copied from RAM to GPU memory for processing, and if the function returns anything, the returned values are copied back from the GPU to the CPU. For small data sets the CPU is therefore relatively fast, but even there you can improve speed by passing "cpu" as the target. When a function compiled with jit calls another function, that function must also be optimized with jit; otherwise, jit can produce even slower code.
Method 02
Graphics processing units (GPUs) have more cores than central processing units (CPUs); therefore, when it comes to parallel data computing, GPUs perform exceptionally well compared to CPUs, even though they have slower clock speeds and lack many basic management features compared to CPUs.
- What is the clock speed?
- Setting up the CUDA in your Windows machine
- Install Numba
1. What is the clock speed?
It is the rate at which a processor completes processing cycles, usually measured in MHz or GHz. One MHz equals one million cycles per second, while one GHz equals one billion cycles per second.
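As a quick worked example (using a hypothetical 2.5 GHz processor), the cycle count and time per cycle follow directly from the definition:

```python
# Clock-speed unit conversions: 1 MHz = 1e6 cycles/s, 1 GHz = 1e9 cycles/s
MHZ = 1_000_000
GHZ = 1_000_000_000

clock_hz = 2.5 * GHZ            # a hypothetical 2.5 GHz processor
cycles_per_second = clock_hz
seconds_per_cycle = 1 / clock_hz

print(f"{cycles_per_second:,.0f} cycles per second")
print(f"{seconds_per_cycle * 1e9:.2f} ns per cycle")  # 0.40 ns per cycle
```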
2. Setting up the CUDA in your Windows machine
Do you have CUDA? Do you have a CUDA-capable GPU? Run the following command in your CMD to check.
nvidia-smi
If that succeeds, you have a CUDA-capable GPU. Next, check your CUDA version in your CMD with the following command.
nvcc --version
You must have the CUDA toolkit installed. Once it is installed correctly and your PATH is configured correctly, you will get similar results when you run the command above.
3. Install Numba
Numba is an open-source, NumPy-Aware Python Optimizing Compiler sponsored by Anaconda, Inc. It uses the LLVM compiler project to generate machine code from Python syntax. You can install Numba in CMD using this command.
pip install numba
You can also use a conda environment to install both Numba and the CUDA toolkit. Use this command in the Anaconda prompt to install both.
conda install numba & conda install cudatoolkit
You can check the Numba version using the following commands in the Python prompt.
>>> import numba
>>> numba.__version__
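Before running the script, you can also sanity-check from plain Python that Numba is importable (a minimal check using only the standard library; numba.cuda.is_available() additionally confirms a usable GPU, but requires Numba and a CUDA driver to be present):

```python
import importlib.util

# True if the numba package is installed and importable
has_numba = importlib.util.find_spec("numba") is not None
print("numba installed:", has_numba)
```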
Now everything is set up, so let's run the Python script on the GPU.
from numba import jit, cuda
import numpy as np
from timeit import default_timer as timer

# To run on CPU
def func(a):
    for i in range(10000000):
        a[i] += 1

# To run on GPU
@jit
def func2(x):
    return x + 1

if __name__ == "__main__":
    n = 10000000
    a = np.ones(n, dtype=np.float64)

    start = timer()
    func(a)
    print("without GPU:", timer() - start)

    start = timer()
    func2(a)
    cuda.profile_stop()
    print("with GPU:", timer() - start)
When you run this, you will see the timings for both runs. You can also watch the Performance tab in Task Manager while the code runs: GPU usage will suddenly peak from 0 and then return to 0, indicating that the GPU did the work.
Final Thought
Several libraries provide high-speed GPU variants of popular Python libraries such as NumPy, Pandas, and Scikit-Learn. So Python can run code on the GPU easily. NVIDIA's CUDA Python provides a driver and runtime API so that existing toolkits and libraries can take advantage of accelerated GPU-based processing.
Python is the most prominent programming language for science, engineering, data analytics, and deep learning applications.
Using the CUDA toolkit, you can speed up your C or C++ applications by updating the computationally intensive parts of your code to run on GPUs. To speed up your applications, you can call functions from drop-in libraries, or create custom applications using languages including C, C++, Fortran, and Python.