dstat is an awesome little tool that reports resource statistics for your Linux box. It is easy to use, and its modular architecture lets you develop additional plugins. Recently I was profiling a Deep Learning pipeline developed with Keras and TensorFlow and I needed detailed statistics about CPU, hard disk and GPU usage. dstat provides the first two out of the box, but as far as I know there is no plugin for monitoring the GPU usage of NVIDIA graphics cards.
Thankfully it is super easy to write a Python plugin for dstat. I have already sent a pull request to the official repo, but since new versions are released relatively rarely, here are some instructions on how to set up the dstat NVIDIA GPU usage plugin on your box.
The following commands are tested on Ubuntu 16.04 and they will help you install dstat, the Python NVIDIA Management Library and my dstat nvidia plugin:
sudo apt-get install dstat  # install dstat
sudo pip install nvidia-ml-py  # install Python NVIDIA Management Library
wget https://raw.githubusercontent.com/datumbox/dstat/master/plugins/dstat_nvidia_gpu.py
sudo mv dstat_nvidia_gpu.py /usr/share/dstat/  # move file to the plugins directory of dstat
To get all the default statistics along with GPU usage (percentage) type the following command:
dstat -a --nvidia-gpu
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- gpu-u
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw |total
  2   1  96   0   0   0|5816k   15M|   0     0 |   0     0 |  45k   98k|   68
  0   1  98   0   0   0|  57M  128k| 104B  902B|   0     0 |  42k   85k|   50
  8   7  84   1   0   0| 152M     0 | 292B  448B|   0     0 |  52k   93k|   39
  1   1  97   1   0   0| 111M     0 |  52B  374B|   0     0 |  51k  116k|   62
  0   1  98   1   0   0| 129M     0 |  80B  416B|   0     0 |  43k   85k|   92
  0   2  98   0   0   0|   0     0 |  52B  374B|   0     0 |  41k   83k|   81
To get all the usage statistics for each GPU use the following command:
dstat --nvidia-gpu -f
-------------------------------------------gpu-usage-nvidia------------------------------------------
total  gpu0  gpu1  gpu2  gpu3  gpu4  gpu5  gpu6  gpu7  gpu8  gpu9 gpu10 gpu11 gpu12 gpu13 gpu14 gpu15
   19    23    22    21    21    20    22    23    25    15    18    16    16    16    18    16    14
   18    21    20    18    22    21    21    22    21    15    15    14    14    14    15    16    13
   10    14     9    13     8     9    11     9    12     9     9    10    10     8     7     9     9
   18    20    22    19    21    20    21    21    22    14    15    14    15    14    15    15    15
   20    24    22    23    24    25    22    22    22    16    16    16    16    16    16    18    16
   15    21    18    19    18    17    17    16    18    14    13    13    14    13    12    11    11
   20    24    22    22    24    25    23    24    22    16    18    16    14    17    17    17    15
   19    29    18    23    21    22    21    20    21    18    16    16    18    14    14    17    17
The plugin fetches the number of available GPUs on the system and samples the usage metric 10 times for each GPU. Sampling multiple times should give smoother metrics than a single measurement. It then averages the usage across all GPUs and returns the results to the user. The source code of the plugin is available here.
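The sampling-and-averaging behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the plugin's actual source: the names `sample_gpu_usage` and `nvml_reader` are hypothetical, and the reading function is passed in as an argument so the averaging logic can be tried without a GPU. The NVML calls used in `nvml_reader` come from the pynvml module that the nvidia-ml-py package provides.

```python
from statistics import mean
from typing import Callable, List, Tuple


def sample_gpu_usage(
    gpu_count: int,
    read_usage: Callable[[int], float],
    samples: int = 10,
) -> List[float]:
    """Return [total, gpu0, gpu1, ...].

    Each per-GPU value is the mean of `samples` readings, and `total`
    is the mean of the per-GPU values.
    """
    if gpu_count < 1 or samples < 1:
        raise ValueError("gpu_count and samples must both be at least 1")
    per_gpu = [
        mean(read_usage(index) for _ in range(samples))
        for index in range(gpu_count)
    ]
    return [mean(per_gpu)] + per_gpu


def nvml_reader() -> Tuple[int, Callable[[int], float]]:
    """Build (gpu_count, read_usage) on top of NVML.

    Hypothetical helper: it needs an NVIDIA driver and the pynvml module,
    so the import happens only when this function is called.
    """
    import pynvml

    pynvml.nvmlInit()
    handles = [
        pynvml.nvmlDeviceGetHandleByIndex(i)
        for i in range(pynvml.nvmlDeviceGetCount())
    ]

    def read_usage(index: int) -> float:
        # .gpu is the GPU utilization percentage reported by NVML
        return pynvml.nvmlDeviceGetUtilizationRates(handles[index]).gpu

    return len(handles), read_usage
```

On a machine with an NVIDIA driver, `sample_gpu_usage(*nvml_reader())` would return the total followed by one value per GPU, which corresponds to the columns shown by `dstat --nvidia-gpu -f`.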
Hope you enjoy it, happy GPU programming!
2013-2023 © Datumbox. All Rights Reserved.
Module dstat_nvidia_gpu failed to load. (The “pynvml” library is missing from this system.)
Just install the pynvml lib with pip.
Works perfectly, thanks.
I get a bunch of errors and finally “pynvml.NVMLError_NotSupported: Not Supported”
So, as I imagine, my GeForce GTX 460 is not supported, right?
Thanks
Most likely yes. That’s an error from the pynvml lib.
Are these expected to run on Ubuntu 18.04 as well? I get the “pynvml” library is missing error, but pynvml is installed. Am I missing something?
# dstat -a --nvidia-gpu
Module dstat_nvidia_gpu failed to load. (The “pynvml” library is missing from this system.)
--total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw
  0   0 100   0   0|2375k  368k|   0     0 |   0     0 |2298  4100
$ pip list | grep -i pynvml
pynvml 8.0.3
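One possible cause, which nothing in this thread confirms, is that pip installed pynvml for a different Python interpreter (or a different user) than the one dstat runs under: the dstat command above was run as root, while `pip list` was run as a regular user. The small check below reports whether a module is importable from the interpreter that executes it. The helper name `module_report` is made up for this illustration.

```python
import importlib.util
import sys


def module_report(name: str) -> str:
    """Describe where the running interpreter would import `name` from."""
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else "not importable"
    return f"{sys.executable}: {name} -> {location}"


# Shows the interpreter path and the file pynvml would be loaded from, if any
print(module_report("pynvml"))
```

Running this script with the interpreter named on the first line of the dstat script, as the same user that runs dstat, should show whether that particular Python can see pynvml.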