Showing posts with label cuda. Show all posts


Using the NVIDIA Python plugin for Ganglia monitoring under Bright Cluster Manager

The GitHub repo for Ganglia gmond Python plugins contains a plugin for monitoring NVIDIA GPUs. The plugin presumes that the NVIDIA Deployment Kit, which contains NVML (the NVIDIA Management Library), is installed via the normal means into the usual places. If you are using Bright Cluster Manager, you would have used Bright's cuda60/tdk to do the installation, which means the library is not in one of the standard library directories, so gmond cannot find it. To fix this, modify the /etc/init.d/gmond init script. Near the top, set LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=/cm/local/apps/cuda/libs/current/lib64
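Before restarting gmond, it can help to confirm that NVML really loads from that directory. The following is a minimal sketch, not part of the plugin: it assumes the library file is named libnvidia-ml.so.1 (the usual NVML soname on Linux), and the helper names prepend_path and try_load_nvml are made up for illustration. It also shows how to prepend the directory to an existing LD_LIBRARY_PATH value instead of overwriting it, which is a variation on the export line above.

```python
import ctypes
import os

# Directory taken from the export line above.
NVML_DIR = "/cm/local/apps/cuda/libs/current/lib64"
# Assumption: the usual NVML soname on Linux.
NVML_SONAME = "libnvidia-ml.so.1"


def prepend_path(current, new_dir):
    """Return an LD_LIBRARY_PATH value with new_dir first and no duplicates."""
    parts = [p for p in current.split(":") if p and p != new_dir]
    return ":".join([new_dir] + parts)


def try_load_nvml(directory=NVML_DIR, soname=NVML_SONAME):
    """Try to load NVML by absolute path; return the handle, or None on failure."""
    try:
        return ctypes.CDLL(os.path.join(directory, soname))
    except OSError:
        return None


if __name__ == "__main__":
    # Value you could put in the init script instead of a plain overwrite.
    print(prepend_path(os.environ.get("LD_LIBRARY_PATH", ""), NVML_DIR))
    print("NVML loadable:", try_load_nvml() is not None)
```

If the second line prints False, the path or the file name is probably different on your system, and the gmond plugin will most likely fail in the same way.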
The modifications to Ganglia Web, however, are out of date. I will make another post once I figure out how to modify Ganglia Web to display the NVIDIA metrics.

UPDATE: Well, turns out there seems to be no need to modify the Ganglia Web installation. Under the host view, there is a tab for "gpu metrics" which shows 22 available metrics.
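To check from the command line that gmond is actually reporting the GPU metrics, you can read its XML dump directly. This is a rough sketch under a few assumptions: gmond is accepting TCP connections on its default port 8649, and the plugin's metric names start with "gpu" (adjust the prefix if yours differ). The function names read_gmond_xml and gpu_metrics are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML that gmond sends to a TCP client before closing the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    # Return bytes so the XML parser honors the encoding declared in the dump.
    return b"".join(chunks)


def gpu_metrics(xml_data, prefix="gpu"):
    """Map each host name to the sorted metric names that start with prefix."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        names = sorted(
            m.get("NAME")
            for m in host.iter("METRIC")
            if m.get("NAME", "").startswith(prefix)
        )
        result[host.get("NAME")] = names
    return result
```

Calling gpu_metrics(read_gmond_xml()) on a node running gmond should list the same metrics that appear under the "gpu metrics" tab.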


NVIDIA Nsight Eclipse Edition

One of the new products announced along with CUDA 5 at the recent GPU Technology Conference was NVIDIA Nsight Eclipse Edition, which runs on Linux and Mac OS X. Previously, the only IDE available was Nsight Visual Studio Edition, which ran only on Windows.

I attended the demo talk for Nsight Eclipse, and it seemed a well-thought-out product. It gives access to all running threads on all cores, optimization suggestions, a debugging interface, and so on, plus the usual Eclipse features like refactoring, build, and version control.

Nsight Eclipse Edition is distributed as a pre-built binary, i.e. you can't just point Eclipse to a new software source. You also have to be in the registered developer program to get access to the download.

Once you install the CUDA Toolkit, say in CUDAHOME=/usr/local/cuda, the nsight executable is in ${CUDAHOME}/libnsight.