The only way to do serious CUDA debugging is with NVIDIA's Parallel Nsight, but it has a drawback: it requires two GPUs. You can either connect remotely to a GPU-equipped machine or work locally by installing two NVIDIA GPUs in the same host. I chose the latter and bought a GTX 590, a dual-GPU video card that NVIDIA itself considered "not profitable" and consequently cut back its deliveries. Two GPUs are needed because one of them actually runs the kernels, while the other one "inspects" the memory banks of the former. Because of this, you must explicitly disable the SLI functionality, which makes the two GPUs behave as a single one:
|Disable SLI by clicking "Disable multi-GPU mode"|
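As a quick sanity check, the sketch below counts how many GPUs the driver reports. It is only an illustration and rests on assumptions that are not part of the original setup: that the `nvidia-smi` tool shipped with the driver is on your PATH, and that `nvidia-smi -L` prints one line per device starting with `GPU <index>:`. The function names `count_gpus` and `list_gpus` are made up for this example.

```python
import shutil
import subprocess


def count_gpus(listing: str) -> int:
    """Count the devices in the text printed by `nvidia-smi -L`.

    Assumes one line per device, each starting with "GPU ".
    """
    return sum(
        1 for line in listing.splitlines() if line.strip().startswith("GPU ")
    )


def list_gpus() -> str:
    """Return the output of `nvidia-smi -L`, or "" if the tool is not on PATH."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return ""
    result = subprocess.run(
        [exe, "-L"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

Calling `count_gpus(list_gpus())` should give at least 2 on a machine that is ready for local debugging. A result of 0 may simply mean that `nvidia-smi` was not found.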
Parallel Nsight has another, implicit drawback: it is a Windows-only application. In particular, you are going to need the full version of Microsoft Visual Studio (VS Express is not supported). Windows has a feature named timeout detection and recovery (TDR) that automatically detects when the video card driver appears to have stopped responding and resets it. Because debugging halts the execution of the kernels, which TDR would mistake for a hung driver, TDR must be disabled:
|Disable TDR by setting WDDM TDR enabled to "false"|
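If you want to double-check the setting outside the Nsight Monitor, here is a minimal sketch that reads the TDR level from the registry. It assumes the option above is reflected in the `TdrLevel` value under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, which is where Microsoft documents the TDR settings. Whether Nsight writes exactly this value is my assumption. The helper names `tdr_is_disabled` and `read_tdr_level` are made up for this example.

```python
try:
    import winreg  # standard library, available only on Windows
except ImportError:
    winreg = None

# Registry location of the TDR settings as documented by Microsoft.
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUE = "TdrLevel"


def tdr_is_disabled(level):
    """TdrLevel 0 means detection is off; a missing value means the default (on)."""
    return level == 0


def read_tdr_level():
    """Return TdrLevel as an int, or None if it is unset or this is not Windows."""
    if winreg is None:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, TDR_VALUE)
            return int(value)
    except OSError:
        # Key or value not present: Windows falls back to its default.
        return None
```

`tdr_is_disabled(read_tdr_level())` returns `True` only when the value is explicitly 0. The script only reads the registry and never changes it.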
Some time ago Microsoft created a framework for building nice GUIs in the Windows environment: the Windows Presentation Foundation (WPF). Naturally, it interferes with Nsight, so you have to turn WPF hardware acceleration off. You can do this with the DisableWpfHardwareAcceleration.reg file in the "Common" directory of Parallel Nsight ("C:\Program Files\NVIDIA Parallel Nsight 1.51\Common" on 32-bit architectures, and "C:\Program Files (x86)\NVIDIA Parallel Nsight 1.51\Common" on 64-bit architectures). Double-click the file to install it.
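If you prefer the command line to double-clicking, the following sketch builds the path to the .reg file for both architectures and the matching `reg import` command. It only assembles strings and does not touch the registry. The function names are made up for this example, and the default install locations are the ones quoted above, so adjust them if you installed Nsight elsewhere.

```python
from pathlib import PureWindowsPath

REG_FILE = "DisableWpfHardwareAcceleration.reg"


def nsight_common_dir(is_64bit_windows: bool, version: str = "1.51") -> PureWindowsPath:
    """Return the default "Common" directory of Parallel Nsight."""
    program_files = (
        r"C:\Program Files (x86)" if is_64bit_windows else r"C:\Program Files"
    )
    return (
        PureWindowsPath(program_files)
        / f"NVIDIA Parallel Nsight {version}"
        / "Common"
    )


def reg_import_command(is_64bit_windows: bool) -> list:
    """Build the `reg import` command that applies the .reg file."""
    reg_path = nsight_common_dir(is_64bit_windows) / REG_FILE
    return ["reg", "import", str(reg_path)]
```

For instance, `" ".join(reg_import_command(True))` gives the command line for a 64-bit system. You would most likely have to run it from an elevated prompt, and quote the path because it contains spaces.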
WPF is not the only thing that gets in Nsight's way: the Aero interface prevents debugging as well. You can disable Aero by selecting a non-Aero theme.
Are we done? Not yet! Remember to start the Parallel Nsight Monitor before debugging your kernels. Moreover, if you compile a kernel in Visual Studio and try to debug it, you may get a message that basically says that Nsight can't find your executable. You have to tell it explicitly where your compiled files are:
|Fill the "Working directory" field with the absolute path to your compiled file.|
Now you can enjoy the CUDA debugging features of Parallel Nsight! Yikes!