keywords: Graphics, PIX, Profiling, GPU Capture, Wave Distribution

GPU Frame Capture

When you take a GPU frame capture from the PIX GUI on the local machine, the GUI sometimes crashes while parsing the data, even with TDR detection turned off. PIX recommends analyzing remotely.
To work around this, take the capture through the programmatic API instead.

Steps:
Replace the call to PIXBeginCapture2() with PIXGpuCaptureNextFrames(), because the output file (*.wpix) written by PIXBeginCapture2() can’t be opened (PIX crashes while parsing the data):

PIXGpuCaptureNextFrames(TEXT("E:/pixtest.wpix"), 1);

Then build the game in the Development configuration and start it with the argument:

MyGame.exe -attachPIX

Then run the command pix.GpuCaptureFrame in the console at runtime. The game stalls for a while, and then the capture file is written to the directory E:/.
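The hard-coded path above is overwritten by every new capture. As a minimal sketch, assuming you are free to change the call site, a small helper can number the output files and check the HRESULT returned by PIXGpuCaptureNextFrames(). MakeCapturePath and CaptureNextFrameTo are hypothetical names, not part of PIX or Unreal. The PIX part only compiles on Windows with pix3.h on the include path and USE_PIX defined.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of PIX or Unreal): builds a numbered capture
// path such as "E:/pixtest_001.wpix" so repeated captures don't overwrite
// each other. The index is zero-padded to at least three digits.
std::wstring MakeCapturePath(const std::wstring& directory,
                             const std::wstring& baseName,
                             std::uint32_t index)
{
    std::wstring digits = std::to_wstring(index);
    if (digits.size() < 3)
        digits.insert(0, 3 - digits.size(), L'0');

    std::wstring path = directory;
    // Add a separator only if the directory doesn't already end with one.
    if (!path.empty() && path.back() != L'/' && path.back() != L'\\')
        path += L'/';

    return path + baseName + L"_" + digits + L".wpix";
}

#if defined(_WIN32) && defined(USE_PIX)
#include <windows.h>
#include <pix3.h>

// Hypothetical wrapper: requests a one-frame GPU capture and reports failure
// instead of ignoring the HRESULT.
inline bool CaptureNextFrameTo(const std::wstring& directory, std::uint32_t index)
{
    // Kept in a static so the buffer stays valid after this function returns.
    // This is a precaution; whether PIX copies the string is an assumption
    // this sketch does not rely on.
    static std::wstring path;
    path = MakeCapturePath(directory, L"pixtest", index);
    return SUCCEEDED(PIXGpuCaptureNextFrames(path.c_str(), 1));
}
#endif
```

With this in place, CaptureNextFrameTo(L"E:/", 1) would request E:/pixtest_001.wpix, and a false return value tells you the request was rejected before you go looking for the file.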

Open it and start analyzing. Overview of the result:

Dispatching diagram between Async Compute Queue and Graphics Queue:

Wave Intrinsics (Wavefront):

There’s a switch named r.D3D12.AutoAttachPIX in UE5/Engine/Config/ConsoleVariables.ini. It only applies to profiling through the programmatic API, not to captures taken from the PIX GUI.

GPU Timing Capture
//
//  timing capture
//
PIXCaptureParameters captureParams = {};

// What to record: GPU timing, callstacks, and CPU samples (4000 per second)
captureParams.TimingCaptureParameters.CaptureGpuTiming = TRUE;
captureParams.TimingCaptureParameters.CaptureCallstacks = TRUE;
captureParams.TimingCaptureParameters.CaptureCpuSamples = TRUE;
captureParams.TimingCaptureParameters.CpuSamplesPerSecond = 4000;

// Where the data goes: in-memory storage, output file name, 4096 MB tooling memory limit
captureParams.TimingCaptureParameters.CaptureStorage = PIXCaptureParameters::Memory;
captureParams.TimingCaptureParameters.FileName = wstrFilename;
captureParams.TimingCaptureParameters.MaximumToolingMemorySizeMb = 4096;

// wstrFilename and XSF_ERROR_IF_FAILED are defined elsewhere in the original sample
XSF_ERROR_IF_FAILED(PIXBeginCapture(PIX_CAPTURE_TIMING, &captureParams));

Source: Timing Capture
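The snippet above only starts the capture; it still has to be ended with PIXEndCapture(). A minimal sketch of one way to keep the two calls paired is a scope guard that ends the capture only if the begin call succeeded. ScopedCapture and ProfileWithTimingCapture are hypothetical names, not part of PIX. The PIX part assumes Windows, pix3.h on the include path, and USE_PIX defined.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical RAII helper (not part of PIX): runs `begin` on construction
// and `end` on destruction, but only if `begin` reported success.
class ScopedCapture
{
public:
    ScopedCapture(const std::function<bool()>& begin, std::function<void()> end)
        : m_end(std::move(end)), m_active(begin())
    {
    }

    ~ScopedCapture()
    {
        if (m_active)
            m_end();
    }

    ScopedCapture(const ScopedCapture&) = delete;
    ScopedCapture& operator=(const ScopedCapture&) = delete;

    bool IsActive() const { return m_active; }

private:
    std::function<void()> m_end;
    bool m_active;
};

#if defined(_WIN32) && defined(USE_PIX)
#include <windows.h>
#include <pix3.h>

// Hypothetical wrapper: records a timing capture around `work`.
// `fileName` must stay valid until the capture has ended.
inline void ProfileWithTimingCapture(const wchar_t* fileName,
                                     const std::function<void()>& work)
{
    PIXCaptureParameters params = {};
    params.TimingCaptureParameters.CaptureGpuTiming = TRUE;
    params.TimingCaptureParameters.FileName = fileName;

    ScopedCapture capture(
        [&] { return SUCCEEDED(PIXBeginCapture(PIX_CAPTURE_TIMING, &params)); },
        [] { PIXEndCapture(FALSE); });  // FALSE: keep the capture, don't discard it

    work();
}
#endif
```

The guard takes plain callables so the pairing logic can be exercised without PIX installed; only the wrapper at the bottom touches the PIX API.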

How to capture GPU Frame for D3D11

By default, PIX can’t take GPU frame captures of DirectX 11 applications. How can this be worked around?
Solution:
Check the Force D3D11On12 option in PIX.

References

Hardware Counters in GPU Captures
https://devblogs.microsoft.com/pix/hardware-counters-in-gpu-captures/

Occupancy explained (AMD RGP Limiters Graph)
https://gpuopen.com/learn/occupancy-explained/

Optimizing GPU occupancy and resource usage with large thread groups
https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/

