Version: v2602

intelligence.acuirt.observe.profile

profile_function

def profile_function(func, enable_trace=False, *args, **kwargs)

Profile function execution with CUDA NVTX tracing.

Wraps the target function with NVTX range markers for detailed performance profiling. When enable_trace is True, a new NVTX range is pushed for each function call and popped on return, so the recorded ranges nest and reflect the call hierarchy.

Arguments:

func Callable - Function to profile
enable_trace bool - Whether to enable NVTX tracing. Defaults to False.
*args - Variable length positional arguments for the target function
**kwargs - Arbitrary keyword arguments for the target function

Returns:

Any - The result of the target function execution

Notes:

Requires torch.cuda.nvtx for tracing. Uses sys.settrace to monitor function calls.
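
The push-on-call, pop-on-return mechanism described above can be illustrated with a minimal sketch. This is not the library's implementation: `RangeRecorder` and `traced_call` are hypothetical names, and the recorder is a stand-in for `torch.cuda.nvtx.range_push` / `range_pop` so that the example runs without a GPU. Only the use of sys.settrace is taken from the notes above.

```python
import sys


class RangeRecorder:
    """Stand-in for torch.cuda.nvtx (assumption): logs ranges instead of emitting them."""

    def __init__(self):
        self.events = []   # chronological list of ("push" | "pop", name)
        self._stack = []   # names of currently open ranges

    def range_push(self, name):
        self._stack.append(name)
        self.events.append(("push", name))

    def range_pop(self):
        self.events.append(("pop", self._stack.pop()))


def traced_call(func, recorder, *args, **kwargs):
    """Run func and open one range per Python function call made while it runs."""

    def local_tracer(frame, event, arg):
        # A "return" event fires when the frame exits, normally or via an exception.
        if event == "return":
            recorder.range_pop()
        return local_tracer

    def global_tracer(frame, event, arg):
        # A "call" event fires whenever a new Python frame is entered.
        if event == "call":
            recorder.range_push(frame.f_code.co_name)
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        return func(*args, **kwargs)
    finally:
        # Always restore the earlier tracer, even if func raises.
        sys.settrace(previous)


def inner(x):
    return x * 2


def outer(x):
    return inner(x) + 1


recorder = RangeRecorder()
result = traced_call(outer, recorder, 3)
for kind, name in recorder.events:
    print(kind, name)
```

The ranges come out properly nested (inner opens and closes inside outer), which is what gives a profiler timeline its hierarchical view. Tracing every call through sys.settrace adds noticeable overhead, which is presumably why enable_trace is off by default.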

profile_torch_module

def profile_torch_module(model: nn.Module,
                         data_loader,
                         data_loader_post_process: Optional[Any],
                         max_depth: int = -1,
                         settings: Optional[Dict[str, Any]] = None)

Profile a PyTorch nn.Module using torch.profiler.

Arguments:

model nn.Module - The PyTorch model to profile.
data_loader Iterable - Data loader providing input data for the model.
data_loader_post_process Optional[Callable] - Function to post-process each batch before inference.
max_depth int - Maximum depth for profiling nn.Module inference. Defaults to -1 (no limit).
settings Optional[Dict[str, Any]] - Settings for torch.profiler. If None, default settings are used.

Returns:

Tuple[List[Any], torch.profiler.profile] - A tuple containing the list of model outputs and the profiling information.
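
To make the return value concrete, the following sketch reproduces the general shape of such a helper using torch.profiler directly. It is an approximation under stated assumptions, not the library's code: `profile_module_sketch` is a hypothetical name, the model is assumed to take the post-processed batch as its single positional argument, the default settings (CPU activity only) are chosen for illustration, and the max_depth behavior is omitted.

```python
import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile


def profile_module_sketch(model, data_loader, post_process=None, settings=None):
    """Hypothetical helper: run model over data_loader inside torch.profiler."""
    # Assumed default: profile CPU activity only, so the sketch runs without a GPU.
    settings = settings or {"activities": [ProfilerActivity.CPU]}
    outputs = []
    model.eval()
    with torch.no_grad(), profile(**settings) as prof:
        for batch in data_loader:
            if post_process is not None:
                # Post-process each batch before inference.
                batch = post_process(batch)
            # Assumption: the model accepts the batch as one positional argument.
            outputs.append(model(batch))
    return outputs, prof


model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
batches = [torch.randn(3, 8) for _ in range(4)]  # any iterable of inputs works here

outputs, prof = profile_module_sketch(model, batches, post_process=lambda b: b.float())

# The second element is a regular torch.profiler.profile object.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

Because the profiling information is a standard torch.profiler.profile object, the usual inspection methods such as key_averages() should apply to the value returned by profile_torch_module as well.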
