intelligence.acuirt.observe.profile
profile_function
def profile_function(func, enable_trace=False, *args, **kwargs)
Profile function execution with CUDA NVTX tracing.
This function enables detailed performance profiling by wrapping the target function with NVTX range markers. When enable_trace is True, it pushes a new NVTX range for each function call and pops it on return, providing hierarchical profiling information.
Arguments:
func (Callable) - Function to profile.
enable_trace (bool) - Whether to enable NVTX tracing.
*args - Variable-length positional arguments for the target function.
**kwargs - Arbitrary keyword arguments for the target function.
Returns:
Any - The result of the target function execution.
Notes:
Requires torch.cuda.nvtx for tracing. Uses sys.settrace to monitor function calls.
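The push-on-call, pop-on-return mechanism described above can be illustrated with a minimal sketch. This is not the library's implementation: `traced_call` and its `push`/`pop` parameters are hypothetical names introduced here so the example runs without a GPU. With PyTorch available, `torch.cuda.nvtx.range_push` and `torch.cuda.nvtx.range_pop` could be passed in their place.

```python
import sys
from typing import Any, Callable, List


def traced_call(
    func: Callable[..., Any],
    push: Callable[[str], Any],
    pop: Callable[[], Any],
    *args: Any,
    **kwargs: Any,
) -> Any:
    """Run func, calling push(name) on each Python call and pop() on each return.

    Hypothetical helper for illustration only; not part of the documented module.
    """

    def tracer(frame, event, arg):
        if event == "call":
            # A new Python frame was entered: open a range named after the function.
            push(frame.f_code.co_name)
        elif event == "return":
            # The frame is exiting (normally or via an exception): close its range.
            pop()
        # Returning the tracer keeps it installed as the frame's local trace function.
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func(*args, **kwargs)
    finally:
        # Always restore whatever trace function was active before.
        sys.settrace(previous)


def inner(x: int) -> int:
    return x * 2


def outer(x: int) -> int:
    return inner(x) + 1


# Record events in a list instead of emitting real NVTX ranges.
events: List[str] = []
result = traced_call(
    outer,
    lambda name: events.append("push " + name),
    lambda: events.append("pop"),
    3,
)
# With torch installed, the real markers would be passed instead (assumption):
# traced_call(outer, torch.cuda.nvtx.range_push, torch.cuda.nvtx.range_pop, 3)
```

Because every call event is paired with a return event, the ranges nest in the same way as the call stack, which is what produces the hierarchical view in an NVTX-aware profiler.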
profile_torch_module
def profile_torch_module(model: nn.Module,
data_loader,
data_loader_post_process: Optional[Any],
max_depth: int = -1,
settings: Optional[Dict[str, Any]] = None)
Profile a PyTorch nn.Module using torch.profiler.
Arguments:
model (nn.Module) - The PyTorch model to profile.
data_loader (Iterable) - Data loader providing input data for the model.
data_loader_post_process (Optional[Callable]) - Function to post-process each batch before inference.
max_depth (int) - Maximum depth for profiling nn.Module inference. Defaults to -1 (no limit).
settings (Optional[Dict[str, Any]]) - Settings for torch.profiler. If None, default settings are used.
Returns:
Tuple[List[Any], torch.profiler.profile] - A tuple containing the list of model outputs and the torch.profiler.profile object that holds the collected profiling results.
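To make the inputs and the return shape concrete, here is a rough sketch of the same pattern written directly against torch.profiler. It is an assumption about the general approach, not the body of profile_torch_module: the helper name `profile_batches`, the CPU-only default settings, and the "model_inference" label are all hypothetical, and the sketch does not attempt to reproduce max_depth handling.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile, record_function


def profile_batches(
    model: nn.Module,
    data_loader: Iterable[Any],
    post_process: Optional[Callable[[Any], Any]] = None,
    settings: Optional[Dict[str, Any]] = None,
) -> Tuple[List[Any], profile]:
    """Run the model over every batch under torch.profiler (hypothetical helper)."""
    # Assumed default: profile CPU activity only, so the sketch runs without a GPU.
    if settings is None:
        settings = {"activities": [ProfilerActivity.CPU]}

    outputs: List[Any] = []
    model.eval()
    with torch.no_grad(), profile(**settings) as prof:
        for batch in data_loader:
            # Optionally convert the raw batch into the model's input format.
            if post_process is not None:
                batch = post_process(batch)
            # Label the forward pass so it is easy to find in the profiler output.
            with record_function("model_inference"):
                outputs.append(model(batch))
    return outputs, prof


# Tiny example: a linear layer and two random batches stand in for real data.
model = nn.Linear(4, 2)
batches = [torch.randn(3, 4) for _ in range(2)]
outputs, prof = profile_batches(model, batches)

# Summarize the collected events, sorted by total CPU time.
summary = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
```

The returned pair mirrors the documented return value: the per-batch outputs come first, followed by the profiler object, which can then be summarized or exported.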