Version: v2511

intelligence.acuirt.observe.profile

profile_function

def profile_function(func, enable_trace=False, *args, **kwargs)

Profile function execution with CUDA NVTX tracing.

This function enables detailed performance profiling by wrapping the target function with NVTX range markers. When enable_trace is True, it pushes a new NVTX range for each function call and pops it on return, providing hierarchical profiling information.

Arguments:

  • func Callable - Function to profile
  • enable_trace bool - Whether to enable NVTX tracing. Defaults to False.
  • *args - Variable length positional arguments for the target function
  • **kwargs - Arbitrary keyword arguments for the target function

Returns:

  • Any - The result of the target function execution

Notes:

Requires torch.cuda.nvtx for tracing. Uses sys.settrace to monitor function calls.
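
The push-on-call, pop-on-return behavior described above can be illustrated with a minimal sketch. This is not the library's source: profile_function_sketch, range_push, range_pop, and events are hypothetical names, and the two range helpers are stand-ins for torch.cuda.nvtx.range_push and torch.cuda.nvtx.range_pop so that the example runs without a GPU.

```python
import sys
from typing import Any, Callable, List, Tuple

# Stand-ins (assumption) for torch.cuda.nvtx.range_push / range_pop.
# They only record events so the sketch runs without CUDA.
events: List[Tuple[str, str]] = []


def range_push(name: str) -> None:
    events.append(("push", name))


def range_pop() -> None:
    events.append(("pop", ""))


def profile_function_sketch(func: Callable[..., Any], enable_trace: bool = False,
                            *args: Any, **kwargs: Any) -> Any:
    """Hypothetical re-creation of the documented behavior."""
    if not enable_trace:
        return func(*args, **kwargs)

    def tracer(frame, event, arg):
        if event == "call":
            # A new Python frame starts: open a range named after the function.
            range_push(frame.f_code.co_name)
        elif event == "return":
            # The frame is exiting (normally or via exception): close its range.
            range_pop()
        return tracer  # keep tracing inside this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func(*args, **kwargs)
    finally:
        # Always restore whatever tracer was active before.
        sys.settrace(previous)


def inner(x: int) -> int:
    return x + 1


def outer(x: int) -> int:
    return inner(x) * 2


# Positional arguments for the target follow enable_trace.
result = profile_function_sketch(outer, True, 3)
print(result, events)
```

Because ranges are pushed and popped in call order, the recorded events nest the same way the calls do, which is what yields a hierarchical view in an NVTX-aware profiler. Note that sys.settrace only sees Python-level frames, so calls into C extensions would not produce ranges in this sketch.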

profile_torch_module

def profile_torch_module(model: nn.Module,
                         data_loader,
                         data_loader_post_process: Optional[Any],
                         max_depth: int = -1,
                         settings: Optional[Dict[str, Any]] = None)

Profile a PyTorch nn.Module using torch.profiler.

Arguments:

  • model nn.Module - The PyTorch model to profile.
  • data_loader Iterable - Data loader providing input data for the model.
  • data_loader_post_process Optional[Callable] - Function to post-process each batch before inference.
  • max_depth int - Maximum depth for profiling nn.Module inference. Defaults to -1 (no limit).
  • settings Optional[Dict[str, Any]] - Settings for torch.profiler. If None, default settings are used.

Returns:

  • Tuple[List[Any], torch.profiler.profile] - A tuple containing the list of model outputs and the profiling information.
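
As a rough illustration of the flow documented above (iterate the data loader, optionally post-process each batch, run inference under torch.profiler, and return the outputs with the profiler object), here is a minimal sketch. It is not the library's implementation: profile_module_sketch is a hypothetical name, the CPU-only default settings are an assumption, and max_depth is omitted because its exact semantics are not specified here.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile


def profile_module_sketch(
    model: nn.Module,
    data_loader: Iterable[Any],
    post_process: Optional[Callable[[Any], Any]] = None,
    settings: Optional[Dict[str, Any]] = None,
) -> Tuple[List[Any], profile]:
    """Hypothetical sketch: run inference over all batches inside one profiler session."""
    # Assumed default: CPU-only profiling so the sketch runs without a GPU.
    settings = settings or {"activities": [ProfilerActivity.CPU]}
    outputs: List[Any] = []
    model.eval()
    with torch.no_grad(), profile(**settings) as prof:
        for batch in data_loader:
            if post_process is not None:
                batch = post_process(batch)  # e.g. cast or move to device
            outputs.append(model(batch))
    return outputs, prof


# Usage with a toy model and an in-memory list standing in for a data loader.
model = nn.Linear(4, 2)
batches = [torch.randn(8, 4) for _ in range(3)]
outputs, prof = profile_module_sketch(model, batches, post_process=lambda b: b.float())
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

The returned profiler object can then be inspected with the standard torch.profiler methods, such as key_averages() as shown in the usage lines.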