Version: v2509

integration.aibooster.client

AIBoosterClient Objects

class AIBoosterClient()

AIBooster API client for interacting with AIBooster server endpoints.

This client provides both low-level (raw JSON) and high-level (structured data) access to the AIBooster server API. Public methods return structured data for ease of use, while methods prefixed with an underscore (_) return the original JSON responses.

__init__

def __init__(base_url: str,
             timeout: float = 30.0,
             skip_health_check: bool = False)

Initialize the AIBooster client.

Arguments:

base_url - Base URL for the AIBooster API
timeout - Request timeout in seconds
skip_health_check - Skip initial connection verification

Raises:

ConnectionError - If health check fails (unless skip_health_check=True)
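Because the constructor runs a health check and may raise ConnectionError, callers that start before the server is ready may want to retry construction. The following is a minimal sketch, not part of the AIBooster API: connect_with_retry and its parameters are hypothetical helpers, and it assumes the documented ConnectionError is Python's built-in exception. The client is built through a caller-supplied factory so the sketch does not depend on the package's import path.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(make_client: Callable[[], T],
                       attempts: int = 3,
                       delay_s: float = 1.0) -> T:
    """Call make_client() until it stops raising ConnectionError.

    Hypothetical helper: make_client is any zero-argument callable that
    builds the client, e.g. a lambda wrapping AIBoosterClient(...).
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return make_client()
        except ConnectionError:
            # Re-raise on the final attempt so the caller sees the failure.
            if attempt == attempts:
                raise
            time.sleep(delay_s)
    raise AssertionError("unreachable")
```

A call might look like connect_with_retry(lambda: AIBoosterClient(base_url="http://aibooster.example:8080", timeout=10.0)), where the URL is a placeholder. Passing skip_health_check=True instead defers the connection check entirely.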

health_check

def health_check() -> bool

Check if the AIBooster server is healthy.

Returns:

True if the server is healthy, False otherwise
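Since health_check returns a boolean, it can be polled to wait for the server to come up. The sketch below is an illustration under assumptions: wait_until_healthy, timeout_s, and interval_s are hypothetical names, and client is any object exposing the documented health_check() method.

```python
import time


def wait_until_healthy(client,
                       timeout_s: float = 60.0,
                       interval_s: float = 2.0) -> bool:
    """Poll client.health_check() until it returns True or time runs out.

    Hypothetical helper; returns True once the server reports healthy,
    False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if client.health_check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

This pairs naturally with a client constructed using skip_health_check=True, so that construction succeeds first and readiness is checked separately.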

get_dcgm_metrics

def get_dcgm_metrics(
    metric_name: str,
    begin_time: datetime | None = None,
    end_time: datetime | None = None,
    agent_gpu_filter: dict[str, list[int]] | None = None
) -> dict[str, dict[int, list[dict[str, Any]]]]

Get all samples of the given DCGM metric for the specified period.

This method automatically handles pagination to retrieve all available metrics data within the specified time range.

Arguments:

metric_name - Name of the DCGM metric to retrieve. Allowed values:
- DCGM_FI_DEV_GPU_UTIL
- DCGM_FI_DEV_MEM_COPY_UTIL
- DCGM_FI_DEV_SM_CLOCK
- DCGM_FI_DEV_MEM_CLOCK
- DCGM_FI_DEV_FB_USED
- DCGM_FI_DEV_FB_FREE
- DCGM_FI_DEV_POWER_USAGE
- DCGM_FI_DEV_TEMPERATURE_CURRENT
- DCGM_FI_DEV_SM_OCCUPANCY
- DCGM_FI_DEV_MEMORY_TEMP
- DCGM_FI_DEV_PCIE_TX_THROUGHPUT
- DCGM_FI_DEV_PCIE_RX_THROUGHPUT
- DCGM_FI_DEV_MEMORY_UTIL
begin_time - Begin time for the query (defaults to UNIX epoch)
end_time - End time for the query (defaults to current time)
agent_gpu_filter - Dict of agent_name -> [gpu_indices] to filter specific GPUs (None = all)

Returns:

Dictionary - hostname -> gpu_index -> list of {timestamp, value} dicts

Raises:

requests.RequestException - If the request fails
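To make the nested return shape concrete, here is a small sketch that queries the last hour of data and flattens the result into rows. It rests on assumptions not confirmed by this page: the sample dicts are keyed by the literal strings "timestamp" and "value", timezone-aware datetimes are accepted, and the agent name "node-01" in the usage note is a placeholder. fetch_last_hour and iter_samples are hypothetical helpers, and client is any object exposing the documented get_dcgm_metrics method.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Iterator, Optional

# Documented return shape: hostname -> gpu_index -> list of sample dicts.
Metrics = dict[str, dict[int, list[dict[str, Any]]]]


def fetch_last_hour(client,
                    agent_gpu_filter: Optional[dict[str, list[int]]] = None,
                    metric_name: str = "DCGM_FI_DEV_GPU_UTIL") -> Metrics:
    """Query one metric over the most recent hour (hypothetical helper)."""
    end = datetime.now(timezone.utc)  # assumption: aware datetimes are accepted
    begin = end - timedelta(hours=1)
    return client.get_dcgm_metrics(
        metric_name,
        begin_time=begin,
        end_time=end,
        agent_gpu_filter=agent_gpu_filter,
    )


def iter_samples(metrics: Metrics) -> Iterator[tuple[str, int, Any, Any]]:
    """Yield (hostname, gpu_index, timestamp, value) for every sample."""
    for hostname, gpus in metrics.items():
        for gpu_index, samples in gpus.items():
            for sample in samples:
                # Assumption: keys are named "timestamp" and "value".
                yield hostname, gpu_index, sample["timestamp"], sample["value"]
```

For example, fetch_last_hour(client, {"node-01": [0, 1]}) would restrict the query to GPUs 0 and 1 on one agent, and the flattened rows can be passed to a CSV writer or a DataFrame constructor.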

get_dcgm_metrics_reduction

def get_dcgm_metrics_reduction(
        metric_name: str,
        reduction: str = "mean",
        begin_time: datetime | None = None,
        end_time: datetime | None = None,
        agent_gpu_filter: dict[str, list[int]] | None = None) -> float | None

Get statistical reduction of DCGM metrics.

This method retrieves all DCGM metrics for the specified time range and reduces them to a single statistical value.

Arguments:

metric_name - Name of the DCGM metric to retrieve. Allowed values:
- DCGM_FI_DEV_GPU_UTIL
- DCGM_FI_DEV_MEM_COPY_UTIL
- DCGM_FI_DEV_SM_CLOCK
- DCGM_FI_DEV_MEM_CLOCK
- DCGM_FI_DEV_FB_USED
- DCGM_FI_DEV_FB_FREE
- DCGM_FI_DEV_POWER_USAGE
- DCGM_FI_DEV_TEMPERATURE_CURRENT
- DCGM_FI_DEV_SM_OCCUPANCY
- DCGM_FI_DEV_MEMORY_TEMP
- DCGM_FI_DEV_PCIE_TX_THROUGHPUT
- DCGM_FI_DEV_PCIE_RX_THROUGHPUT
- DCGM_FI_DEV_MEMORY_UTIL
reduction - Statistical reduction to apply ("mean", "max", "min", "median")
begin_time - Begin time for the query (defaults to UNIX epoch)
end_time - End time for the query (defaults to current time)
agent_gpu_filter - Dict of agent_name -> [gpu_indices] to filter specific GPUs (None = all)

Returns:

Single statistical value as float, or None if no data is available

Raises:

ValueError - If reduction type is invalid
requests.RequestException - If the request fails
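The behavior described above can be illustrated with a local equivalent that operates on the structure returned by get_dcgm_metrics. This is a sketch under assumptions, not the library's implementation: it assumes every sample from every selected GPU is pooled into one list before reducing, and that sample dicts carry a "value" key. reduce_metrics is a hypothetical name.

```python
import statistics
from typing import Any, Optional

# The four reduction names listed in the documentation.
_REDUCERS = {
    "mean": statistics.fmean,
    "max": max,
    "min": min,
    "median": statistics.median,
}


def reduce_metrics(metrics: dict[str, dict[int, list[dict[str, Any]]]],
                   reduction: str = "mean") -> Optional[float]:
    """Reduce hostname -> gpu_index -> samples to one float (hypothetical)."""
    if reduction not in _REDUCERS:
        raise ValueError(f"invalid reduction: {reduction!r}")
    # Assumption: samples from all hosts and GPUs are pooled together.
    values = [
        float(sample["value"])
        for gpus in metrics.values()
        for samples in gpus.values()
        for sample in samples
    ]
    if not values:
        return None  # mirrors the documented "None if no data is available"
    return float(_REDUCERS[reduction](values))
```

In practice a single call such as client.get_dcgm_metrics_reduction("DCGM_FI_DEV_GPU_UTIL", reduction="max") replaces this, and callers should be prepared for both the None result and the ValueError.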
