integration.aibooster.client
AIBoosterClient Objects
class AIBoosterClient()
AIBooster API client for interacting with AIBooster server endpoints.
This client provides both low-level (raw JSON) and high-level (structured data) access to the AIBooster server API. Public methods return structured data for ease of use, while methods whose names begin with an underscore (_) return the original JSON responses.
__init__
def __init__(base_url: str,
             timeout: float = 30.0,
             skip_health_check: bool = False)
Initialize the AIBooster client.
Arguments:
base_url
- Base URL for the AIBooster API
timeout
- Request timeout in seconds
skip_health_check
- Skip initial connection verification
Raises:
ConnectionError
- If health check fails (unless skip_health_check=True)
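Because the constructor performs a health check by default, callers may want to handle a failed connection explicitly. The following is a minimal sketch, not part of the client itself: `connect_or_none` is a hypothetical helper, and it assumes the documented ConnectionError is the Python built-in exception of that name. The client class is passed in as a parameter so the helper can be tried without a running server.

```python
def connect_or_none(client_cls, base_url, timeout=30.0):
    """Return a client instance, or None if the initial health check fails.

    client_cls is expected to be AIBoosterClient (hypothetical helper;
    the class is injected so the sketch runs without a server).
    """
    try:
        # The constructor runs the health check unless skip_health_check=True.
        return client_cls(base_url, timeout=timeout)
    except ConnectionError:
        # Assumption: the documented ConnectionError is the built-in one.
        return None
```

A call might look like `connect_or_none(AIBoosterClient, base_url)`, where `base_url` is whatever address the AIBooster server is reachable at in your environment.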
health_check
def health_check() -> bool
Check if the AIBooster server is healthy.
Returns:
True if the server is healthy, False otherwise
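Since health_check returns a boolean rather than raising, it lends itself to simple polling, for example when the client was created with skip_health_check=True before the server had started. The sketch below is an illustration only: `wait_until_healthy`, `attempts`, and `delay` are hypothetical names, and the only client method it relies on is the documented `health_check()`.

```python
import time


def wait_until_healthy(client, attempts=5, delay=2.0):
    """Poll client.health_check() until it returns True or attempts run out.

    Hypothetical helper: returns True as soon as the server reports healthy,
    False if every attempt fails.
    """
    for attempt in range(attempts):
        if client.health_check():
            return True
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```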
get_dcgm_metrics
def get_dcgm_metrics(
    metric_name: str,
    begin_time: datetime | None = None,
    end_time: datetime | None = None,
    agent_gpu_filter: dict[str, list[int]] | None = None
) -> dict[str, dict[int, list[dict[str, Any]]]]
Get all samples of a DCGM metric for the specified period.
This method automatically handles pagination to retrieve all available metrics data within the specified time range.
Arguments:
metric_name
- Name of the DCGM metric to retrieve. Allowed values:
- DCGM_FI_DEV_GPU_UTIL
- DCGM_FI_DEV_MEM_COPY_UTIL
- DCGM_FI_DEV_SM_CLOCK
- DCGM_FI_DEV_MEM_CLOCK
- DCGM_FI_DEV_FB_USED
- DCGM_FI_DEV_FB_FREE
- DCGM_FI_DEV_POWER_USAGE
- DCGM_FI_DEV_TEMPERATURE_CURRENT
- DCGM_FI_DEV_SM_OCCUPANCY
- DCGM_FI_DEV_MEMORY_TEMP
- DCGM_FI_DEV_PCIE_TX_THROUGHPUT
- DCGM_FI_DEV_PCIE_RX_THROUGHPUT
- DCGM_FI_DEV_MEMORY_UTIL
begin_time
- Begin time for the query (defaults to UNIX epoch)
end_time
- End time for the query (defaults to current time)
agent_gpu_filter
- Dict of agent_name -> [gpu_indices] to filter specific GPUs (None = all)
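Because begin_time defaults to the UNIX epoch, an unbounded query can return a large amount of data, so it is often useful to pass an explicit window. The sketch below builds the keyword arguments for a recent time window. `recent_window_kwargs` is a hypothetical helper, and whether the client expects naive or timezone-aware datetimes is not stated here, so the caller supplies `now` in whichever form the server uses.

```python
from datetime import datetime, timedelta


def recent_window_kwargs(now, minutes, agent_gpu_filter=None):
    """Build keyword arguments covering the last `minutes` minutes.

    Hypothetical helper; the keys match the documented parameter names.
    agent_gpu_filter maps agent_name -> list of GPU indices (None = all).
    """
    return {
        "begin_time": now - timedelta(minutes=minutes),
        "end_time": now,
        "agent_gpu_filter": agent_gpu_filter,
    }
```

The result could then be unpacked into a call such as `client.get_dcgm_metrics("DCGM_FI_DEV_GPU_UTIL", **kwargs)`.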
Returns:
Dictionary
- hostname -> gpu_index -> list of {timestamp, value} dicts
Raises:
requests.RequestException
- If the request fails
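The nested return value (hostname -> gpu_index -> list of samples) can be walked with two loops. As a rough sketch, the hypothetical helper below finds the peak value per GPU. It assumes each sample dict has a numeric "value" key, as the description of the return value suggests, and it runs on any dictionary of that shape, so no server is needed to try it.

```python
def peak_per_gpu(metrics):
    """Map (hostname, gpu_index) -> largest sample value for that GPU.

    Hypothetical helper; `metrics` has the shape returned by
    get_dcgm_metrics. GPUs with no samples are skipped.
    """
    peaks = {}
    for hostname, gpus in metrics.items():
        for gpu_index, samples in gpus.items():
            if samples:
                # Assumption: every sample carries a numeric "value" key.
                peaks[(hostname, gpu_index)] = max(s["value"] for s in samples)
    return peaks
```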
get_dcgm_metrics_reduction
def get_dcgm_metrics_reduction(
    metric_name: str,
    reduction: str = "mean",
    begin_time: datetime | None = None,
    end_time: datetime | None = None,
    agent_gpu_filter: dict[str, list[int]] | None = None) -> float | None
Get a statistical reduction of a DCGM metric.
This method retrieves all samples of the metric for the selected GPUs and reduces them to a single statistical value over the specified time range.
Arguments:
metric_name
- Name of the DCGM metric to retrieve. Allowed values:
- DCGM_FI_DEV_GPU_UTIL
- DCGM_FI_DEV_MEM_COPY_UTIL
- DCGM_FI_DEV_SM_CLOCK
- DCGM_FI_DEV_MEM_CLOCK
- DCGM_FI_DEV_FB_USED
- DCGM_FI_DEV_FB_FREE
- DCGM_FI_DEV_POWER_USAGE
- DCGM_FI_DEV_TEMPERATURE_CURRENT
- DCGM_FI_DEV_SM_OCCUPANCY
- DCGM_FI_DEV_MEMORY_TEMP
- DCGM_FI_DEV_PCIE_TX_THROUGHPUT
- DCGM_FI_DEV_PCIE_RX_THROUGHPUT
- DCGM_FI_DEV_MEMORY_UTIL
reduction
- Statistical reduction to apply ("mean", "max", "min", "median")
begin_time
- Begin time for the query (defaults to UNIX epoch)
end_time
- End time for the query (defaults to current time)
agent_gpu_filter
- Dict of agent_name -> [gpu_indices] to filter specific GPUs (None = all)
Returns:
Single statistical value as float, or None if no data is available
Raises:
ValueError
- If reduction type is invalid
requests.RequestException
- If the request fails
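To make the documented contract concrete (one float, None when there is no data, ValueError for an unknown reduction), here is a local sketch that applies the same four reductions to data shaped like the get_dcgm_metrics return value. It is not the client's implementation: `reduce_metrics` is a hypothetical function, and pooling all samples across hosts and GPUs before reducing is an assumption about how the single value is formed.

```python
import statistics

# Reduction names documented for get_dcgm_metrics_reduction.
_REDUCERS = {
    "mean": statistics.fmean,
    "max": max,
    "min": min,
    "median": statistics.median,
}


def reduce_metrics(metrics, reduction="mean"):
    """Reduce hostname -> gpu_index -> samples to one float, or None.

    Hypothetical sketch; assumes samples are pooled across all GPUs.
    """
    if reduction not in _REDUCERS:
        raise ValueError(f"Unsupported reduction: {reduction!r}")
    values = [
        float(sample["value"])
        for gpus in metrics.values()
        for samples in gpus.values()
        for sample in samples
    ]
    if not values:
        # Mirrors the documented behaviour: None when no data is available.
        return None
    return float(_REDUCERS[reduction](values))
```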