Version: v2509

Component Details

AIBooster consists of two components: Server and Agent. This section describes the containers that make up each component, so that you can assess the impact they have on a system while they are running.

Agent

Agent containers are designed to run continuously on every node under observation. They periodically sample each node's hardware and system state and collect performance metrics for the programs running on it.

The Agent consists of the following containers, some of which must run in privileged mode (that is, be launched with administrator privileges):

  • Node Exporter: Collects CPU and I/O metrics
  • DCGM Exporter: Collects GPU metrics
  • PCM Exporter: Collects metrics specific to the Intel CPU and memory subsystems
  • eBPF Profiler: Collects the execution status of running programs

Server

Server containers are designed to run on a single Linux node connected to the same network as the compute nodes where Agents operate. They can be deployed on a dedicated management node or co-located on one of the compute nodes with Agents installed.

Server containers include:

  • ClickHouse: Data storage
  • Grafana: Visualization features
  • Nginx: Reverse proxy

Additionally, the following ports need to be open on the node where Server containers run:

Port  | Expected Access Source | Purpose
3000  | User PC                | Access to performance observation dashboard
8123  | Nodes running Agents   | Metrics collection
16697 | Nodes running Agents   | Communication with Server node
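
Before installing Agents, it can help to confirm that the ports listed above are reachable from the machine that is expected to access them. The following is a minimal sketch, not part of AIBooster: it assumes all three ports accept plain TCP connections (the source does not state the protocol), and the names `SERVER_PORTS`, `port_is_open`, and `check_server_ports` are hypothetical helpers introduced only for illustration.

```python
import socket

# Ports from the table above, grouped by the expected access source.
# Assumption: every port is reachable over TCP.
SERVER_PORTS = {
    "user_pc": [3000],             # performance observation dashboard
    "agent_node": [8123, 16697],   # metrics collection, communication with Server node
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_server_ports(host, source, timeout=2.0):
    """Map each port expected for the given access source to its reachability."""
    return {
        port: port_is_open(host, port, timeout)
        for port in SERVER_PORTS[source]
    }
```

For example, running `check_server_ports("server.example.internal", "agent_node")` on a node that will host an Agent returns a dictionary keyed by port number, where `server.example.internal` is a placeholder for the address of your Server node. A `False` value usually points to a firewall rule or to Server containers that are not running yet.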