Detailed Configuration Guide

This guide explains the component details, system requirements, and detailed setup procedures required when installing AIBooster in various environments.

Component Details

Agent

Agent containers are expected to run continuously on every node being observed. They continuously monitor each node's hardware and system status and collect performance metrics for the programs running on that node.

They provide the following features. Some of these containers must run in privileged mode (that is, be started with administrator privileges):

  • Node Exporter: Collects CPU and I/O related metrics
  • DCGM Exporter: Collects GPU metrics
  • PCM Exporter: Collects Intel CPU/Memory Subsystem specific metrics
  • eBPF Profiler: Collects program execution status

Server

Server containers are assumed to run on a single Linux node connected to the same network as the compute nodes where Agents are running. They can be placed on a dedicated management node or co-located with one of the compute nodes with Agent installed.

The containers included in Server are:

  • ClickHouse: Stores data
  • Grafana: Visualizes data
  • Nginx: Reverse proxy

Additionally, the following ports must be open on the node where Server containers run:

Port  Expected Access Source  Purpose
3000  User's PC               Access to performance observation dashboard
9000  Nodes running Agent     Metric collection
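
The port requirements above can be verified before installation with a small reachability check. The following is a minimal sketch, not part of AIBooster: the function names `check_port` and `report_ports` are hypothetical, and it assumes a netcat (`nc`) that supports the `-z` and `-w` options is installed on the machine running the check.

```shell
# check_port HOST PORT: succeeds if a TCP connection to HOST:PORT can be opened.
# Assumes a netcat that supports -z (scan only, send no data) and -w (timeout).
check_port() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# report_ports HOST: prints one line for each port the Server node must expose
# (3000 for the dashboard, 9000 for metric collection).
report_ports() {
    for port in 3000 9000; do
        if check_port "$1" "$port"; then
            echo "$1:$port open"
        else
            echo "$1:$port unreachable"
        fi
    done
}
```

For a meaningful result, run `report_ports <server-address>` from the expected access source of each port: a user's PC for port 3000, and a node running Agent for port 9000. Before the Server containers are started, nothing listens on these ports, so both will be reported as unreachable even when no firewall is in the way.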

Configuration Pattern Selection

Configuration  Features                              Recommended Use
Single Node    Complete on 1 machine                 Verification, learning, small-scale PoC
Multi Node     Distributed across multiple machines  Production use, large-scale clusters

Single Node Configuration

A configuration where both AIBooster Server and Agent run on a single machine.

Pattern 1: Minimal Configuration for Verification

[Figure: single-node-1]

Install both AIBooster Server and AIBooster Agent on a single GPU-equipped workstation/server. Connect a monitor and open the dashboard directly on that machine to view performance information. This is the quickest way to try AIBooster on an offline verification or benchmark machine. No network configuration is required.

Pattern 2: Multi-User Configuration

[Figure: single-node-2]

Install both AIBooster Server and AIBooster Agent on a single GPU-equipped workstation/server. Users view the dashboard provided by the server through a browser from their personal PCs via TCP port 3000. Ideal for small-scale PoCs where multiple people want to view the dashboard.

Multi-Node Configuration

A production-oriented configuration where AIBooster Server and Agents run distributed across multiple machines.

[Figure: multi-node-1]

Install AIBooster Server on the management node and AIBooster Agent on each GPU compute node. Users view the dashboard provided by the management node through a browser from their personal PCs via TCP port 3000. This is the recommended configuration for most GPU cluster server systems.

Pattern 2: Compute Node Co-location

[Figure: multi-node-2]

When no specific management node exists, select one GPU-equipped node and install both AIBooster Server and its dedicated AIBooster Agent on it. Install only Agent on the remaining GPU-equipped nodes. Users view the dashboard provided by the GPU-equipped node with AIBooster Server installed through a browser from their personal PCs via TCP port 3000.

System Requirements

Ensure all nodes to be set up meet the following requirements:

OS/Software Requirements

  • Ubuntu (>=22.04)
  • Linux Kernel (>=5.15)
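
These two minimums can be checked on each node before running the installer. The following is a minimal sketch under stated assumptions: the function names `version_ge` and `check_requirements` are hypothetical and not part of AIBooster, and the comparison relies on `sort -V`, a GNU coreutils extension that is available on Ubuntu.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Uses "sort -V" (version sort): B must sort first, or the two must be equal.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_requirements: compares this node against the minimums listed above.
check_requirements() {
    # /etc/os-release defines ID and VERSION_ID; read them in a subshell.
    os_id=$(. /etc/os-release && printf '%s' "$ID")
    os_version=$(. /etc/os-release && printf '%s' "$VERSION_ID")
    # Drop the distribution suffix, e.g. "5.15.0-91-generic" becomes "5.15.0".
    kernel=$(uname -r | cut -d- -f1)

    status=0
    if [ "$os_id" = "ubuntu" ] && version_ge "$os_version" 22.04; then
        echo "OS ok: $os_id $os_version"
    else
        echo "OS not supported: $os_id $os_version (need Ubuntu >= 22.04)"
        status=1
    fi
    if version_ge "$kernel" 5.15; then
        echo "Kernel ok: $kernel"
    else
        echo "Kernel too old: $kernel (need >= 5.15)"
        status=1
    fi
    return "$status"
}
```

Calling `check_requirements` on a node prints one line per requirement and returns a non-zero status if either check fails.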

Additionally, if the following software is not installed, setup will be performed automatically:

Network/SSH/Permission Requirements

SSH Connection Requirements

  • Network access via SSH must be available
  • SSH port on each node must be open

User/Permission Requirements

  • Must be able to log in with the same username on all nodes
  • Login user must have privileges to escalate to administrator (sudo privileges)
  • sudo password must be set to the same value on all nodes

The installer connects to every node over SSH as the current user, which is why the same username and sudo password are required on all nodes.
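
The SSH and username requirements above can be checked ahead of time. The following is a minimal sketch, not the installer's own logic: the helper names (`node_host`, `node_port`, `ssh_whoami`, `preflight`) and the `host:port` notation are assumptions made for this example. It does not verify the sudo password, and the simple `host:port` split does not handle IPv6 literals.

```shell
# node_host "host:port" prints the host part.
node_host() { printf '%s\n' "${1%%:*}"; }

# node_port "host:port" prints the port part, or 22 (the standard SSH port)
# when no port is given.
node_port() {
    case "$1" in
        *:*) printf '%s\n' "${1##*:}" ;;
        *)   printf '22\n' ;;
    esac
}

# ssh_whoami NODE: logs in to NODE and prints the remote username.
ssh_whoami() {
    ssh -o ConnectTimeout=5 -p "$(node_port "$1")" "$(node_host "$1")" whoami
}

# preflight USER NODE...: checks that each node accepts an SSH login and that
# the remote username equals USER. Returns non-zero if any node fails.
preflight() {
    expected_user=$1
    shift
    status=0
    for node in "$@"; do
        remote_user=$(ssh_whoami "$node") || remote_user=""
        if [ -n "$remote_user" ] && [ "$remote_user" = "$expected_user" ]; then
            echo "$node ok"
        else
            echo "$node failed"
            status=1
        fi
    done
    return "$status"
}
```

A call such as `preflight "$(id -un)" gpu01 gpu02:2222` (with hypothetical node names) prints one line per node. If password authentication is in use, `ssh` prompts for the password on the terminal for each node.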

Setup Procedures

1. Running the Installer

Copy and execute the following command in your terminal:

curl -sLO assets.aibooster.fixstars.com/faibup.sh && sh faibup.sh

Setup is completed by answering configuration questions in the terminal.

2-a. Single Node Setup

The installer will ask the following questions:

[Figure: faibup-single]

  1. Target node address: Enter the IP address or resolvable hostname of the target node
  2. Target node SSH port: Enter the SSH port of the target node

2-b. Multi-Node Setup

The installer will ask the following questions in order:

[Figure: faibup-multi]

Note that steps 3 and 4 are repeated for each Agent node.

  1. Server node address: Enter the IP address or resolvable hostname of the node where AIBooster Server will run

  2. Server node SSH port: Enter the SSH port of the node where AIBooster Server will run

  3. Agent node address: Enter the address of the compute node where you want to install AIBooster Agent

  4. Agent node SSH port: Enter the SSH port of the Agent node

  5. After entering all nodes, press Enter on an empty line to confirm

3. Entering Authentication Credentials

During installation, you will need to enter the following information for SSH connection and sudo privilege acquisition:

  • SSH password (required for sudo privilege escalation even if public key authentication is configured)
  • sudo password (must be set to the same value on all nodes)

4. Installation Complete

When "AIBooster setup completed successfully!" is displayed, the setup has completed normally.

Open the URL shown at the end in your browser and confirm that the dashboard is displayed. A URL for accessing AIBooster documentation is also displayed.
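
Dashboard reachability can also be confirmed from a terminal. The following is a minimal sketch, assuming `curl` is installed on the machine performing the check; the function names `is_ok_status`, `http_status`, and `check_dashboard` are hypothetical. Treating any 2xx or 3xx response as success is an assumption of this example, made because a login page is often served through a redirect.

```shell
# is_ok_status CODE: succeeds for any 2xx or 3xx HTTP status code.
is_ok_status() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# http_status URL: prints the HTTP status code returned for URL.
# curl prints 000 when no HTTP response was received.
http_status() {
    curl -s -o /dev/null -m 5 -w '%{http_code}' "$1"
}

# check_dashboard URL: reports whether the dashboard URL answers over HTTP.
check_dashboard() {
    code=$(http_status "$1") || true
    if is_ok_status "$code"; then
        echo "dashboard reachable (HTTP $code)"
    else
        echo "dashboard not reachable (HTTP ${code:-none})"
        return 1
    fi
}
```

Pass the URL printed at the end of the installation as the argument to `check_dashboard`.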

Dashboard Initial Setup

Grafana First Login

AIBooster uses Grafana for performance data visualization. The first time you access the dashboard from a browser, the following screen is displayed:

[Figure: Grafana-login]

Password Setup

  1. Enter admin as both the username and the initial password
  2. When prompted to change the administrator password, enter a new password of your choice

Setup Completion Confirmation

When the following screen is displayed, the dashboard initial setup is complete:

[Figure: Grafana-top]