Version: v2509

Observing Performance

This guide shows how to observe system performance with AIBooster's dashboard. It covers basic operations and how to read the key metrics.

Dashboard Basic Structure

After logging in, the first screen you'll see is the Grafana home screen.

[Screenshot: Grafana Dashboard]

Main Areas

  1. Left Sidebar: Navigation menu

    • Home: Home screen
    • Dashboards: Dashboard list
    • Explore: Ad hoc data analysis with your own queries
    • Configuration: Settings
  2. Main Area: Dashboards and panels

  3. Top Toolbar: Time range selection, etc.

Opening Your First Dashboard

  1. Click "Dashboards" in the left sidebar
  2. Select "Performance Overview" or "Cost Overview"

Dashboards with "Details" in their names are meant to be opened from an Overview dashboard rather than directly. Always start from one of these two Overview dashboards:

  • For performance analysis: Performance Overview
  • For cost analysis: Cost Overview

How to Read Key Metrics

1. GPU Utilization

  • Meaning: How much of the time the GPU is being used
  • Ideal value: 80-95% during training
  • Note: If utilization stays below 50%, the bottleneck is likely outside the GPU (for example, data loading or CPU-side preprocessing)

2. GPU SM Activity

  • Meaning: How efficiently the GPU cores (streaming multiprocessors, or SMs) are actually computing
  • Ideal value: 70% or higher
  • Note: If GPU Utilization is high but SM Activity is low, the GPU is busy but its cores are not being used efficiently
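The two GPU metrics are easiest to read together. The following is a minimal sketch of that reading, not part of AIBooster: the function name `interpret_gpu_metrics` and its labels are hypothetical, the 50% and 70% cutoffs restate the rules of thumb above, and treating 80% as the line for "high" utilization is an assumption taken from the lower end of the ideal training range.

```python
def interpret_gpu_metrics(gpu_util: float, sm_activity: float) -> str:
    """Classify a pair of GPU readings, both given as percentages (0-100).

    The 50% and 70% cutoffs restate this guide's rules of thumb; using 80%
    as the line for "high" utilization is an assumption of this sketch.
    """
    for value in (gpu_util, sm_activity):
        if not 0 <= value <= 100:
            raise ValueError("metrics must be percentages between 0 and 100")
    if gpu_util < 50:
        return "low-utilization"       # look for a bottleneck outside the GPU
    if gpu_util < 80:
        return "moderate"              # below the 80-95% training target
    if sm_activity < 70:
        return "busy-but-inefficient"  # GPU is busy, but its cores are underused
    return "healthy"


print(interpret_gpu_metrics(92, 35))  # busy-but-inefficient
```

Values read off the dashboard panels can be passed in directly. The labels are only a starting point for deciding which Details dashboard to open next.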

3. CPU Utilization

  • Meaning: How busy the CPU cores are
  • Ideal value: Balanced usage across cores
  • Note: If specific cores are stuck at 100%, the CPU may be the bottleneck, even when average usage looks low
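An average across cores can hide a single saturated core, which is why the per-core view matters. The sketch below illustrates this with made-up readings. The helper names `find_saturated_cores` and `cpu_summary` and the 99% threshold are assumptions for illustration, not AIBooster settings.

```python
from statistics import mean


def find_saturated_cores(per_core_util, threshold=99.0):
    """Return the indices of cores whose utilization (percent) is at or above threshold."""
    return [i for i, util in enumerate(per_core_util) if util >= threshold]


def cpu_summary(per_core_util):
    """Summarize per-core readings: the overall mean and any saturated cores."""
    saturated = find_saturated_cores(per_core_util)
    return {
        "mean": mean(per_core_util),
        "saturated_cores": saturated,
        "possible_bottleneck": bool(saturated),
    }


# Hypothetical readings: one core is pinned while the average stays low.
print(cpu_summary([100.0, 12.0, 8.0, 15.0]))
# {'mean': 33.75, 'saturated_cores': [0], 'possible_bottleneck': True}
```

Here the mean is under 34%, yet core 0 is fully occupied, which is the pattern to look for when a single-threaded stage is holding back the rest of the pipeline.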

4. Memory, Interconnect, and Storage Bandwidth

  • Meaning: Effective bandwidth of each I/O path
  • Ideal value: No single target; bandwidth cannot be judged on its own and should be evaluated together with the other metrics

Basic Operations

Changing Time Range

You can change the observation period using the time range selector in the top right:

[Screenshot: Time Range]

  • Last 5 minutes: The most recent 5 minutes
  • Last 1 hour: The most recent hour
  • Last 24 hours: The most recent 24 hours
  • Custom range: A start and end time you specify
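The same presets can also be expressed in a dashboard link, because Grafana accepts `from` and `to` query parameters with relative times such as `now-5m`. The sketch below builds such a link. The base URL, dashboard UID, and the helper name `dashboard_url` are hypothetical placeholders, so substitute the address of your own AIBooster dashboard.

```python
from urllib.parse import urlencode

# Relative-time expressions matching the presets in the time range selector.
PRESETS = {
    "Last 5 minutes": "now-5m",
    "Last 1 hour": "now-1h",
    "Last 24 hours": "now-24h",
}


def dashboard_url(base_url: str, preset: str) -> str:
    """Append Grafana's from/to time range parameters to a dashboard URL."""
    query = urlencode({"from": PRESETS[preset], "to": "now"})
    return f"{base_url}?{query}"


# Hypothetical dashboard address, used only for illustration.
base = "http://localhost:3000/d/abc123/performance-overview"
print(dashboard_url(base, "Last 5 minutes"))
# http://localhost:3000/d/abc123/performance-overview?from=now-5m&to=now
```

A link built this way is handy for sharing a specific view with a teammate, since it opens the dashboard with the time range already applied.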

Graph Zoom and Detail Display

  1. Zoom by dragging: Drag on the graph to zoom into a specific period
  2. Maximize panel: Open the three-dot menu in the top right of the panel and click View
  3. Check data: Hover over the graph to display detailed values

💡 Frequently Asked Questions

Q: Metrics are not displayed

A: Please check the following:

  1. Is the setup completed correctly (are there any errors)?
  2. Is the time range appropriate (are you viewing a period with no data)?
  3. If both look fine, refer to Troubleshooting

Q: I don't understand what the graph means

A: Hover over the "i" icon on each panel to see an explanation.