What is AIBooster?
AIBooster is a performance engineering platform for continuously observing and improving the performance of AI workloads.
- PO: Performance Observability
  - 🔍 Visualization: View usage rates and efficiency of various hardware at a glance
  - 📊 Analysis: Identify software bottlenecks and discover improvement opportunities
- PI: Performance Intelligence
  - ⚡ Performance Improvement: Continuously improve performance with automatic tuning
  - 💰 Cost Reduction: Reduce inefficient resource usage and improve ROI
Visualization dashboards surface the utilization efficiency of hardware resources such as CPU, GPU, interconnect, and storage, along with software bottlenecks, so users can analyze the performance characteristics of AI workloads. Applying optimization frameworks designed for AI workloads then makes efficient performance improvements possible.
Start fast and cost-effective AI training and inference with AIBooster!
Feature Highlights
Customizing Performance Observability to Fit User Environments
AIBooster's performance observability feature monitors AI workloads running on clusters to provide users with a range of insights, including:
- Understanding macro performance trends across the entire cluster and over medium- to long-term periods
- Monitoring hardware utilization efficiency by node or device
- Analyzing differences in performance characteristics between workloads
In this release, to provide a more flexible observability experience tailored to user requirements, the following parameters can now be modified easily:
- Metric collection intervals
- User-defined tags attached to traces
These customization features offer the following benefits:
- Minimizing monitoring overhead by adjusting granularity

  For example, if you want to focus on medium- to long-term trends, millisecond-level metric collection is excessive. By flexibly changing the metric collection interval, you can reduce resource consumption. This enables deployment in diverse environments while keeping monitoring overhead in check.

  For details, please refer to Changing Metric Collection Intervals.
- Enabling workload performance analysis from unique user perspectives

  By utilizing user-defined tags, you can categorize collected workloads according to your own specific criteria. Aggregating data based on these classifications allows for deeper performance analysis, such as by model type, runtime configuration, or target dataset.

  User-defined tags can be modified on the Performance Details / Job screen.
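To illustrate the kind of tag-based aggregation described above, here is a minimal sketch in Python. The record schema, tag names, and metric field are assumptions for illustration only, not AIBooster's actual data model or API:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace records carrying user-defined tags
# (illustrative schema, not AIBooster's real trace format).
traces = [
    {"tags": {"model": "resnet50", "dataset": "imagenet"}, "gpu_util": 0.82},
    {"tags": {"model": "resnet50", "dataset": "coco"}, "gpu_util": 0.74},
    {"tags": {"model": "bert-base", "dataset": "squad"}, "gpu_util": 0.61},
]

def aggregate_by_tag(traces, tag_key, metric):
    """Group traces by one tag key and average a metric per group."""
    groups = defaultdict(list)
    for t in traces:
        groups[t["tags"].get(tag_key, "untagged")].append(t[metric])
    return {tag: mean(values) for tag, values in groups.items()}

# Average GPU utilization per model type.
print(aggregate_by_tag(traces, "model", "gpu_util"))
```

Grouping by `"dataset"` instead of `"model"` would slice the same data along a different user-defined axis, which is the point of attaching custom tags.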
AcuiRT Model Conversion Diagnostic Report
AcuiRT is a framework that assists in converting AI models for inference. It accelerates inference processing on target devices by applying model conversions through deep learning compilers.
However, for practical, large-scale models, it is often difficult to convert a model successfully in a single attempt, for reasons such as:
- The input model cannot be converted to a static graph
- The deep learning compiler or target hardware does not support some operators
Avoiding these problems has traditionally required manual intervention, such as adjusting compilation settings or modifying the AI model itself.
In this release, we have significantly enhanced the ability to diagnose model conversion results in detail. By checking conversion results in the CLI output or report files, engineers can identify problems that occur when deploying to target devices and quickly iterate through the refactoring cycle to resolve them.

The details of features included in this release are as follows:
Visualization of Conversion Results
You can understand the success/failure status of model conversion in detail.
- Layer-level Conversion Success Rate: You can check what percentage of layers in the entire model were successfully converted.
- Conversion Results per Layer: You can review the conversion success or failure status for all layers in a list. For failed layers, error messages are also output.
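As a rough sketch of how such per-layer results roll up into a success rate, consider the following. The layer names, record fields, and error message are invented for illustration and do not reflect AcuiRT's actual report format:

```python
# Hypothetical per-layer conversion results, as a diagnostic report might
# list them (illustrative fields, not AcuiRT's real output schema).
layers = [
    {"name": "conv1", "converted": True,  "error": None},
    {"name": "nms",   "converted": False, "error": "operator not supported on target"},
    {"name": "fc",    "converted": True,  "error": None},
]

def conversion_summary(layers):
    """Return (success rate in percent, list of (layer, error) for failures)."""
    ok = sum(1 for layer in layers if layer["converted"])
    rate = 100.0 * ok / len(layers)
    failures = [(layer["name"], layer["error"])
                for layer in layers if not layer["converted"]]
    return rate, failures

rate, failures = conversion_summary(layers)
```

Here 2 of 3 layers convert, so the rate is about 66.7%, and `failures` pinpoints the unsupported operator to investigate first.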
Performance Profiling
You can analyze the performance characteristics of converted models in detail.
- Inference Speed (Latency): You can measure the inference processing time of the converted model.
- Inference Time Breakdown per Layer: You can individually check the time taken for processing each layer, which is helpful for identifying bottlenecks.
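The idea behind a per-layer time breakdown can be sketched as follows. The layer functions and names are invented stand-ins; AcuiRT's real profiler operates on actual model layers and its output format will differ:

```python
import time

def profile_layers(layers, x):
    """Run (name, fn) stages in order, timing each to build a breakdown."""
    breakdown = {}
    for name, fn in layers:
        start = time.perf_counter()
        x = fn(x)
        breakdown[name] = time.perf_counter() - start
    return x, breakdown

# Toy pipeline standing in for model layers (illustrative only).
layers = [
    ("preprocess", lambda x: [v / 255.0 for v in x]),
    ("backbone",   lambda x: [v * 2 for v in x]),
    ("head",       lambda x: sum(x)),
]
out, breakdown = profile_layers(layers, [1, 2, 3])

# The layer with the largest share of total time is a bottleneck candidate.
slowest = max(breakdown, key=breakdown.get)
```

Summing `breakdown.values()` also gives an end-to-end latency figure consistent with the per-layer numbers.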
Recognition Accuracy
You can obtain information to investigate accuracy degradation issues during model conversion in detail.
- Recognition Accuracy: You can measure the inference accuracy of the converted model and check accuracy degradation from the pre-conversion model.
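A minimal sketch of quantifying accuracy degradation, assuming hypothetical prediction lists rather than AcuiRT's actual evaluation output:

```python
def top1_accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

# Illustrative labels and predictions (not real evaluation data).
labels = ["cat", "dog", "car", "bus"]
pre  = ["cat", "dog", "car", "bus"]  # original model predictions
post = ["cat", "dog", "car", "car"]  # converted model predictions

# Positive degradation means the converted model lost accuracy.
degradation = top1_accuracy(pre, labels) - top1_accuracy(post, labels)
```

Running both models on the same evaluation set and differencing the scores, as above, isolates the accuracy impact of the conversion itself.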
Application Example
As a case study, we converted a 2D object detection model for NVIDIA GPUs.
When converting this model as-is, only 16% of all layers were converted successfully. Moreover, because the model was only partially converted, inference speed actually decreased compared to the original. After checking the failed layers and error details in the diagnostic report and refactoring the model accordingly, we reached a 100% conversion success rate with just 4 hours of refactoring. The converted model also showed approximately 1.25x faster inference.
For detailed usage instructions, refer to Analysis and Refactoring of Conversion Results in Complex Models.
✨ Guides
Quick Start Guide
Get an overview of AIBooster and learn setup methods and basic usage.
Performance Observation Guide
Learn how to use visualization dashboards to observe AI workload performance.
Performance Improvement Guide
Learn how to use frameworks to improve AI workload performance.