What is AIBooster?
AIBooster is a performance engineering platform for continuously observing and improving the performance of AI workloads.
- PO: Performance Observability
  - 🔍 Visualization: View usage rates and efficiency of various hardware at a glance
  - 📊 Analysis: Identify software bottlenecks and discover improvement opportunities
- PI: Performance Intelligence
  - ⚡ Performance Improvement: Continuously improve performance with automatic tuning
  - 💰 Cost Reduction: Reduce inefficient resource usage and improve ROI
The visualization dashboards let users observe the utilization efficiency of hardware resources such as CPU, GPU, interconnect, and storage, as well as software bottlenecks, and thereby analyze the performance characteristics of AI workloads. Applying the optimization frameworks designed for AI workloads then enables efficient performance improvements.
Start fast and cost-effective AI training and inference with AIBooster!
Feature Highlights
Autonomous Tuning
ZenithTune's autonomous tuning feature automatically discovers jobs that meet specified conditions in a Kubernetes environment and autonomously optimizes those jobs' hyperparameters. With this feature enabled, you can continuously search for performance-optimal parameters without manually launching tuning jobs.
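The overall flow can be sketched as follows. This is an illustrative outline only, not the ZenithTune API: `matches_condition`, `run_job`, and the label scheme are hypothetical, and a simple grid search stands in for the actual optimization strategy.

```python
# Hypothetical sketch of an autonomous tuning loop: discover jobs that match
# a condition, then search their hyperparameters for the best performance.

def matches_condition(job):
    # Illustrative condition: only tune jobs explicitly labeled for tuning.
    return job.get("labels", {}).get("tuning") == "enabled"

def run_job(job, params):
    # Stand-in for launching the job and measuring performance; here a
    # synthetic score that peaks at batch_size=64.
    return -abs(params["batch_size"] - 64)

def autotune(jobs, search_space=(16, 32, 64, 128)):
    results = {}
    for job in filter(matches_condition, jobs):
        best_params, best_score = None, float("-inf")
        for batch_size in search_space:
            params = {"batch_size": batch_size}
            score = run_job(job, params)
            if score > best_score:
                best_params, best_score = params, score
        results[job["name"]] = best_params
    return results

jobs = [
    {"name": "train-a", "labels": {"tuning": "enabled"}},
    {"name": "train-b", "labels": {}},
]
print(autotune(jobs))  # {'train-a': {'batch_size': 64}}
```

Only `train-a` is tuned, since `train-b` does not match the discovery condition.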
For instructions on how to use this feature, please refer to this page.
Automatic Pruning
ZenithTune's automatic pruner monitors specified conditions during optimization and automatically terminates trials that exceed (or fall below) configured thresholds. This avoids out-of-memory errors and excessively long executions, making resource usage more efficient, while early termination of unpromising trials also shortens overall tuning time.
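The idea can be sketched in a few lines. This is an illustrative example, not the ZenithTune implementation: `ThresholdPruner`, its parameters, and the per-step measurements are all hypothetical.

```python
# Hypothetical sketch of threshold-based pruning: stop a trial early when a
# monitored value crosses a configured threshold.

class ThresholdPruner:
    def __init__(self, max_memory_gb=None, max_step_seconds=None):
        self.max_memory_gb = max_memory_gb
        self.max_step_seconds = max_step_seconds

    def should_prune(self, memory_gb, step_seconds):
        # Prune if memory use or step time exceeds its configured limit.
        if self.max_memory_gb is not None and memory_gb > self.max_memory_gb:
            return True
        if self.max_step_seconds is not None and step_seconds > self.max_step_seconds:
            return True
        return False

def run_trial(pruner, measurements):
    # measurements: per-step (memory_gb, step_seconds) readings
    for step, (mem, sec) in enumerate(measurements):
        if pruner.should_prune(mem, sec):
            return f"pruned at step {step}"
    return "completed"

pruner = ThresholdPruner(max_memory_gb=32, max_step_seconds=5.0)
print(run_trial(pruner, [(10, 1.0), (20, 1.2), (40, 1.3)]))
# prints "pruned at step 2" (40 GB exceeds the 32 GB limit)
```

Pruning on a memory threshold like this stops the trial before it would hit an out-of-memory error, rather than letting it fail.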
For instructions on how to use this feature, please refer to this page.
Automatic Model Conversion for Edge Inference
AcuiRT is an automatic model conversion framework that uses hardware-specific deep learning compilers. It applies various model-level optimizations during conversion, enabling high-speed inference on target devices.

When using a deep learning compiler such as TensorRT directly, you must convert the entire model at once. In practice, however, models of any realistic scale rarely convert successfully in a single attempt: hardware-dependent compilers support only certain operators and quantization methods, so manual intervention such as configuration changes and model modifications is usually required.
AcuiRT implements a flexible conversion strategy: when some modules cannot be converted, it identifies the cause, converts only the convertible parts, and executes the remaining modules in PyTorch. This not only improves performance through conversion but also reduces the effort of the model deployment process itself, accelerating development.
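The strategy can be illustrated with a minimal sketch. Everything here is hypothetical (operator names, the supported-op set, and the helper functions); it only mirrors the convert-what-you-can, fall-back-for-the-rest idea, not the AcuiRT implementation.

```python
# Illustrative sketch of partial conversion with PyTorch fallback.

SUPPORTED_OPS = {"Conv2d", "Linear", "ReLU"}  # assumed compiler backend support

def try_convert(op_name):
    # Stand-in for invoking the hardware compiler on one submodule.
    if op_name not in SUPPORTED_OPS:
        raise RuntimeError(f"unsupported operator: {op_name}")
    return f"compiled::{op_name}"

def convert_partially(modules):
    plan = []
    for op in modules:
        try:
            plan.append(try_convert(op))       # runs on the target device
        except RuntimeError:
            plan.append(f"pytorch::{op}")      # falls back to PyTorch execution
    return plan

print(convert_partially(["Conv2d", "GELU", "Linear"]))
# ['compiled::Conv2d', 'pytorch::GELU', 'compiled::Linear']
```

The unsupported `GELU` stays on PyTorch while the rest runs compiled, so a single unsupported operator no longer blocks deployment of the whole model.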
For instructions on how to use this feature, please refer to this page.
✨ Guides
Quick Start Guide
Learn about AIBooster overview, setup methods, and basic usage.
Performance Observation Guide
Learn how to use visualization dashboards to observe AI workload performance.
Performance Improvement Guide
Learn how to use frameworks to improve AI workload performance.