Module-specific conversion
- AcuiRT offers several options for converting each module.
- rt_mode specifies the conversion method for a module. The current version supports the following methods:
  - onnx: Export the module to ONNX format, then convert it to a TensorRT model
  - torch2trt: Convert directly to a TensorRT model using torch2trt
- When the auto flag is enabled and conversion of a module fails, AcuiRT automatically falls back to the modules inside it and converts the largest modules that can be converted.
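The fallback behavior can be pictured as a top-down search over the module tree. The following is only a minimal sketch of that idea, not AcuiRT's implementation: Node, convert_with_fallback, and try_convert are hypothetical names introduced for illustration, and the "conversion" is simulated with a set of module names assumed to be unconvertible.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Stand-in for a module in a model tree (hypothetical, not an AcuiRT type)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def convert_with_fallback(
    node: Node,
    try_convert: Callable[[Node], bool],
    auto: bool = False,
) -> List[str]:
    """Return the names of the modules that were converted.

    The module itself is tried first. Only if that fails and `auto` is set
    does the search descend into the children, so the modules that end up
    converted are the largest convertible ones.
    """
    if try_convert(node):
        return [node.name]
    if not auto:
        return []  # no fallback: nothing inside this module is converted
    converted: List[str] = []
    for child in node.children:
        converted.extend(convert_with_fallback(child, try_convert, auto=True))
    return converted


# Toy tree: "backbone" and "backbone.head" cannot be converted as a whole.
tree = Node("backbone", [
    Node("backbone.stem"),
    Node("backbone.head", [Node("backbone.head.fc"), Node("backbone.head.nms")]),
])
unsupported = {"backbone", "backbone.head", "backbone.head.nms"}


def can_convert(node: Node) -> bool:
    """Simulated conversion attempt (assumption: failure is known up front)."""
    return node.name not in unsupported


print(convert_with_fallback(tree, can_convert, auto=True))
# ['backbone.stem', 'backbone.head.fc']
print(convert_with_fallback(tree, can_convert, auto=False))
# []
```

With auto enabled, the search stops descending as soon as a module converts, which is why backbone.stem is converted whole while backbone.head is split into its children.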
Example Configuration
- The following example assumes a model with two modules, module_1 and module_2.
- module_1 is exported to ONNX format and then converted to a TensorRT model with int8 and fp16 quantization applied. If the conversion of module_1 fails, AcuiRT automatically attempts to convert module_1's child modules instead (fallback).
- module_2 is converted via torch2trt with fp16 quantization only. Because auto is not set, no conversion is attempted on its child modules if the conversion of module_2 fails.
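model = dict(
    module_1 = dict(
        rt_mode="onnx",       # export to ONNX, then convert to TensorRT
        int8=True,
        fp16=True,
        auto=True,            # on failure, fall back to child modules
    ),
    module_2 = dict(
        rt_mode="torch2trt",  # convert directly with torch2trt
        int8=False,
        fp16=True,
    ),
)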
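Before handing such a configuration to the converter, it can help to check it for typos. The sketch below is an illustration only and is not part of AcuiRT: normalize is a hypothetical helper, the set of accepted keys is taken from the example above, and treating a missing int8, fp16, or auto flag as False is an assumption.

```python
from typing import Any, Dict

SUPPORTED_RT_MODES = {"onnx", "torch2trt"}  # the two methods listed in this section
FLAG_KEYS = ("int8", "fp16", "auto")        # flags used in the example configuration


def normalize(model_cfg: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Validate each module entry and fill missing flags with False (assumed default)."""
    result: Dict[str, Dict[str, Any]] = {}
    for name, cfg in model_cfg.items():
        mode = cfg.get("rt_mode")
        if mode not in SUPPORTED_RT_MODES:
            raise ValueError(f"{name}: unsupported rt_mode {mode!r}")
        unknown = set(cfg) - {"rt_mode", *FLAG_KEYS}
        if unknown:
            raise ValueError(f"{name}: unknown keys {sorted(unknown)}")
        flags = {key: bool(cfg.get(key, False)) for key in FLAG_KEYS}
        result[name] = {"rt_mode": mode, **flags}
    return result


model = dict(
    module_1=dict(rt_mode="onnx", int8=True, fp16=True, auto=True),
    module_2=dict(rt_mode="torch2trt", int8=False, fp16=True),
)

for name, cfg in normalize(model).items():
    print(name, cfg)
```

A misspelled method such as rt_mode="onxx" raises a ValueError here instead of surfacing later during conversion.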