Basic Usage
Model conversion by AcuiRT consists of the following three steps.
1. Create Model and Dataset
Create the model to be accelerated and its dataset. Implement the model as a PyTorch nn.Module.
2. Specify the conversion method
Specify the conversion method in the config. AcuiRT currently supports the following deep learning compilers:
- TensorRT
3. Execute the conversion
Run the Python code to perform the conversion. Below is an example of applying PTQ int8 quantization to ResNet50 and converting it to TensorRT.
import torch
from acuirt.convert.convert import convert_model
from acuirt.inference.inference import load_runtime_modules
from torchvision.models import resnet50


def main():
    resnet = resnet50()
    resnet = resnet.cuda().eval()

    # Settings for converting to TensorRT with int8 quantization (PTQ)
    config = {
        "rt_mode": "onnx",
        "auto": True,
        "int8": True,
    }

    # Please specify the path to save the converted model.
    path = "/path/to/save/model"

    # Create a dummy dataset
    # By passing an iterable dataset, calibration is performed automatically.
    data = [((torch.randn(1, 3, 224, 224), ), {}) for _ in range(10)]

    # Convert to TensorRT and execute calibration.
    # The converted model will be saved to path.
    # Also, the information of the converted model is stored in a variable named summary of type dict.
    summary = convert_model(resnet, config, path, False, data)

    # Load the inference engine for TensorRT.
    model = load_runtime_modules(resnet, summary, path)

    # Run inference.
    args, _ = data[0]
    args = [arg.cuda() for arg in args]
    model(*args)


if __name__ == "__main__":
    main()
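
The dummy dataset in the example is a list of (args, kwargs) pairs: a tuple of positional inputs plus a dict of keyword inputs. When calibrating with real data, batches from an existing loader may need to be wrapped into that same layout. The following is a minimal sketch of such a wrapper. The helper name as_calibration_samples and its limit parameter are hypothetical and not part of AcuiRT, and it assumes the model takes a single positional input.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, Tuple


# Hypothetical helper, not part of AcuiRT: wraps each batch in the
# (args, kwargs) layout used by the dummy dataset in the example above.
def as_calibration_samples(
    batches: Iterable[Any], limit: int = 10
) -> Iterator[Tuple[tuple, dict]]:
    # Stop after `limit` batches so calibration stays short.
    for batch in islice(batches, limit):
        # Some loaders yield (input, label) pairs; keep only the input.
        if isinstance(batch, (tuple, list)):
            batch = batch[0]
        # One positional argument, no keyword arguments.
        yield (batch,), {}
```

For example, data = list(as_calibration_samples(loader)) could stand in for the dummy dataset, where loader is any iterable of input batches such as a PyTorch DataLoader. Materializing the result as a list keeps indexing such as data[0] working, as in the inference step of the example.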