Version: v2602

Executing model conversion

In this step, we integrate AcuiRT into the source code of the open-source DETR model and use AcuiRT's conversion, inference, and evaluation workflow to verify how accuracy and latency change.

Execute conversion workflow

Define a workflow that converts DETR, runs inference, and evaluates the result using AcuiRT's ConversionWorkflow class.

In this tutorial, ConversionWorkflow has already been integrated into the code.

Create a conversion config for AcuiRT. With this config, conversion is first attempted on the top-level module; if that fails for any reason, AcuiRT recursively attempts conversion on the child modules.
aibooster_misc/config.json
```
{
    "rt_mode": "onnx",
    "auto": true
}
```
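The recursive fallback described above can be illustrated with a toy module tree. This is only a sketch of the general idea, not AcuiRT's implementation: the `Node` class, its `convertible` flag, and `convert_recursively` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one module in the model tree."""
    name: str
    convertible: bool = True  # True if the whole subtree converts as one unit
    children: List["Node"] = field(default_factory=list)


def convert_recursively(node: Node, converted: List[str], failed: List[str]) -> None:
    """Try to convert the node as a whole; on failure, descend into its children."""
    if node.convertible:
        converted.append(node.name)  # one engine covers this entire subtree
        return
    if not node.children:
        failed.append(node.name)  # an unconvertible leaf keeps running as-is
        return
    for child in node.children:
        convert_recursively(child, converted, failed)


# Toy tree: the top-level model fails to convert, so its children are tried in turn.
model = Node("detr", convertible=False, children=[
    Node("backbone"),
    Node("transformer", convertible=False, children=[
        Node("encoder"),
        Node("decoder", convertible=False),
    ]),
])

converted: List[str] = []
failed: List[str] = []
convert_recursively(model, converted, failed)
print("converted:", converted)
print("failed:", failed)
```

In this toy tree, the modules that end up in `failed` are the ones that would keep running in the original framework, which is why a partially converted model mixes converted and unconverted parts.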

Execute main.py to perform automatic conversion, inference, and evaluation with AcuiRT.

python main.py --batch_size 1 --no_aux_loss --eval --backbone resnet101 --resume ./detr-r101-2c7b67e5.pth --coco_path /path/to/dataset --trt-engine-dir exps/baseline --trt-config aibooster_misc/config.json
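For reference, the two AcuiRT-related flags in the command above could be declared with `argparse` roughly as follows. Only the flag names come from the command; the types, defaults, help texts, and the `build_parser` helper are assumptions made for this sketch, and the tutorial's actual main.py may define them differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the two AcuiRT-related flags."""
    parser = argparse.ArgumentParser(add_help=False)
    # Flag names are taken from the command; everything else is assumed.
    parser.add_argument("--trt-engine-dir", type=str, default=None,
                        help="directory where converted engines are stored (assumed meaning)")
    parser.add_argument("--trt-config", type=str, default=None,
                        help="path to the conversion config JSON (assumed meaning)")
    return parser


# parse_known_args ignores the DETR flags that this sketch does not define.
args, _unknown = build_parser().parse_known_args([
    "--batch_size", "1", "--eval",
    "--trt-engine-dir", "exps/baseline",
    "--trt-config", "aibooster_misc/config.json",
])
print(args.trt_engine_dir, args.trt_config)
```

argparse maps the hyphens in the flag names to underscores, so the values are read as `args.trt_engine_dir` and `args.trt_config`.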

As when reproducing DETR, the run outputs a recognition accuracy log like the following.

DONE (t=0.09s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.445
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.672
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.450
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.202
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.475
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.641
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.374
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.536
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.567
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.283
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.587
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.740

AcuiRT also outputs summary information of the conversion result.

Workflow Report
Conversion Rate: 73/447
Num Modules: 63
Accuracy: 0.4450028287644359
Latency: 66.98 ms

The fields in the summary are as follows.

Conversion Rate: 73 out of 447 modules were successfully converted.
Num Modules: The number of TensorRT engines generated during conversion.
Accuracy: The inference accuracy (AP) after conversion.
Latency: The inference time in milliseconds.
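If the summary needs to be compared across runs, it can be parsed into numbers. The following is a minimal sketch that assumes the report is available as plain text in exactly the format shown above; `parse_report` and the dictionary keys are hypothetical names, not part of AcuiRT.

```python
REPORT = """\
Workflow Report
Conversion Rate: 73/447
Num Modules: 63
Accuracy: 0.4450028287644359
Latency: 66.98 ms
"""


def parse_report(text: str) -> dict:
    """Turn the 'Key: value' lines of the report into typed fields."""
    fields = dict(line.split(": ", 1) for line in text.splitlines() if ": " in line)
    converted, total = (int(part) for part in fields["Conversion Rate"].split("/"))
    return {
        "converted": converted,
        "total": total,
        "conversion_ratio": converted / total,  # fraction of modules converted
        "num_engines": int(fields["Num Modules"]),
        "accuracy": float(fields["Accuracy"]),
        "latency_ms": float(fields["Latency"].split()[0]),  # drop the "ms" unit
    }


summary = parse_report(REPORT)
print(f"{summary['converted']}/{summary['total']} modules "
      f"({summary['conversion_ratio']:.1%}) converted")
```

Expressed as a fraction, 73/447 means that only about 16% of the modules were converted in this run.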

Analyzing the Conversion Result

Comparing the summary with the PyTorch model shows that both metrics have gotten worse: accuracy has dropped and latency has increased. This is because AcuiRT converted multiple modules to TensorRT engines, but some modules failed to convert and still run in PyTorch. As a result, overhead such as data transfer between the PyTorch modules and the TensorRT engines occurs.

Model	AP (Accuracy)	Latency
PyTorch	0.5310	60.92 ms
With AcuiRT	0.4450	66.98 ms
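To quantify the gap, the relative change of each metric can be computed from the two rows of the table above. The sketch below uses only the values in the table; the variable and function names are illustrative.

```python
# Values copied from the comparison table above.
pytorch = {"ap": 0.5310, "latency_ms": 60.92}
acuirt = {"ap": 0.4450, "latency_ms": 66.98}


def relative_change(before: float, after: float) -> float:
    """Signed change of `after` relative to `before` (e.g. -0.10 means 10% lower)."""
    return (after - before) / before


ap_change = relative_change(pytorch["ap"], acuirt["ap"])
latency_change = relative_change(pytorch["latency_ms"], acuirt["latency_ms"])

print(f"AP change:      {ap_change:+.1%}")
print(f"Latency change: {latency_change:+.1%}")
```

A negative AP change and a positive latency change both indicate a regression relative to the PyTorch baseline.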

In the next step, we describe how to analyze and identify the causes of conversion failures using the report information generated by AcuiRT.
