# Server Inference

## Triton Inference Server

- Serving a Torch-TensorRT model with Triton — Torch-TensorRT v1.4.0.dev0+d0af394 documentation

## ONNX Runtime

- https://github.com/microsoft/onnxruntime-inference-examples
- https://github.com/microsoft/DeepSpeed-MII
- https://github.com/microsoft/onnx-script

## Other

- GitHub - open-mmlab/mmdeploy: OpenMMLab Model Deployment Framework
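For context on the Triton resources above: Triton loads each model from a model repository directory containing a `config.pbtxt` file that declares the backend, batching limit, and input/output tensors. A minimal sketch of such a config is below — the model name, tensor names, and dimensions are illustrative assumptions, not taken from the linked pages.

```
# config.pbtxt — illustrative sketch; names and dims are assumed, not from the linked docs
name: "resnet50"                  # must match the model's directory name in the repository
platform: "onnxruntime_onnx"      # backend; e.g. "tensorrt_plan" for a Torch-TensorRT engine
max_batch_size: 8                 # Triton adds the batch dimension implicitly
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

The file lives at `model_repository/<name>/config.pbtxt`, with the model binary under a numbered version subdirectory such as `1/`.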