NVIDIA introduced TensorRT for RTX, a new AI inference library optimized for Windows 11, at the Microsoft Build 2025 event. The library is designed to deliver high performance for local AI applications running on NVIDIA RTX graphics cards.
NVIDIA to provide performance gains with TensorRT for RTX
TensorRT for RTX is a desktop version of NVIDIA TensorRT, which was previously used only in data center deployments. According to NVIDIA, the new version delivers performance gains of over 50% on Windows-based PCs.

The new AI library is designed to run next-generation generative AI models, such as FLUX-1.dev, on consumer-grade GPUs. It also supports low-precision numerical formats such as FP8 and FP4, which use memory more efficiently and enable higher throughput.
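To illustrate why low-precision formats matter for local inference, the back-of-the-envelope arithmetic below compares the weight storage needed at FP16, FP8, and FP4. The parameter count is a hypothetical round number chosen for illustration, not a published figure for any specific model.

```python
# Illustrative only: storage footprint of a large model's weights at
# different numeric precisions. Halving the bits per weight halves the
# memory needed, which is the efficiency gain FP8/FP4 formats target.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP4": 4}

def weights_gib(n_params: int, bits: int) -> float:
    """Return the storage needed for n_params weights, in GiB."""
    return n_params * bits / 8 / 2**30

N_PARAMS = 12_000_000_000  # hypothetical parameter count

for fmt, bits in BITS_PER_WEIGHT.items():
    print(f"{fmt}: {weights_gib(N_PARAMS, bits):.1f} GiB")
```

At these sizes, dropping from FP16 to FP4 shrinks the weights by a factor of four, which is the difference between a model fitting in a consumer GPU's VRAM or not.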
A notable feature of TensorRT for RTX is that inference engines do not need to be pre-compiled: when a model is first launched, a model-specific engine can be compiled on the GPU in a few seconds. Combined with additional optimizations, this yields a further 20% performance increase. The runtime is also only 200 MB in size and can be downloaded automatically in the background via Windows ML. In tests on an NVIDIA RTX 5090, TensorRT for RTX delivered a significant speedup over DirectML.
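The compile-on-first-launch behavior described above follows a familiar pattern: build a device-specific engine lazily, then cache and reuse it. The sketch below shows that generic pattern only; every name in it is hypothetical and none of it reflects the actual TensorRT for RTX API.

```python
from typing import Callable, Dict

# Hypothetical sketch of compile-on-first-use engine caching. In the real
# library, the compile step would specialize kernels for the local GPU;
# here it is a stand-in function so the control flow is runnable.
_engine_cache: Dict[str, Callable[[str], str]] = {}

def compile_engine(model_name: str) -> Callable[[str], str]:
    """Stand-in for the fast on-device JIT compile step."""
    return lambda prompt: f"{model_name} output for {prompt!r}"

def get_engine(model_name: str) -> Callable[[str], str]:
    """Return a cached engine, compiling one on first use."""
    if model_name not in _engine_cache:
        _engine_cache[model_name] = compile_engine(model_name)
    return _engine_cache[model_name]
```

The first call to `get_engine` pays the (seconds-long) compile cost once; every later call for the same model reuses the cached engine, which is why the startup hit the article mentions is a one-time cost per model.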
So what do you think about this new technology? Share your thoughts with us in the comments section below.