Boosting Performance: A Comparative Analysis of NVIDIA Riva 2.1.0 vs Riva 1.10.0

Vladimir Nechaev

NVIDIA recently released Riva Speech Skills 2.1.0, which brings new features and enhancements (see the full release notes), including a noteworthy improvement in Conformer ASR latency and throughput. This caught our attention at Data Monsters, as we have been using the Conformer-CTC model in streaming mode for one of our ongoing projects, where performance is critical. To see how large the gains are, we ran a series of tests.

Fig. 1: Unleashing the Power - Conformer-CTC Latency vs Throughput

Fig. 1 shows our performance comparison between the Riva 2.1.0 and Riva 1.10.0-beta versions of the Conformer-CTC model in streaming mode, measured on a single Tesla V100 GPU. For consistency, we used a pretrained model with default riva-build options.

We built the models with both Riva 1.10.0-beta and Riva 2.1.0 in three configurations: 'low_latency', 'intermediate', and 'high_throughput', which correspond to chunk sizes of 160, 400, and 800, respectively. The graph shows an improvement in every configuration, and the improvement becomes more pronounced as the number of streams grows.
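To make the three configurations concrete, here is a minimal Python sketch that estimates how many audio chunks per second the server has to handle for a given stream count. It assumes the chunk sizes quoted above are milliseconds of audio per chunk, which the text does not state explicitly; the function names are ours and are not part of Riva.

```python
# Chunk sizes per configuration, as quoted in the article.
# Assumption: the values are milliseconds of audio per chunk.
CHUNK_SIZE_MS = {
    "low_latency": 160,
    "intermediate": 400,
    "high_throughput": 800,
}


def chunks_per_second(chunk_ms: float) -> float:
    """Chunks produced per second of audio by a single real-time stream."""
    return 1000.0 / chunk_ms


def total_chunks_per_second(chunk_ms: float, streams: int) -> float:
    """Chunks per second arriving at the server from `streams` real-time streams."""
    return streams * chunks_per_second(chunk_ms)


if __name__ == "__main__":
    for name, chunk_ms in CHUNK_SIZE_MS.items():
        rate = total_chunks_per_second(chunk_ms, streams=128)
        print(f"{name}: {rate:.1f} chunks/s at 128 streams")
```

Under that assumption, smaller chunks mean more frequent requests per stream, which is one way to read the trade-off between the low-latency and high-throughput configurations.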

Up to approximately 20 streams, there is little discernible difference between the two versions. As the number of streams increases, however, Riva 2.1.0 shows noticeably lower latency at the same RTFX (throughput) and stream count. Take, for instance, the high-throughput model with 128 audio streams: it responds approximately 100 milliseconds faster in Riva 2.1.0 than in Riva 1.10.0-beta at the same stream count.
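For readers who want to run a similar comparison on their own logs, the sketch below shows one way to compute the two quantities plotted in Fig. 1. It assumes RTFX means seconds of audio transcribed per second of wall-clock time, summed over all streams, which is the usual definition but is not spelled out in this article. The `RunResult` type and the sample numbers are hypothetical placeholders, not our measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunResult:
    """One benchmark run (hypothetical container, not a Riva type)."""
    streams: int                      # number of concurrent audio streams
    audio_seconds_per_stream: float   # audio duration sent on each stream
    wall_seconds: float               # wall-clock time of the whole run
    latencies_ms: List[float]         # per-response latencies in milliseconds


def rtfx(run: RunResult) -> float:
    """Throughput: total seconds of audio processed per wall-clock second."""
    return run.streams * run.audio_seconds_per_stream / run.wall_seconds


def mean_latency_ms(run: RunResult) -> float:
    """Average response latency of the run in milliseconds."""
    return sum(run.latencies_ms) / len(run.latencies_ms)


def latency_gain_ms(old: RunResult, new: RunResult) -> float:
    """Positive when `new` has lower mean latency than `old`."""
    return mean_latency_ms(old) - mean_latency_ms(new)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    old = RunResult(streams=4, audio_seconds_per_stream=30.0,
                    wall_seconds=30.0, latencies_ms=[50.0, 70.0])
    new = RunResult(streams=4, audio_seconds_per_stream=30.0,
                    wall_seconds=30.0, latencies_ms=[40.0, 60.0])
    print(f"RTFX old={rtfx(old):.1f}, new={rtfx(new):.1f}")
    print(f"Mean latency gain: {latency_gain_ms(old, new):.1f} ms")
```

Comparing latency only between runs with the same stream count and a matching RTFX, as done above, keeps the comparison like for like.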

If your workload involves 20 or more concurrent streams, upgrading to Riva 2.1.0 should bring tangible performance improvements. Before deciding, however, weigh the other changes that come with the upgrade. Rest assured, the future of accelerated speech processing is bright with NVIDIA Riva at the helm.

Remember, this is just a glimpse into the exciting world of performance enhancements offered by NVIDIA Riva 2.1.0. Explore the comprehensive release notes and dive into the multitude of possibilities that lie ahead.

Written by Anna Mosolova, Vladimir Nechaev, and Marina Molchanova - Data Scientists at Data Monsters.
