The News: NVIDIA today posted the fastest results on new benchmarks measuring the performance of AI inference workloads in data centers and at the edge — building on the company’s equally strong position in recent benchmarks measuring AI training.
The results of the industry’s first independent suite of AI benchmarks for inference, called MLPerf Inference 0.5, demonstrate the performance of NVIDIA Turing™ GPUs for data centers and NVIDIA Xavier™ system-on-a-chip for edge computing.
MLPerf’s five inference benchmarks — applied across a range of form factors and four inferencing scenarios — cover such established AI applications as image classification, object detection and translation. Read the full press release from NVIDIA.
Analyst Take: NVIDIA has long been regarded as the leader in AI training with its GPU technology, but over the past few years there have been question marks around the best technology for inference workloads (a category deemed to be more than 10x the size of training); specifically, the idea that CPU technology would be the preferred choice for inference.
With these announcements of record-breaking performance, NVIDIA continues to show that GPU technology, coupled with the company’s deep domain expertise in software and in developing high-performance models, has the potential to emerge as the leading hardware architecture for inference. The outlook is perhaps starting to look less promising for CPUs and more so for GPUs, much to NVIDIA’s liking…
Other GTC Notes – USPS and Jetson Xavier NX
NVIDIA also made a few other announcements at GTC that particularly caught my interest. The first was a partnership formed with USPS. What I liked most about this announcement was that it moved beyond the typical speeds and feeds of the benchmarks mentioned above and into real-world applications where AI technology can change lives.
In commentary provided by NVIDIA’s Ian Buck, he said, “The USPS will roll out a deep learning solution based on NVIDIA EGX to 200 processing facilities that should be operational in 2020.”
Buck also implied the outcome will be the ability to process packages and mail 10x faster than today, with greater accuracy. This is exactly the type of application where training and edge inference can deliver business outcomes.
The other announcement that caught my attention was Jetson Xavier NX, a new compact supercomputer for handling AI workloads at the edge.
From a specification perspective, the company shared that Jetson Xavier NX delivers up to 14 TOPS (at 10W) or 21 TOPS (at 15W), running multiple neural networks in parallel and processing data from multiple high-resolution sensors simultaneously in a Nano form factor (70 x 45 mm). For companies already building embedded machines, Jetson Xavier NX runs on the same CUDA-X AI™ software architecture as all Jetson offerings, ensuring rapid time to market and low development costs.
I am a firm believer that inference at the edge is going to be a substantial growth engine for AI as the ability to stream and process workloads at the edge will be critical to supporting modern architecture and managing the data deluge effectively.
Traditional compute hardware won’t work well given the ruggedization and/or size constraints of many edge applications, and Jetson Xavier NX is designed to meet the demanding compute requirements of those environments. This announcement is an important one for NVIDIA and, as I mentioned above, it further exemplifies how the company is applying its technology to handle the growth of AI inference.
Overall, the week was packed with important announcements from NVIDIA. The company continues to leverage its expertise in AI across its hardware and software stack to solve the inference challenge at scale. It will be important and exciting to watch these developments continue and to see more real-world examples shared in the coming months.
Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.
Image: NVIDIA
The original version of this article was first published on Futurum Research.
Daniel Newman is the Principal Analyst of Futurum Research and the CEO of Broadsuite Media Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring digital transformation and how it is influencing the enterprise. From big data to IoT to cloud computing, Newman makes the connections between business, people and tech that are required for companies to benefit most from their technology projects, which leads to his ideas regularly being cited in CIO.com, CIO Review and hundreds of other sites across the world. A 5x best-selling author, including his most recent book, “Building Dragons: Digital Transformation in the Experience Economy,” Daniel is also a Forbes, Entrepreneur and Huffington Post contributor. An MBA and graduate adjunct professor, Daniel Newman is a Chicago native, and his speaking takes him around the world each year as he shares his vision of the role technology will play in our future.