Nvidia H100 Benchmarks – Deploy Faster AI Apps – 175B Parameters in 11 Minutes!!

 Nvidia reigns supreme – Crushes MLPerf 3.0 Benchmarks – AMD Who🤔

What is MLPerf?

MLPerf is a suite of benchmarks maintained by MLCommons. Its tests draw on cutting-edge AI research papers, practical implementations of those AI models, and the industries that actually deploy them.

MLPerf keeps launching new tests as Machine Learning / AI hardware, software & services keep evolving.

These tests are comparable to gaming graphics card benchmarking software, and they share the same goal: push the hardware to its fullest capabilities so that end users and companies know they are getting the best product money can buy.

Diagram showing the variety of tests included in the MLPerf v3.0 benchmark suite.

Nvidia H100 Specs

Product Specifications

| Specification | H100 SXM | H100 PCIe | H100 NVL¹ |
|---|---|---|---|
| FP64 | 34 teraFLOPS | 26 teraFLOPS | 68 teraFLOPS |
| FP64 Tensor Core | 67 teraFLOPS | 51 teraFLOPS | 134 teraFLOPS |
| FP32 | 67 teraFLOPS | 51 teraFLOPS | 134 teraFLOPS |
| TF32 Tensor Core | 989 teraFLOPS² | 756 teraFLOPS² | 1,979 teraFLOPS² |
| BFLOAT16 Tensor Core | 1,979 teraFLOPS² | 1,513 teraFLOPS² | 3,958 teraFLOPS² |
| FP16 Tensor Core | 1,979 teraFLOPS² | 1,513 teraFLOPS² | 3,958 teraFLOPS² |
| FP8 Tensor Core | 3,958 teraFLOPS² | 3,026 teraFLOPS² | 7,916 teraFLOPS² |
| INT8 Tensor Core | 3,958 TOPS² | 3,026 TOPS² | 7,916 TOPS² |
| GPU memory | 80GB | 80GB | 188GB |
| GPU memory bandwidth | 3.35TB/s | 2TB/s | 7.8TB/s³ |
| Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG | 14 NVDEC, 14 JPEG |
| Max thermal design power (TDP) | Up to 700W (configurable) | 300–350W (configurable) | 2x 350–400W (configurable) |
| Multi-Instance GPUs | Up to 7 MIGs @ 10GB each | Up to 7 MIGs @ 10GB each | Up to 14 MIGs @ 12GB each |
| Form factor | SXM | PCIe, dual-slot air-cooled | 2x PCIe, dual-slot air-cooled |
| Interconnect | NVLink: 900GB/s; PCIe Gen5: 128GB/s | NVLink: 600GB/s; PCIe Gen5: 128GB/s | NVLink: 600GB/s; PCIe Gen5: 128GB/s |
| Server options | NVIDIA HGX H100 Partner and NVIDIA-Certified Systems with 4 or 8 GPUs; NVIDIA DGX H100 with 8 GPUs | Partner and NVIDIA-Certified Systems with 1–8 GPUs | Partner and NVIDIA-Certified Systems with 2–4 pairs |
| NVIDIA AI Enterprise | Add-on | Included | Add-on |

¹ Preliminary specifications; subject to change.
² With sparsity.
³ Aggregate bandwidth across both GPUs.


Nvidia H100 Price – It’s costly – Believe it

The Nvidia H100 has been quite elusive since the very start. You can rent it from various cloud GPU platforms like Azure, Google Cloud, AWS, Vultr, Lambda Labs, etc.

The Nvidia H100 80GB model sells for a whopping $30,000 or INR 2,461,419 (24+ lakh).

The Nvidia H100 NVL 188GB model could cost about $60,000 or INR 4,922,838 (49+ lakh), as it is essentially two H100s bridged via NVLink.

An Nvidia DGX H100 server containing 8 GPUs is said to cost about $520,000 or INR 42,663,660 (4.26+ crore) with 5 years of support.

(Image: Nvidia's AI chips priced over $40,000 on eBay amid surging AI demand – Gizmochina)

Nvidia H100 Benchmarks – As released by MLCommons

MLCommons released the benchmarks of the Nvidia H100, and they are staggering. Nvidia's new GPUs for machine learning and AI dominated every aspect of the tests.

The LLM and BERT natural language processing benchmarks were run on a system co-developed by Nvidia & Inflection AI.

CoreWeave hosted the entire system of GPUs along with related hardware.

Graphic showing NVIDIA's results across a range of MLPerf 3.0 tests.

The LLM benchmark was based on OpenAI's GPT-3, a model with 175 billion parameters.

Lambda Labs has estimated that training such a huge LLM requires about 3.14E23 FLOPs of compute. It is an expensive, time-consuming task whose duration depends on how many GPUs are connected and on the efficiency of each individual GPU.
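That 3.14E23 figure can be sanity-checked with the widely used back-of-the-envelope rule that training compute is roughly 6 × parameters × tokens. A minimal sketch — the 300-billion-token count below is an assumption taken from the GPT-3 paper, not from this article:

```python
# Rough estimate of GPT-3 training compute via C ≈ 6 * N * D,
# where N = parameter count and D = training tokens.

params = 175e9   # GPT-3 parameters (175B)
tokens = 300e9   # training tokens (assumption, per the GPT-3 paper)

total_flops = 6 * params * tokens
print(f"Estimated training compute: {total_flops:.2e} FLOPs")
```

This lands at about 3.15E23 FLOPs, in line with the Lambda Labs figure.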

In comparison to the other GPUs included in the tests (none from AMD), the H100 set some amazing records. The Nvidia H100 Tensor Core GPU yielded a per-accelerator LLM training time of 548 hours (~23 days).

Chart showing NVIDIA H100 MLPerf Results across benchmarks.

Now, Nvidia H100s are not consumer cards; they are built for enterprise-level training & inference. So it is safe to assume that H100s will be used in clusters rather than as single GPUs powering training & inference.

To test how the H100 performs in a cluster setup, Nvidia & Inflection AI co-developed a GPU cluster built on Nvidia H100 Tensor Core GPUs. It was again hosted & tested by CoreWeave.

Nvidia H100 Tensor Core GPU Cluster

The cluster had 3,584 Nvidia H100 accelerators paired with 896 4th-gen Intel Xeon Platinum 8462Y+ processors (marketed at $5,645 or INR 487,730 each). That's a really hefty combination for maximum workloads.

On this cluster, the GPT-3 175B LLM benchmark completed in just 11 minutes; in comparison, another, Intel-based cluster took about 311 minutes.
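The 548-hour per-accelerator figure and the 11-minute cluster result line up surprisingly well. A quick check using the numbers quoted above — note that "efficiency" here is a naive linear-scaling comparison, not an official MLPerf metric:

```python
# Compare ideal linear scaling of 548 GPU-hours across 3,584 GPUs
# against the measured 11-minute cluster run.

per_gpu_hours = 548        # per-accelerator training time
cluster_gpus = 3584        # H100s in the Nvidia/Inflection AI cluster
measured_minutes = 11      # reported cluster result

ideal_minutes = per_gpu_hours * 60 / cluster_gpus   # ~9.2 min with perfect scaling
efficiency = ideal_minutes / measured_minutes       # ~0.83

speedup_vs_intel = 311 / measured_minutes           # vs the 311-minute Intel result
print(f"Ideal time: {ideal_minutes:.1f} min, scaling efficiency: {efficiency:.0%}")
print(f"Speedup over the Intel cluster: {speedup_vs_intel:.1f}x")
```

Roughly 83% scaling efficiency at 3,584 GPUs, and about a 28x speedup over the Intel cluster.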

Intel’s hardware included 64-96 Intel Xeon Platinum 8380 processors with 256-389 Intel Habana Gaudi2 accelerators.

Graphic showing NVIDIA H100 results across workloads.

It truly puts things in perspective for the companies that are going to deploy the Nvidia H100 Tensor Core GPU in their machine learning / AI environments.

Nvidia H100 Cloud GPU – Cost & Companies

Unless you already have a profitable AI company or have good funding to support your growing AI business, upgrading to H100 is going to be quite costly.

As mentioned earlier in this article, Nvidia H100s cost about $30,000–$50,000, and with several big players trying to snap them up as soon as possible, prices are likely to inflate further.

You will always have a cloud solution for Nvidia H100 Tensor Core GPU.

CoreWeave will rent you H100 GPUs for $2.23/hour ($1,605.60 per month, or INR 131,724, if run 24/7).
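The rent-vs-buy arithmetic is worth spelling out. A minimal sketch using the CoreWeave rate above and the article's ~$30,000 purchase estimate — real total cost of ownership (power, cooling, hosting, resale value) would shift the break-even point:

```python
# Rent-vs-buy break-even for a single H100 at the quoted cloud rate.

hourly_rate = 2.23        # USD/hour (CoreWeave, as quoted above)
purchase_price = 30_000   # USD, approximate H100 80GB street price

monthly_cost = hourly_rate * 24 * 30            # 24/7 rental for a 30-day month
break_even_hours = purchase_price / hourly_rate
break_even_months = break_even_hours / (24 * 30)

print(f"Monthly (24/7): ${monthly_cost:,.2f}")
print(f"Break-even vs buying: {break_even_hours:,.0f} hours (~{break_even_months:.0f} months)")
```

In other words, renting only overtakes the sticker price after about a year and a half of continuous use, which is why cloud H100s make sense for bursty workloads.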

Vultr currently displays "Contact Sales" for its H100 GPU offerings, so hourly rates were unavailable at the time of writing.

I've created a separate blog post with information only about Nvidia H100 cloud GPU providers; it is updated as soon as new cloud GPU providers are found. Check out the link for H100 Cloud GPU providers.

What’s Next!

I am waiting for MLPerf's benchmarks of AMD's machine learning GPUs, namely the AMD MI300X with its 192 GB of HBM memory for LLMs.

Till then, you can check out the Top Open Source LLMs currently available on HuggingFace.

Thanks for reading.
