NVIDIA today announced a trio of upcoming GeForce RTX 3000 series cards based on the company’s Ampere architecture. As we have seen on the data center side, Ampere is a huge generational leap over the Pascal generation (A100 v. P100 and RTX 3000 v. GTX 1000). What we found going through the announcement is that the new cards will have a significant impact on those looking to build denser GPU solutions. NVIDIA even had a hidden gem in its presentation in RTX IO, which looks a lot like GPUDirect Storage-type technology for the desktop.
NVIDIA Ampere for 2nd Generation GeForce RTX
The new NVIDIA Ampere generation GeForce RTX GPUs are based on a Samsung 8nm process. We discussed some of the big architectural bits in our data center focused piece on the NVIDIA A100. The desktop GPUs are not optimized for features such as double-precision floating-point, but they offer a generational leap over NVIDIA RTX Turing. Not on this slide, but NVIDIA is supporting PCIe Gen4 with this generation of GPUs as well.
NVIDIA also says that the net impact of the new architecture is an improvement of between 1.7x and 2.7x over Turing across classic shaders, RT cores, and Tensor Cores. For most of the workstation market, the important feature here is that the Tensor Cores and shaders are on the higher end of that spectrum with a claimed 2.7x improvement.
NVIDIA also needed to get faster memory to feed the GPUs without using HBM. As a result, it is using PAM4-signaled GDDR6X (G6X) memory instead of the GDDR6 memory found on Turing. For those who like to see signaling eye charts, that is how NVIDIA is showing PAM4 with four signal levels per symbol versus NRZ with two levels per symbol. We commonly see these charts today when discussing networking and transceivers on components such as FPGAs and switch chips.
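To make the PAM4 versus NRZ point concrete, here is a minimal sketch of why four signal levels carry twice the data per symbol. The function names and the bit-pair-to-level mapping (a Gray code) are illustrative assumptions for this example, not NVIDIA's or the memory vendor's actual encoding.

```python
import math


def bits_per_symbol(levels: int) -> int:
    """Bits carried by one symbol that can take `levels` distinct values."""
    return int(math.log2(levels))


def nrz_encode(bits):
    """NRZ: one bit per symbol, so the symbol stream is the bit stream."""
    return list(bits)


def pam4_encode(bits):
    """PAM4: each pair of bits selects one of four amplitude levels (0-3).

    The Gray-coded mapping below is an assumption chosen for illustration.
    """
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    level_for_pair = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}
    return [level_for_pair[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


if __name__ == "__main__":
    payload = [1, 0, 0, 1, 1, 1, 0, 0]
    print("NRZ bits/symbol: ", bits_per_symbol(2))
    print("PAM4 bits/symbol:", bits_per_symbol(4))
    print("NRZ symbols: ", nrz_encode(payload))
    print("PAM4 symbols:", pam4_encode(payload))
```

The same eight bits need eight NRZ symbols but only four PAM4 symbols, which is how the signaling rate per pin can stay the same while the data rate doubles. The trade-off, visible in the eye charts, is that the four levels sit closer together and leave less margin for noise.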
The other important detail is that these GPUs are likely going to use more power. NVIDIA needed to significantly update the cooling on these in order to cool the GPUs in a standard case. This is similar to how we saw Ampere on the data center side move up to a 400W TDP from 350W on Volta.
We are going to take a look at the trio of cards next, but the key impact is simple: space. Multiple GPU setups are going to be much less useful with enormous triple-slot coolers. Likewise, the big GeForce RTX 3090 will not fit into many existing 4U servers for those who want to rackmount their GeForce workstations.
NVIDIA GeForce RTX 3090
The NVIDIA GeForce RTX 3090 is absolutely huge, going well beyond what we have seen from traditional GPU designs. NVIDIA is touting its new cooler design. Make no mistake, if you are building a workstation for Ampere today, get something big.
NVIDIA is offering a Titan RTX level of 24GB of memory on this large card, along with 36 shader TFLOPS, 69 RT TFLOPS (for ray tracing), and 285 Tensor Core TFLOPS for AI.
Big GPU is going to find its way to a lot of STH readers’ machines in the near future. We expect many are going to look for alternative cooling to get them into multiple GPU solutions as well. The expected availability is September 24, 2020 at $1,499.
NVIDIA GeForce RTX 3080
We are covering these out of order compared to how NVIDIA presented them. NVIDIA claims the GeForce RTX 3080 offers twice the performance of the RTX 2080 (the first-generation card, not the NVIDIA GeForce RTX 2080 Super we also reviewed).
NVIDIA puts the GeForce RTX 3080 at $699, available on September 17, 2020. This GPU is set to have 10GB of G6X memory, 30 shader TFLOPS, 58 RT TFLOPS, and 238 Tensor TFLOPS.
NVIDIA says the cooler on the RTX 3080 is 3x quieter and more efficient than the RTX 2080 design, so for about the price of an RTX 2080, one should get a faster and quieter GPU.
NVIDIA GeForce RTX 3070
The GeForce RTX 3070 scales this down a bit to 20 shader TFLOPS, 40 RT TFLOPS, and 163 Tensor TFLOPS at $499.
We also get a more traditional cooler design and 8GB of G6 (GDDR6) memory. That is notably not “G6X” memory, so we would expect lower memory bandwidth as well.
NVIDIA is still showing a large generational improvement over the previous generation. The company said this would be available in October, but without a specific date.
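With all three cards covered, the claimed figures quoted above can be lined up against the list prices. The sketch below is a rough illustration only: the numbers are NVIDIA's announced claims as reported in this article, and "TFLOPS per $1,000" is a hypothetical metric picked for this example, not a benchmark result or a measure of real-world performance.

```python
# NVIDIA's claimed figures and launch prices as quoted in this article.
cards = {
    "RTX 3090": {"price": 1499, "shader": 36, "rt": 69, "tensor": 285, "mem_gb": 24},
    "RTX 3080": {"price": 699, "shader": 30, "rt": 58, "tensor": 238, "mem_gb": 10},
    "RTX 3070": {"price": 499, "shader": 20, "rt": 40, "tensor": 163, "mem_gb": 8},
}


def per_1000_dollars(card: dict, metric: str) -> float:
    """Claimed TFLOPS (or GB for mem_gb) per $1,000 of list price."""
    return card[metric] / card["price"] * 1000


if __name__ == "__main__":
    for name, card in cards.items():
        print(
            f"{name}: "
            f"{per_1000_dollars(card, 'shader'):.1f} shader, "
            f"{per_1000_dollars(card, 'rt'):.1f} RT, "
            f"{per_1000_dollars(card, 'tensor'):.1f} Tensor TFLOPS per $1,000; "
            f"{card['mem_gb']}GB memory"
        )
```

On these claimed numbers, the RTX 3080 and RTX 3070 land close together on compute per dollar, while the RTX 3090 carries a premium for its 24GB of memory.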
NVIDIA RTX IO the Hidden Gem
Perhaps the hidden gem in this announcement is NVIDIA RTX IO.
Effectively, as games become bigger, NVIDIA is thinking about how it moves data from storage to GPU memory. The diagram NVIDIA used was very telling. Storage did not sit directly on the PCIe bus like a Gen4 SSD. Instead, NVIDIA is showing what almost looks like a GPUDirect Storage solution brought to the workstation space.
What is interesting is that NVIDIA is specifically showing the flow through a NIC in its diagram, which is especially interesting given the NVIDIA-Mellanox acquisition. For those wondering, NVIDIA has a GPUDirect Storage diagram without going through the NIC as well.
We wonder if this is a foreshadowing of NVIDIA moving into the high-performance networking segment beyond just the traditional data center. NVIDIA did not say “GPUDirect Storage” and it was focused on compression offload, but the flow is very similar to what is happening on the data center side.
This is a big competitive challenge for AMD. While AMD may be implementing hardware ray tracing, and exaflop supercomputers such as El Capitan have spurred efforts to bolster AMD GPU compute ecosystems, NVIDIA is a step (or more) ahead here. NVIDIA is using its Selene supercomputer to train models that are then being used by its AI-infused products even down to rendering on the desktop. Likewise, as Intel moves ahead with Intel Xe GPUs, it is focusing on capabilities while NVIDIA is showing hardware capabilities implemented and backed by trained models that are used in its products. That may seem like a small difference, but NVIDIA is pulling ahead on the AI implementation. What we are seeing with features such as RTX IO is that NVIDIA is looking to make CPUs from Intel and AMD even less relevant by moving data flows away from them.
Many STH readers (and the team at STH) are rightfully going to be excited about new hardware, especially if there is a big bump in generational performance. We will have our normal workstation-focused (non-gaming) reviews with this generation as well. Stay tuned as these new NVIDIA RTX 3000 series cards arrive.