As Intel readies its next-generation Xe GPU architecture for launch, the first references to the high-end variant known as Xe-HP have appeared in a document published by Intel. The Intel Xe-HP GPUs would power everything from enthusiast-class to workstation-tier graphics cards, as Intel has hinted previously.
Intel’s Xe-HP ‘High-Performance’ GPU Referenced In Latest Open Source Documents – The Third Enthusiast And Workstation GPU Entrant
We have been hearing a lot about Intel’s Xe GPUs, but those are mostly the low-power parts based on the Xe-LP architecture. The Xe-HP ‘High-Performance’ GPUs were spotted by the tech leaker Komachi (via HardwareLuxx), who found them inside one of Intel’s Open Source Platform documents containing information about Iris Plus and UHD graphics. The updated document adds a ‘new systolic pipeline addition on EU (Execution Unit) from Gen 12 HP onwards’. These are still early additions for the upcoming GPU architecture, but it looks like Intel is now moving on from Xe-LP and shifting gears to high-end graphics.
[Intel] Intel® Iris® Plus Graphics and UHD Graphics Open
Source Programmer’s Reference Manual https://t.co/s6OHQCNUgX
— 比屋定さんの戯れ言@Komachi (@KOMACHI_ENSAKA) April 6, 2020
The specific section of the document can be seen below:
Previously released test drivers mentioned that Intel’s Xe-LP architecture would power the DG1 (Discrete Graphics 1) GPUs, while Xe-HP would power the DG2 (Discrete Graphics 2) GPUs. We don’t know much about the DG2 tier of graphics cards beyond what the test drivers mentioned, which included what seemed to be execution unit counts for each part. The following variants are mentioned in the drivers:
- iDG1LPDEV = “Intel(R) UHD Graphics, Gen12 LP DG1” “gfx-driver-ci-master-2624”
- iDG2HP512 = “Intel(R) UHD Graphics, Gen12 HP DG2” “gfx-driver-ci-master-2624”
- iDG2HP256 = “Intel(R) UHD Graphics, Gen12 HP DG2” “gfx-driver-ci-master-2624”
- iDG2HP128 = “Intel(R) UHD Graphics, Gen12 HP DG2” “gfx-driver-ci-master-2624”
Three DG2 parts were mentioned, with 128, 256, and 512 EUs (these could also be taken as bus widths, but previous GPUs used this number to refer to the EU count, not bus width). Considering that DG1 with 96 EUs sits at around 2-3 TFLOPs, a 128 EU chip could end up around 4-5 TFLOPs, with 256 EUs offering around 5-10 TFLOPs and 512 EUs offering around 10-15 TFLOPs of FP32 compute, provided clocks scale really well on the higher-end GPUs.
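The naming pattern in those driver strings can be pulled apart mechanically. The following is a minimal sketch, assuming the trailing number in each DG2 identifier is the EU count (the reading favored above, not something Intel has confirmed). The `parse_variant` helper and its regular expression are purely illustrative and are not part of any Intel tool.

```python
import re

# Identifiers as listed in the test drivers.
DRIVER_IDS = ["iDG1LPDEV", "iDG2HP512", "iDG2HP256", "iDG2HP128"]

# Hypothetical pattern: "i" + "DG<n>" + "LP"/"HP" + trailing suffix.
_PATTERN = re.compile(r"^i(DG\d)(LP|HP)(\w+)$")


def parse_variant(identifier):
    """Split a driver identifier into (gpu, architecture, eu_count).

    eu_count is None when the suffix is not numeric (e.g. "DEV").
    Treating the number as an EU count is an assumption.
    """
    match = _PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized identifier: {identifier}")
    gpu, arch, suffix = match.groups()
    eu_count = int(suffix) if suffix.isdigit() else None
    return gpu, arch, eu_count


for name in DRIVER_IDS:
    print(name, parse_variant(name))
```

Under this reading, the DG1 entry carries no EU count in its name, while the three DG2 entries map to 512, 256, and 128 EUs.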
In our most recent piece, we got some more info on what the Xe-HP GPUs would look like. It looks like Intel’s GPU design would be close to NVIDIA’s, at least visually, with several EUs packed inside a single Tile. It is similar to how NVIDIA arranges several SM units within a GPC (Graphics Processing Cluster). Each Tile would consist of 512 EUs and each EU would have 8 cores. Once again, the designs of the Intel Xe-HP and Xe-LP microarchitectures would be vastly different. It is stated that 512 EUs would be a single-tile GPU and that Intel plans on offering up to 4-tile GPUs. I am not sure if that makes a whole lot of sense in the consumer space, but it can work in workstation segments.
Here are the actual EU counts of Intel’s various MCM-based Xe HP GPUs along with estimated core counts and TFLOPs:
- Xe HP (12.5) 1-Tile GPU: 512 EUs [Est: 4096 Cores, 12.2 TFLOPs assuming 1.5 GHz, 150W]
- Xe HP (12.5) 2-Tile GPU: 1024 EUs [Est: 8192 Cores, 20.48 TFLOPs assuming 1.25 GHz, 300W]
- Xe HP (12.5) 4-Tile GPU: 2048 EUs [Est: 16,384 Cores, 36 TFLOPs assuming 1.1 GHz, 400W/500W]
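The estimates in this list follow from simple arithmetic, which can be sketched as below. This is a rough illustration and not Intel's own methodology: it assumes 512 EUs per tile and 8 cores per EU as described above, plus 2 FP32 operations per core per clock (one fused multiply-add), a factor that is our assumption and is chosen because it reproduces the listed figures. The function and constant names are hypothetical.

```python
# Assumptions (not confirmed by Intel): 8 FP32 lanes ("cores") per EU,
# and 2 FLOPs per lane per clock (one fused multiply-add).
CORES_PER_EU = 8
FLOPS_PER_CORE_PER_CLOCK = 2
EUS_PER_TILE = 512


def estimate_fp32_tflops(tiles, clock_ghz):
    """Return (cores, peak FP32 TFLOPs) for a given tile count and clock."""
    cores = tiles * EUS_PER_TILE * CORES_PER_EU
    # cores * FLOPs/clock * GHz gives GFLOPs; divide by 1000 for TFLOPs.
    tflops = cores * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000
    return cores, tflops


# Tile counts and clocks taken from the estimates listed above.
for tiles, clock in [(1, 1.5), (2, 1.25), (4, 1.1)]:
    cores, tflops = estimate_fp32_tflops(tiles, clock)
    print(f"{tiles}-tile: {cores} cores, {tflops:.2f} TFLOPs at {clock} GHz")
```

With these inputs the results land close to the listed figures (about 12.3, 20.5, and 36 TFLOPs), which also shows how sensitive the totals are to the assumed clock speeds.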
A 4-tile GPU would be seriously impressive, and that may be why Raja Koduri referred to it as the ‘Father of All GPUs’ in a recent tweet.
It’s all Xe HP – the team here in @intel Bangalore celebrated crossing a significant milestone on a journey to what would easily be the largest silicon designed in india and amongst the largest anywhere. The team calls it “the baap of all” @IntelIndia pic.twitter.com/scBrFFmhtl
— Raja Koduri (@Rajaontheedge) December 5, 2019
Intel’s Xe-HP GPUs may come in multiple variants with different die sizes. This is also noteworthy because Intel’s Ponte Vecchio GPU, which is based on the Xe-HPC architecture, would use an MCM design featuring several GPU clusters surrounded by high-density HBM packages.
Intel will be manufacturing its Xe-HPC class GPUs on the latest 7nm process node. This is also the lead 7nm product that Intel has talked about previously. Intel would make full use of its new and enhanced packaging technologies, such as Foveros and EMIB interconnects, to develop the next exascale GPUs. In terms of process optimizations alone, the following are a few key improvements that Intel has announced for its 7nm process node over 10nm:
- 2x density scaling vs 10nm
- Planned intra-node optimizations
- 4x reduction in design rules
- Next-Gen Foveros & EMIB Packaging
The Ponte Vecchio GPU based on the Xe-HPC microarchitecture would scale to 1000s of EUs, so it’s very likely that a variation of either Xe-LP or Xe-HP would be used with a low-tier Tile configuration and then featured inside the Ponte Vecchio MCM package. All of this would bring a third entrant in the Enthusiast and Workstation GPU space, which is currently made up of AMD and NVIDIA GPUs.
It looks like Intel’s Xe-HP Enthusiast and Workstation graphics cards would go up against AMD’s RDNA2/CDNA2 and NVIDIA’s Ampere GPUs, which are expected to be announced later this year. Intel would unveil its Xe-LP powered products with Tiger Lake CPUs first, then move on to discrete graphics card offerings for the consumer and workstation markets, and finally enter the exascale race by 2021.