The Blackwell RTX 50-series GPUs mark the end of more than two long years of waiting since the RTX 40-series Ada Lovelace GPUs launched in late 2022. Nvidia announced its upcoming GeForce RTX 50-series cards during the CES 2025 keynote, providing the specs, pricing, and even a preview of performance. Big claims were made, with new technologies like DLSS 4 playing a major role in those claims. As the new halo part, the RTX 5090 takes over from the RTX 4090, boasting more memory, more compute, more features, and more power. It isn’t yet available, but there’s a lot going on that’s worth dissecting before cards go on sale.
Will the RTX 5090 be one of the best graphics cards when it arrives? If by “best” you mean “fastest,” then yes, there’s little doubt it will surpass its predecessor. Will it be twice as fast? Depending on how you want to measure performance, maybe, but that’s putting a lot of trust in AI techniques that aren’t the same as traditional rendering. Let’s dig into the specs and features that we know about to discuss how the old and new kings of the GPU world stack up.
Graphics Card | RTX 5090 | RTX 4090 |
---|---|---|
Architecture | GB202 | AD102 |
Process Node | TSMC 4NP | TSMC 4N |
Transistors (Billion) | 92 | 76.3 |
Die size (mm^2) | 744 | 608.4 |
SMs | 170 | 128 |
GPU Shaders | 21760 | 16384 |
Tensor Cores | 680 | 512 |
RT Cores | 170 | 128 |
Boost Clock (MHz) | 2407 | 2520 |
VRAM Speed (Gbps) | 28 | 21 |
VRAM (GB) | 32 | 24 |
VRAM Bus Width (bits) | 512 | 384 |
L2 Cache (MB) | 128? | 72 |
Render Output Units | 240? | 176 |
Texture Mapping Units | 680 | 512 |
TFLOPS FP32 (Boost) | 104.8 | 82.6 |
TFLOPS FP16 (INT8 TOPS) | 1676? (3352) | 661 (1321) |
Bandwidth (GB/s) | 1792 | 1008 |
TBP (watts) | 575 | 450 |
Launch Date | Jan 2025 | Oct 2022 |
Launch Price | $1,999 | $1,599 |
Let’s talk raw specs first. The RTX 5090 has 170 Blackwell Streaming Multiprocessors (SMs), compared to 128 SMs on the 4090. That’s a 33% increase in GPU cores, and because the number of CUDA cores, tensor cores, RT cores, texture units, etc. is directly tied to the SM count, that’s basically a 33% increase overall.
Clock speeds also play a role, however: the 4090 has a 2520 MHz boost clock, while the 5090 (based on calculations and Nvidia’s official specs) has a 2407 MHz boost clock. That means for raw compute, the 5090 ‘only’ offers a 27% improvement over the 4090. However, that’s assuming no other architectural differences exist, which almost certainly isn’t a good assumption.
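For readers who want to check the raw-compute math, here is a minimal Python sketch. It uses the SM counts and boost clocks from the table above and the common peak-throughput estimate of two FP32 operations (one fused multiply-add) per shader per clock. The 128-shaders-per-SM constant is an assumption derived from the table (21760 / 170 and 16384 / 128), and the function name is purely illustrative.

```python
# Figures come from the spec table above. The formula is the usual
# peak-throughput estimate, not an official Nvidia calculation.
SHADERS_PER_SM = 128  # assumption: 21760 / 170 and 16384 / 128 both equal 128


def fp32_tflops(sm_count: int, boost_mhz: float) -> float:
    """Peak FP32 TFLOPS: shaders x 2 FLOPs per clock (one FMA) x clock speed."""
    shaders = sm_count * SHADERS_PER_SM
    return shaders * 2 * boost_mhz * 1e6 / 1e12


rtx_5090 = fp32_tflops(170, 2407)
rtx_4090 = fp32_tflops(128, 2520)
uplift_pct = (rtx_5090 / rtx_4090 - 1) * 100

print(f"RTX 5090: {rtx_5090:.1f} TFLOPS")
print(f"RTX 4090: {rtx_4090:.1f} TFLOPS")
print(f"Uplift:   {uplift_pct:.0f}%")
```

The 33% advantage in SM count shrinks to roughly 27% because the 5090's boost clock is about 4.5% lower.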
Memory capacity, speed, and bandwidth are all higher with the RTX 5090, thanks to GDDR7 as well as a bigger, beefier chip. The RTX 5090 has 33% more VRAM than the 4090 on a 33% wider bus (512-bit versus 384-bit), clocked 33% higher, for a net 78% improvement in raw bandwidth. We don’t know the L2 cache size or if there are any other changes that could impact bandwidth, and both of those are important considerations. Still, that’s a big increase in raw memory bandwidth.
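The bandwidth figures follow directly from bus width and per-pin data rate. A short Python sketch, using only values from the table above (the function name is illustrative), shows where the 78% comes from:

```python
def bandwidth_gb_s(bus_width_bits: int, speed_gbps: float) -> float:
    """Raw memory bandwidth in GB/s: (bus width in bytes) x per-pin rate in Gbps."""
    return bus_width_bits / 8 * speed_gbps


bw_5090 = bandwidth_gb_s(512, 28)  # GDDR7 on a 512-bit bus
bw_4090 = bandwidth_gb_s(384, 21)  # GDDR6X on a 384-bit bus
bw_uplift_pct = (bw_5090 / bw_4090 - 1) * 100

print(f"RTX 5090: {bw_5090:.0f} GB/s")
print(f"RTX 4090: {bw_4090:.0f} GB/s")
print(f"Uplift:   {bw_uplift_pct:.0f}%")
```

Two separate 33% gains (bus width and memory speed) multiply together: 1.333 x 1.333 is about 1.78.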
Nvidia is betting big on AI with the RTX 50-series, and that’s where we see some of the biggest changes. The RTX 4090 has 661 TFLOPS of FP16 tensor compute (with sparsity), and 1321 TOPS (teraops) of INT8 tensor compute (again with sparsity). That’s far more than AMD’s RX 7900 XTX, which only offers 123 TFLOPS / TOPS of FP16 / INT8 compute (without sparsity). But it still pales in comparison to the RTX 5090.
We’re unsure about the FP16 figure, but assuming Nvidia follows the same ratios as the prior generation, the RTX 5090 will deliver up to 1676 TFLOPS of tensor FP16 compute, and double that for 3352 TOPS of tensor INT8 compute (both with sparsity). That’s a 154% increase (2.54X) in AI computational performance with the new generation. And Nvidia intends to put the AI potential to good use.
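As a quick sanity check on the generational AI uplift, the following Python sketch compares the tensor figures from the table. Keep in mind that the 5090's FP16 number is itself an estimate, so the result is only as good as that assumption.

```python
def pct_increase(new: float, old: float) -> float:
    """Percentage increase going from old to new."""
    return (new / old - 1) * 100


# Tensor throughput with sparsity, taken from the table above.
# The RTX 5090 FP16 value (1676) is an estimate, not a confirmed spec.
fp16_ratio = 1676 / 661
fp16_gain = pct_increase(1676, 661)
int8_gain = pct_increase(3352, 1321)

print(f"FP16: {fp16_ratio:.2f}X ({fp16_gain:.0f}% increase)")
print(f"INT8: {int8_gain:.0f}% increase")
```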
As we’ve discussed elsewhere, Nvidia DLSS 4 will leverage the new features in Blackwell to power its AI algorithms. Multi frame generation will “predict the future” and generate up to three additional frames from one rendered (and potentially upscaled) frame. Because it’s using frame projection rather than interpolation, the latency penalty shouldn’t be all that different from what we’ve seen already with DLSS 3 frame generation, but the additional frames will make everything look smoother.
How does that actually feel? We haven’t had a chance to test it ourselves, so we’ll withhold any final judgment, but we’re pretty skeptical. It will probably work decently, but one rendered frame based on user input followed by three AI-generated frames with no new user input won’t have the same feel as a game where every frame takes any new user input and gets fully rendered.
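To make the smoothness-versus-responsiveness gap concrete, here is a deliberately simplified Python model. It assumes generated frames are free (no overhead) and that user input is only sampled on rendered frames. The 60 fps base rate is a hypothetical example, not a measured result, and the function name is illustrative.

```python
def frame_generation_pacing(rendered_fps: float, generated_per_rendered: int):
    """Return (displayed fps, ms between displayed frames, ms between input samples).

    Simplified model: ignores frame generation overhead and any added latency.
    """
    displayed_fps = rendered_fps * (1 + generated_per_rendered)
    frame_interval_ms = 1000 / displayed_fps
    input_interval_ms = 1000 / rendered_fps  # input only affects rendered frames
    return displayed_fps, frame_interval_ms, input_interval_ms


# Hypothetical example: 60 rendered fps with three generated frames each.
displayed, frame_ms, input_ms = frame_generation_pacing(60, 3)
print(f"Displayed: {displayed:.0f} fps, a new frame every {frame_ms:.1f} ms")
print(f"Input sampled every {input_ms:.1f} ms")
```

Under these assumptions the screen updates four times as often, but the game still responds to input only at the rendered rate.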
There are other changes coming as well, however, some exclusive to the Blackwell RTX 5090 and others that will work with older RTX cards. RTX Neural Materials looks to use AI compression and learning to reduce memory requirements for the textures and material descriptions used in games by about a third. However, the hardware pipeline needs to be able to use AI alongside the shaders for this to work, so it will be another 50-series exclusive.
DLSS Transformer upscaling, on the other hand, uses a newly trained network built off of AI transformers, rather than the convolutional neural network (CNN) used with previous DLSS upscaling algorithms. Transformers have been at the heart of the AI revolution, powering things like ChatGPT, DALL-E, and other AI content generators. The sample videos Nvidia has shown of old versus new DLSS upscaling look very impressive, and we’re eager to try it out in person. What’s more, the new DLSS Transformer algorithm apparently runs faster than the older CNN version, and it will be available for all RTX GPUs.
Nvidia’s own performance preview, where it suggests the RTX 5090 can be up to twice as fast as the RTX 4090, also shows a couple of games where there’s no DLSS 4, or even DLSS 3 in one instance, to muddy the waters. Looking at the Far Cry 6 results, it appears the 5090 will offer about 27% more performance than the 4090 in games where the new AI features aren’t part of the equation. In A Plague Tale: Requiem, the gap increases to about 43% (yes, I’m counting pixels!). In the games that use DLSS 4 MFG (versus DLSS 3 FG), meanwhile, Nvidia shows a 2.3X–2.45X improvement.
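One rough way to read the MFG numbers is to back out how much of the gain could come from extra generated frames alone. The Python sketch below is a naive model, not Nvidia's methodology: it assumes the 5090 results use MFG with three generated frames per rendered frame (four displayed), the 4090 results use DLSS 3 FG with one generated frame per rendered frame (two displayed), and frame generation adds no overhead. The function name is illustrative.

```python
def implied_rendered_uplift(displayed_uplift: float,
                            new_frames_per_render: int = 4,
                            old_frames_per_render: int = 2) -> float:
    """Rendered-frame uplift implied by a displayed-frame uplift.

    Naive model: assumes 4 displayed frames per rendered frame with MFG,
    2 with DLSS 3 FG, and zero frame generation overhead.
    """
    return displayed_uplift * old_frames_per_render / new_frames_per_render


low = implied_rendered_uplift(2.3)
high = implied_rendered_uplift(2.45)
print(f"Implied rendered-frame uplift: {low:.2f}X to {high:.2f}X")
```

Under those assumptions, much of the headline gain comes from doubling the number of displayed frames per rendered frame rather than from faster rendering.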
Does that mean the RTX 5090 is or isn’t worth the higher price? We think it will largely depend on what you’re doing. There will almost certainly be a lot of people and companies interested in AI who will jump at the chance to pay $1,999 for an RTX 5090. Those same groups have been buying RTX 4090 cards for the past couple of years. In generative AI testing, the 5090 also showed a massive 2X jump in performance using Flux.dev.
But if you’re mostly playing games, and you don’t love frame generation? It’s probably not a bad idea to sit back and wait to see how things develop for a bit. Maybe DLSS 4 in actual use will look and feel great. Or maybe pulling up to 575W of power through the new 16-pin connector will result in Meltgate Part 2. But however you slice it, two grand is a lot of money to spend on a gaming GPU, and you’ll definitely want the rest of your PC to be up to the task, as powering the RTX 5090 and providing a steady stream of game updates will need a very potent PC.