Microsoft has added Shader Execution Reordering (SER) to the DirectX Agility SDK, standardizing a feature that reorganizes ray-tracing workloads to improve GPU efficiency. This update, part of DXR 1.2, also includes Opacity Micromaps (OMMs), which help by skipping shader work on transparent surfaces.
In performance demonstrations, SER provided significant uplifts, with Intel Arc B-series GPUs showing up to a 90% increase in frames per second. The standardization paves the way for broader hardware-level implementation by Intel and AMD in future GPUs.
The main topics covered are the technical introduction of SER and OMMs, the resulting performance improvements for GPUs, and the implications for future hardware and game development.
Microsoft adds Shader Execution Reordering (SER) in latest DirectX SDK for more efficient ray tracing — Intel Arc B-series GPUs show 90% performance uplift
Ray tracing just got a little smarter.
In 2022, Nvidia introduced hardware-level Shader Execution Reordering (SER) with its RTX 40-series GPUs in order to make ray tracing less taxing. Now, it's officially part of DXR 1.2, which is included in the new DirectX Agility SDK (version 1.619). The announcement blog isn't a casual read because of all the technical jargon, so let's break down what this actually means and how it improves performance.
SER reduces per-pixel rendering time in heavily ray-traced or path-traced scenes. Unpredictability is a GPU's worst nightmare: when rays bounce off surfaces in an incoherent way, the scene looks good, but neighboring threads end up running different shaders, and that cripples the silicon. SER slots in and dynamically regroups the work after each bounce so that threads doing similar things run together.
It allows the GPU to find patterns across rays and group them for better parallel execution. SER works hand-in-hand with Opacity Micromaps (OMMs), the other highlight feature of DXR 1.2, which save processing power by recording ahead of time which parts of a surface are fully opaque or fully transparent, so the GPU does not have to run a shader just to find out.
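The grouping idea can be illustrated with a toy model. This is only a sketch under loud assumptions: it is plain Python, not DXR or HLSL code, the wave size of 4 is chosen for readability, and the function names `shader_passes` and `reorder_by_shader` are made up for the illustration. It assumes a wave has to run each distinct shader it contains one after another, which is the cost that reordering tries to reduce.

```python
# Toy model of shader divergence; not real GPU or DXR code.
WAVE_SIZE = 4  # assumed lane count for readability; real hardware uses wider waves


def shader_passes(shader_ids, wave_size=WAVE_SIZE):
    """Count shader passes if each wave runs every distinct shader it contains once."""
    passes = 0
    for start in range(0, len(shader_ids), wave_size):
        wave = shader_ids[start:start + wave_size]
        passes += len(set(wave))  # divergent lanes are serialized per shader
    return passes


def reorder_by_shader(shader_ids):
    """Stand-in for SER: sort hits so lanes needing the same shader sit together."""
    return sorted(shader_ids)


# Hypothetical hit results for 12 rays, in the order the rays were launched.
hits = ["glass", "metal", "skin", "metal",
        "skin", "glass", "metal", "glass",
        "skin", "skin", "glass", "metal"]

unsorted_passes = shader_passes(hits)
sorted_passes = shader_passes(reorder_by_shader(hits))
print(unsorted_passes, sorted_passes)
```

In this made-up scene the launch order needs nine shader passes and the reordered version needs three. Real hardware sorts on richer keys than a material name, so the numbers only show the direction of the effect.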
Your graphics card will only shade the visible pixels as the Opacity Micromaps will give it precise hints on what part of the scene needs to be opaque (and what doesn't). So, SER begins by grouping similar ray-traced shaders together, and then the OMMs let it skip the "invisible" ones entirely. Reducing unnecessary shader work simply allows you to maintain more FPS in games, especially in complex scenes.
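A rough sketch of how those hints save work, again as a toy Python model rather than real DXR code. The three-state `MicroState` enum and the `count_shader_runs` helper are simplifications invented for this example; the assumption is that without a micromap every hit on alpha-tested geometry has to run a shader to decide whether the ray passes through, while with a micromap only the undecided regions do.

```python
from enum import Enum


class MicroState(Enum):
    """Simplified opacity states for one small region of a triangle (assumed)."""
    OPAQUE = "opaque"            # ray always stops here
    TRANSPARENT = "transparent"  # ray always passes through
    UNKNOWN = "unknown"          # a shader must sample the texture to decide


def count_shader_runs(states, use_omm):
    """Count how many hits still need a shader invocation to resolve opacity."""
    runs = 0
    for state in states:
        if use_omm and state is not MicroState.UNKNOWN:
            continue  # the micromap already answers the question
        runs += 1
    return runs


# Hypothetical hits along the edge of a leaf texture.
leaf_hits = [MicroState.OPAQUE, MicroState.TRANSPARENT, MicroState.UNKNOWN,
             MicroState.OPAQUE, MicroState.TRANSPARENT, MicroState.TRANSPARENT]

print(count_shader_runs(leaf_hits, use_omm=False))
print(count_shader_runs(leaf_hits, use_omm=True))
```

For these six made-up hits, the shader runs six times without the micromap and once with it. Only the region straddling the leaf's edge still needs a texture lookup.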
In a companion blog post, Microsoft shows its own demo for SER, where a scene is rendered with and without it. Using SER, Nvidia GPUs saw a 40% boost in performance while some Intel Arc B-series GPUs got up to 90% more FPS. Now that the feature is being standardized, we could see Intel and AMD implement their own hardware-level SER in next-gen GPUs.
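To put those percentages in frame-time terms, here is a small worked calculation. The 30 FPS baseline is an assumption picked for illustration, not a figure from Microsoft's demo, and the helper names are hypothetical. Only the 40% and 90% uplifts come from the article.

```python
def fps_after_uplift(base_fps, uplift_percent):
    """Apply a percentage FPS uplift to a baseline frame rate."""
    return base_fps * (1.0 + uplift_percent / 100.0)


def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


BASE_FPS = 30.0  # assumed baseline, not taken from the demo

for label, uplift in [("no SER", 0), ("+40%", 40), ("+90%", 90)]:
    fps = fps_after_uplift(BASE_FPS, uplift)
    print(f"{label}: {fps:.1f} FPS, {frame_time_ms(fps):.2f} ms per frame")
```

A 90% FPS gain does not cut frame time by 90%. It divides frame time by 1.9, which is a reduction of roughly 47%, and a 40% gain divides it by 1.4, a reduction of roughly 29%.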
The last noteworthy inclusion in this SDK update was Shader Model 6.9, which is what actually enables developers to interface with both OMMs and SER. This will make game developers very happy, but it's ultimately up to them to implement these features before a player-facing upgrade is ever seen. To be clear, these features were announced last year but just came out of preview today.
There are a lot more details in the blog that we didn't go over, such as support for Long Vector, 16-bit float operations, and general changes to streamline hardware overhead. Some of them target the poorly optimized games we see today, struggling with anything less than 12 GB of VRAM. It's all early, programmer-focused patchwork for now, but it can translate to real-world improvements soon.
Hassam Nasir is a die-hard hardware enthusiast with years of experience as a tech editor and writer, focusing on detailed CPU comparisons and general hardware news. When he’s not working, you’ll find him bending tubes for his ever-evolving custom water-loop gaming rig or benchmarking the latest CPUs and GPUs just for fun.
-
Gururu
I don't get it. Why is the Arc mentioned in the title when the article says it won't be implemented until next gen GPUs?
-
dmitche31958
Thank you for the summarization from all of us ignorant people whom without wouldn’t understand much if any of this. :).
-
TerryLaze
Gururu said: "I don't get it. Why is the Arc mentioned in the title when the article says it won't be implemented until next gen GPUs?"
Because it is already boosting performance without being implemented in hardware, so why shouldn't they mention it?!
Also they mentioned nvidia as well, I'm just guessing but nvidia is probably getting lower improvement because they are already much faster to begin with.
From the article: "Using SER, Nvidia GPUs saw a 40% boost in performance while some Intel Arc B-series GPUs got up to 90% more FPS. This feature, now being standardized, means we can potentially see Intel and AMD implement their own hardware-level SER in next-gen GPUs."
-
Gururu
TerryLaze said: "Because it is already boosting performance without being implemented in hardware, so why shouldn't they mention it?! Also they mentioned nvidia as well, I'm just guessing but nvidia is probably getting lower improvement because they are already much faster to begin with."
Oh wait a minute, so it is working with our games already? That'd be amazing, but I don't know how to verify.
-
TerryLaze
Gururu said: "Oh wait a minute, so it is working with our games already? That'd be amazing, but I don't know how to verify."
Maybe they just tested with a test scene/benchmark and it's not running in games yet, but the point is they did test it and had results.
-
edzieba
Gururu said: "Oh wait a minute, so it is working with our games already? That'd be amazing, but I don't know how to verify."
- It won't work in games until it is implemented.
- Some games may have already implemented it via Nvidia's 'RTX' API
- The same function is now available via the vendor-agnostic DXR API
- SER has been in 'preview' for quite some time
- Nothing was stopping other vendors implementing this in hardware before the API was available in a finalised form (and the only change from the preview to final version was the function to return if a given GPU implemented SER, which is a driver level feature rather than a hardware one)
- With the speedup that Arc sees, it is quite likely Intel did exactly that
AMD dragged their heels for a long time on adding matrix FMA acceleration to their GPUs, and to adding raytracing hardware acceleration to their GPUs. SER is a required part of Shader Model 6.9, but there is nothing forcing AMD to implement Shader Model 6.9 or D3D12_RAYTRACING_TIER_1_2. Note that SER 'support' can technically just be accepting the API calls and doing nothing different, so AMD could also 'implement' SER without any performance improvement just to be able to slap "Shader Model 6.9" support on the box.
Whether an actual functional implementation crops up in RDNA5 or not depends on how forward looking AMD were when drafting the chip spec.
-
wakuwaku
Gururu said: "I don't get it. Why is the Arc mentioned in the title when the article says it won't be implemented until next gen GPUs?"
How about reading the source? While it is technical, the part that shows a hardware compatibility table isn't that technical:
https://devblogs.microsoft.com/directx/shader-model-6-9-retail-and-more/#appendix
It's clearly mentioned in the table that ARC B supports SER.
TerryLaze said: "Because it is already boosting performance without being implemented in hardware, so why shouldn't they mention it?! Also they mentioned nvidia as well, I'm just guessing but nvidia is probably getting lower improvement because they are already much faster to begin with."
As mentioned above, the table clearly shows what supports SER and the ARC B does. It's not a perf boost without hardware support.
Gururu said: "Oh wait a minute, so it is working with our games already? That'd be amazing, but I don't know how to verify."
The article clearly says that Microsoft uses its own Demo to show the perf difference. There wasn't a single mention about being implemented in games at all.....
-
DS426
Great, though MS is late to the party; Vulkan released their VK_EXT_ray_tracing_invocation_reorder extensions late last year. Phoronix has an article on Vulkan SER, though it's rather short on details; for example, I don't see where they mentioned what GPU had 47% fps gains with the testing that was done. Hopefully they and others (*cough* Toms! *cough*) will provide a deeper dive on this soon.
-
palladin9479
Gururu said: "Oh wait a minute, so it is working with our games already? That'd be amazing, but I don't know how to verify."
They mentioned multiple things so it's confusing. The new memory optimizations are vendors agnostic and handled at the platform / framework level. The object mapping features require driver support.