
Microsoft Releases DirectStorage 1.2 with HDD Speedups

btarunr

Editor & Senior Moderator
Staff member
Microsoft released a major update to DirectStorage, the API that promises to reduce game loading times. The new DirectStorage 1.2 adds the ability to speed up game loading for mechanical hard drives, a feature game developers requested from Microsoft. DirectStorage brings much of the storage sub-system secret sauce of consoles over to PC, and consoles have held on to mechanical HDDs as game storage devices longer than mainstream gaming PCs.

HDDs require buffered reads to compensate for the longer seek times, whereas DirectStorage traditionally accesses files in unbuffered mode, which disqualified HDDs for DirectStorage. With this update, HDDs can take advantage of DirectStorage, wherein game data stored on them is directly accessed by GPUs, and compressed game assets are decompressed on the fly through the compute-shader acceleration capabilities of modern GPUs.



Microsoft also added a means for a game to know whether compressed assets are being decompressed by the GPU, or whether a software (CPU) fallback is engaged for reasons such as incompatible compression/file format. This feedback mechanism allows the game to adjust its asset quality (such as texture resolution), to compensate for the reduced decompression performance.
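As a rough illustration of how a game might act on that feedback, here is a minimal sketch. The names `DecompressionPath` and `pick_texture_resolution`, and the resolution numbers, are hypothetical and are not part of the DirectStorage API; the sketch only shows the idea of dropping to a smaller texture tier when the CPU fallback is in use.

```python
from enum import Enum


class DecompressionPath(Enum):
    """Hypothetical stand-in for whatever the runtime reports."""
    GPU = "gpu"
    CPU_FALLBACK = "cpu_fallback"


def pick_texture_resolution(path, preferred=4096, fallback_divisor=2, minimum=512):
    """Return the texture edge length to request for the given decompression path.

    On the GPU path the preferred size is kept; on the CPU fallback the size is
    divided by fallback_divisor, but never drops below minimum.
    """
    if path is DecompressionPath.GPU:
        return preferred
    return max(minimum, preferred // fallback_divisor)


gpu_size = pick_texture_resolution(DecompressionPath.GPU)
cpu_size = pick_texture_resolution(DecompressionPath.CPU_FALLBACK)
print(gpu_size, cpu_size)
```

A real engine would more likely adjust streaming budgets or mip bias than a single resolution value; the single number just keeps the sketch short.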

Microsoft has progressively relaxed the hardware requirements for DirectStorage with each major release. It was originally restricted to NVMe SSDs as the storage device, but was extended to AHCI devices such as SATA SSDs, and now with this release, support is extended to mechanical HDDs.

Many Thanks to TumbleGeorge for the tip!

 
Do developers need to update their games?
 
And we're going back to pre-caching like the engines of old, instead of textures streaming
 
This new feature is specifically meant to improve game loading from HDDs. I think it is a modernization of, as you mention, the older way of masking the flow of data through a buffer.
 
HDDs can take advantage of DirectStorage, wherein game data stored on them is directly accessed by GPUs
This is contradicted literally a line below by your own graph. DirectStorage does not bypass RAM when uploading data into GPU memory (yet).

And we're going back to pre-caching like the engines of old, instead of textures streaming
I think you misunderstood the news. It's about how DirectStorage manages the queues of reads that it gets, not about streamed data flow.
 
Edited for clarity.

DirectStorage v1.0 and v1.1 work with HDDs; the problem was that commands weren't buffered, so the order of operations couldn't be optimized to minimize seek time.

* Read Sector 7
* Read Sector 7049
* Read Sector 9
* Read Sector 14
* Read Sector 3

Before v1.2, DirectStorage would process them in the order it received them. With v1.2's Buffered mode it can re-organize the commands.

* Read Sector 3
* Read Sector 7
* Read Sector 9
* Read Sector 14
* Read Sector 7049
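
A rough way to see why the reordering helps, assuming seek cost is simply proportional to the distance between sector numbers and the head starts at sector 0 (both are simplifications; real drives also have rotational latency and their own geometry):

```python
def total_seek_distance(sectors, start=0):
    """Sum of absolute head movements when visiting sectors in the given order."""
    distance = 0
    position = start
    for sector in sectors:
        distance += abs(sector - position)
        position = sector
    return distance


received = [7, 7049, 9, 14, 3]   # order the reads arrived in
reordered = sorted(received)     # order after buffering and sorting

fifo_cost = total_seek_distance(received)
sorted_cost = total_seek_distance(reordered)
print(fifo_cost, sorted_cost)    # 14105 7049
```

Under this toy model the sorted order moves the head about half as far, almost all of it in the single trip out to sector 7049.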

Windows has been able to buffer HDD activity for a very long time now (I think Windows 95 introduced it?), while SATA II introduced the ability for HDDs to perform their own re-ordering of commands (NCQ) to minimize seeking (this requires a buffer that holds multiple commands at once so the drive can juggle their priority). With SSDs the seek time is the same regardless of where data is physically located, which is why they don't need buffering.

I imagine NCQ also reduced the wear on an HDD's motors.
 
And we're going back to pre-caching like the engines of old, instead of textures streaming
Not sure how you could infer that from this article. The third paragraph notes an added feature that is primarily geared towards texture streaming.
Microsoft also added a means for a game to know whether compressed assets are being decompressed by the GPU, or whether a software (CPU) fallback is engaged for reasons such as incompatible compression/file format. This feedback mechanism allows the game to adjust its asset quality (such as texture resolution), to compensate for the reduced decompression performance.
 
ordering commands to minimize HDD seeking and this requires a buffer to simultaneously hold multiple commands in (NCQ, Native Command Queuing). With SSDs the seek time is the same regardless of where data is physically located, which is why they don't need buffering.
SSDs are very slow on non-queued random read access. Seek time is around 40 µs, but much less if the read falls within the same page (16 kilobytes). That's similar to DRAM row/column access, just a thousand times slower. So it makes sense to reorder the read requests so that they are sequential more often, on average, just like with HDDs. The other reason for queueing is to try to activate as many banks as possible at once, and that's where the similarity with an HDD ends.
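
To make the page argument concrete, here is a small sketch that counts how often consecutive reads land on a different 16 KiB page, before and after sorting. The request offsets are made up for illustration, and the model ignores bank parallelism and everything else a real SSD controller does.

```python
PAGE_SIZE = 16 * 1024  # 16 KiB page, the figure quoted in the post


def page_activations(offsets, page_size=PAGE_SIZE):
    """Count how many times a read lands on a different page than the previous read."""
    activations = 0
    previous_page = None
    for offset in offsets:
        page = offset // page_size
        if page != previous_page:
            activations += 1
            previous_page = page
    return activations


# Hypothetical byte offsets that bounce between two pages.
requests = [0, 40_000, 4_096, 45_000, 8_192, 33_000]

unordered = page_activations(requests)
ordered = page_activations(sorted(requests))
print(unordered, ordered)  # 6 2
```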
 
Cool. Now I want to see game studios implement this API ASAP. XDD
 
Since it supports SATA SSDs connected in AHCI mode, does that mean it also supports SSDs that are in a storage pool via Storage Spaces, or in a software RAID through the Intel driver?
 
Interesting: recognition that DirectStorage is not just about loading times but also about shifting processing off the CPU, which is a common bottleneck.
 
Interesting: recognition that DirectStorage is not just about loading times but also about shifting processing off the CPU, which is a common bottleneck.
I thought the entire point of DirectStorage was to bypass the CPU, or did I miss something?
 
I thought the entire point of DirectStorage was to bypass the CPU, or did I miss something?
It is, but the marketing side of it is the lightning-fast loading speeds.
 
There was a time... when Nvidia wanted everything to be done by the graphics card, without a CPU. I've even forgotten when that was. On topic: there is no way to completely exclude the CPU from anything a computer does. In this case, the aim is to have part of the asset decompression work performed on the most capable part of the GPU. But DirectStorage performs several different tasks simultaneously.
 
So, they're pre-loading some content instead of all of it on the fly?
It made sense to preload some of it regardless

There was a time... when Nvidia wanted everything to be done by the graphics card, without a CPU. I've even forgotten when that was. On topic: there is no way to completely exclude the CPU from anything a computer does. In this case, the aim is to have part of the asset decompression work performed on the most capable part of the GPU. But DirectStorage performs several different tasks simultaneously.
Back when they first introduced programmable shaders, with their dream of servers and enterprise setups using their GPUs instead of CPUs.

DirectStorage v1.0 and v1.1 work with HDDs; the problem was that commands weren't buffered, so the HDD couldn't optimize the order of operations to minimize seek time. The newly added Buffered mode enables the HDD to optimize the operations fed to it in order to minimize seek time.

Read Sector 7
Read Sector 7049
Read Sector 9
Read Sector 14
Read Sector 3

Before, it would process them in the order it received them; with Buffered mode it can re-organize the commands.

Read Sector 3
Read Sector 7
Read Sector 9
Read Sector 14
Read Sector 7049

SATA II originally introduced the ability to re-order commands to minimize HDD seeking, and this requires a buffer that holds multiple commands at once (NCQ, Native Command Queuing). With SSDs the seek time is the same regardless of where data is physically located, which is why they don't need buffering.
Good description of the change and how it works. It's a completely logical change on their end; it sounds like NCQ, or a way to make sure NCQ works correctly.

Since it supports SATA SSDs connected in AHCI mode, does that mean it also supports SSDs that are in a storage pool via Storage Spaces, or in a software RAID through the Intel driver?
Any kind of software layer involved is going to add a CPU burden. Why would you want to run games off such a thing?
 
I've just now updated my message to clarify some things.
 
While this is not directly related, does anyone know why we don't have variable textures (like variable shaders), where texture assets load in as a percentage of the full asset based on the VRAM the end user has, instead of the potato graphics that Steve at Hardware Unboxed found in some titles on the 3070 with 8 GB of VRAM? Or textures acting like Nanite, where they load more efficiently in your direct visual periphery instead of potato graphics?
 
I thought the entire point of DirectStorage was to bypass the CPU, or did I miss something?
Also to bypass the system RAM. At least the uncompressed assets don't have to be written to it and read back from it.
 
While this is not directly related, does anyone know why we don't have variable textures (like variable shaders), where texture assets load in as a percentage of the full asset based on the VRAM the end user has, instead of the potato graphics that Steve at Hardware Unboxed found in some titles on the 3070 with 8 GB of VRAM? Or textures acting like Nanite, where they load more efficiently in your direct visual periphery instead of potato graphics?
That is exactly what we have; it's called mips. This is the basis of texture streaming.
This can be further refined using sampler feedback to upload only parts of a mip into VRAM, but I don't think many games implement this yet.
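
For anyone unfamiliar with mips, a minimal sketch of the idea: each mip level is half the width and height of the one above it, so a streamer can drop the top level(s) when VRAM is tight. The function names and the uncompressed 4-bytes-per-texel assumption are mine, for illustration only; real games use block-compressed formats and far more elaborate budgeting.

```python
def mip_chain(width, height, bytes_per_texel=4):
    """Return a list of (width, height, size_in_bytes) from mip 0 down to 1x1."""
    levels = []
    while True:
        levels.append((width, height, width * height * bytes_per_texel))
        if width == 1 and height == 1:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return levels


def first_mip_that_fits(levels, budget_bytes):
    """Index of the most detailed mip whose chain (itself plus all smaller mips) fits the budget."""
    for index in range(len(levels)):
        remaining = sum(size for _, _, size in levels[index:])
        if remaining <= budget_bytes:
            return index
    return None


levels = mip_chain(4096, 4096)
tight = first_mip_that_fits(levels, 32 * 1024 * 1024)    # 32 MiB budget
roomy = first_mip_that_fits(levels, 128 * 1024 * 1024)   # 128 MiB budget
print(len(levels), tight, roomy)  # 13 1 0
```

With the 32 MiB budget the 64 MiB top mip is skipped and the chain starts at the 2048x2048 level; with 128 MiB the full chain fits.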
 
While this is not directly related, does anyone know why we don't have variable textures (like variable shaders), where texture assets load in as a percentage of the full asset based on the VRAM the end user has, instead of the potato graphics that Steve at Hardware Unboxed found in some titles on the 3070 with 8 GB of VRAM? Or textures acting like Nanite, where they load more efficiently in your direct visual periphery instead of potato graphics?
We do; many technologies exist.

What you're imagining would have to compress the high-res textures in real time and shrink them down, and that would be slower than sending the full-res ones. Hence, pre-compressing them.

Consoles lack CPU power, for example, so textures became really large in terms of disk space to avoid any issues with decompression.
 