
Effect of SLC Caching on SSD Endurance


The Kingston KC3000 has 2000 GB of TLC NAND. It can use almost all of it (1930 GB) in SLC mode, which yields 1930/3 ≈ 643 GB of SLC cache.

I keep wondering though: isn't this writing to the NAND twice? Say I write 100 GB. First I consume 300 GB worth of NAND cells while writing in SLC mode, then another 100 GB worth of NAND when the data is folded into TLC mode. Of course, writing in SLC isn't nearly as harmful, but it is somewhat harmful, isn't it? And we don't even get to choose whether we'd rather give up SLC caching for better endurance.
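The "write twice" concern above can be put into numbers. This is a hypothetical back-of-the-envelope model (the function name and structure are mine, not from any SSD vendor): data written through an SLC cache first occupies TLC cells in 1-bit mode, then is folded into 3-bit mode.

```python
# Toy model of the "write twice" concern: a host write through the SLC
# cache occupies TLC cells in 1-bit (SLC) mode first, then the same data
# is folded into 3-bit (TLC) mode. Write amplification is ignored.

BITS_PER_CELL_TLC = 3

def nand_consumed_gb(host_write_gb: float, via_slc_cache: bool) -> dict:
    """Rough NAND footprint of a host write, split by pass."""
    if via_slc_cache:
        return {
            "slc_pass_gb": host_write_gb * BITS_PER_CELL_TLC,  # 1 bit/cell
            "tlc_pass_gb": host_write_gb,                      # after folding
        }
    return {"slc_pass_gb": 0.0, "tlc_pass_gb": host_write_gb}

# A 100 GB host write via the cache touches 300 GB worth of cells in SLC
# mode and then 100 GB worth in TLC mode.
print(nand_consumed_gb(100, via_slc_cache=True))
```

The caveat, as discussed below, is that an SLC-mode program/erase cycle wears the cell far less than a TLC-mode cycle, so the two passes are not equally harmful.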

@Chris_Ramseyer Can you offer some insights into how harmful (or harmless) SLC caching is to NAND endurance?

@W1zzard Can you offer some insights to this?
 
@Maxx Ok, that actually makes sense (I haven't gone through the documents yet).
And now the next question: if static cache can improve endurance, by how much does it do so?
 

There are really two separate questions here: actual endurance versus rated endurance. The implication is that dynamic pSLC is rated effectively as native TLC, even though there will still be endurance improvements in practice. Ratings are also oriented at consumer rather than enterprise workloads, similar to the distinctions JEDEC makes. There's a lot to it even before considering the lower-level technical aspects; I only mention them to illustrate that not all flash is equal, even within a single die, on top of other issues like write amplification.

Two drives you can compare are the FuzeDrive, which has static pSLC + QLC, and the T-Create Expert, which is dynamic pSLC with industrial TLC. The endurance document for the FuzeDrive shows that they rate the static pSLC at 30K PEC and the native QLC at 600 PEC. The 2TB (of flash) model has 137.44GB of static pSLC (occupying 549.76GB of QLC) and 1462.73GB of native QLC (2012.49GB total). With everything in QLC mode at 600 PEC, this is ~1.2PB of writes; calculating it with the static pSLC included comes out to 5PB. The T-Create Expert, conversely, uses flash rated for 10K PEC in TLC mode but utilizes dynamic pSLC; its warranty is only 6PB per 1TB of capacity (a 6000-PEC equivalent).
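For anyone who wants to check the FuzeDrive arithmetic above, here it is spelled out (capacities and PEC ratings are exactly as quoted in the post; only the calculation is mine):

```python
# Reproducing the FuzeDrive endurance arithmetic from the post.
PEC_SLC = 30_000   # static pSLC program/erase cycles
PEC_QLC = 600      # native QLC program/erase cycles

static_pslc_gb = 137.44   # occupies 549.76 GB of raw QLC flash (4 bits/cell)
native_qlc_gb = 1462.73
total_qlc_gb = 2012.49    # 549.76 + 1462.73, i.e. all flash in QLC mode

all_qlc_pb = total_qlc_gb * PEC_QLC / 1e6  # GB -> PB
hybrid_pb = (static_pslc_gb * PEC_SLC + native_qlc_gb * PEC_QLC) / 1e6

print(f"all-QLC:       {all_qlc_pb:.2f} PB")   # ~1.21 PB
print(f"static hybrid: {hybrid_pb:.2f} PB")    # ~5.00 PB
```

Carving out the static pSLC zone roughly quadruples total rated writes, because 30K-PEC SLC cycles on a small region outweigh losing a quarter of the QLC capacity.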

Static pSLC in most consumer drives is limited, e.g. ~12GB in a 1TB model. You can calculate the improvement using typical B17A values of 1500 PEC (TLC) / 30000 PEC (pSLC), with pSLC taking up 3 times the space, to see that it isn't huge in direct terms (and again, it must be balanced against other factors like OP). Be mindful, though, that SLC writes in that case do not count towards general NAND writes as they do on dynamic-only drives, so such a drive can have a <1.0 WAF. Most drives are moving towards a hybrid (static + dynamic) structure as with Samsung's TurboWrite: it writes to static first, then dynamic, and empties in the same order (FIFO), though others like the P5/P5 Plus use more complex algorithms. If you read up on TurboWrite you will see they fill static first specifically to improve endurance. More complicated algorithms send workloads to the different zones (static pSLC versus native) based on their anticipated WAF, among other things, and I have patents for that. Controllers have such algorithms even with dynamic pSLC (I have some from Phison), and they can also shift the size of the zones, as per the SanDisk patent above and, more recently, Micron patents for the P5/P5 Plus proprietary controllers.
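To illustrate the "improvement isn't huge in direct terms" point, here is the same style of calculation applied to a hypothetical 1TB drive with ~12GB of static pSLC, using the B17A-style ratings quoted above (the drive geometry is an assumption for illustration, not a specific product):

```python
# Rated-writes comparison for a hypothetical 1 TB TLC drive with a small
# static pSLC zone, using the 1500/30000 PEC ratings quoted in the post.
PEC_TLC, PEC_PSLC = 1_500, 30_000

drive_tlc_gb = 1024
static_pslc_gb = 12                       # user-visible pSLC capacity
raw_used_by_pslc_gb = static_pslc_gb * 3  # pSLC stores 1 bit/cell, not 3

# Whole drive in TLC mode:
all_tlc_tb = drive_tlc_gb * PEC_TLC / 1e3  # GB -> TB

# Same drive with the static region carved out:
hybrid_tb = ((drive_tlc_gb - raw_used_by_pslc_gb) * PEC_TLC
             + static_pslc_gb * PEC_PSLC) / 1e3

print(all_tlc_tb, hybrid_tb)  # 1536 TB vs 1842 TB, roughly +20%
```

A ~20% bump in raw rated writes is real but modest, which is why the WAF effects (SLC absorbing high-amplification writes, sub-1.0 WAF) matter more than the headline PEC numbers.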
 
With all in QLC mode at 600, this is ~1.2PB of writes. Calculating it with static pSLC comes out to 5PB.

most drives are moving towards a hybrid (static + dynamic) structure as with Samsung's TurboWrite; in that case, it writes to static first, then dynamic, and empties as such (FIFO)

Sorry if I sound like a broken record, but the FuzeDrive has 5 PB endurance because it is not flushing the SLC writes to TLC. Samsung's TurboWrite would eventually have to empty its static SLC cache. Won't this negate any endurance benefit?
 
Eventually even data on the static pSLC gets moved to the TLC blocks, as @Maxx said, some data might be retained longer in order to improve performance in read request operations.
 
Eventually even data on the static pSLC gets moved to the TLC blocks, as @Maxx said, some data might be retained longer in order to improve performance in read request operations.
Reads aren't a bottleneck for SSDs, writes are.
 
Reads aren't a bottleneck for SSDs, writes are.
Agreed. It is beneficial to have greater read speed (Intel P5800X Optane SSD) but it is the slow direct-to-TLC write speed that bothers SSD makers and makes them want SLC caching solutions in their products (which ends up harming endurance in the case of dynamic SLC caching).
 
Sorry if I sound like a broken record, but the FuzeDrive has 5 PB endurance because it is not flushing the SLC writes to TLC. Samsung's TurboWrite would eventually have to empty its static SLC cache. Won't this negate any endurance benefit?
The QLC portion of the FuzeDrive does have dynamic pSLC, but the SLC portion is effectively static pSLC (there are also Chia drives made from QLC run as static pSLC, e.g. 8TB -> 2TB). I've disassembled Enmotus's driver (which is required for proper use) and it uses a table structure to determine where data (via blocks) goes, but that doesn't mean data STAYS in one area or the other. As per AnandTech: "The host system sees device with one pool of storage, but the first 24GB or 128GB of logical block addresses are mapped to the SLC part of the drive and the rest is the QLC portion. The Enmotus FuzeDrive software abstracts over this to move data in and out of the SLC portion." To be fair, there are more considerations with a drive like this. (I believe more generally the drive is rated for 3.6PB of writes.)

Their endurance calculation in the document is simply the PEC rating of the SLC section times its capacity, plus the PEC rating of the QLC section times its capacity. That is an accurate representation of static pSLC as discussed above, because (as I mentioned and supported with a patent link) total flash endurance is limited by the worse of the two zones, such that a controller may balance writes across them to ensure maximum endurance, e.g. 5PB in this case. Indirectly you also have workload placement, which I mentioned as an example: random writes go to SLC and sequential to native, so that the higher WAF hits SLC. There are other criteria too (I've asked them about this, and they have more than a few considerations), which matters because utilizing SLC is not just about performance.
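The workload-placement idea above can be sketched as a toy routing rule. This is purely illustrative, in the spirit of the patents mentioned (the function, its parameters, and the threshold are all invented for the example; real controllers weigh many more criteria):

```python
# Toy workload router: writes expected to have high write amplification
# (small random writes) are absorbed by the pSLC zone, while large
# sequential writes go straight to native flash.

def route_write(length_lba: int, is_sequential: bool,
                seq_threshold_lba: int = 256) -> str:
    """Pick a destination zone for a host write (illustrative only)."""
    if is_sequential and length_lba >= seq_threshold_lba:
        return "native"  # low expected WAF: bypass the cache
    return "pslc"        # random/small writes: absorb in pSLC

print(route_write(8, is_sequential=False))   # -> pslc
print(route_write(512, is_sequential=True))  # -> native
```

The point is that the zone taking the punishment (high-WAF random writes) is the one with the 30K-PEC rating, which is how placement indirectly improves total endurance.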

As for Samsung: again, static SLC never changes to native mode, so it doesn't have the additive wear. It's also a way to defer writes, which can reduce write amplification, as with dynamic pSLC, although being in dedicated OP space it has a bit more flexibility (being mindful that it should be using the "best" cells/blocks). Good examples of utilization are static-only drives, like the SN550 and SN750, and hybrid-caching drives with QLC, like Intel's 660p/665p/670p series.

Reads aren't a bottleneck for SSDs, writes are.
He's saying that data often languishes in the SLC cache on consumer drives because consumer use is read-heavy, but yes, it's also done to defer writes, which can reduce total wear. Patents covering this decision (via algorithm) indicate that user/boot data usually remains in SLC mode, for example to improve OS boot times, even though the difference is not huge. Other data is always or almost always stored in SLC, for example metadata, including what is mirrored in DRAM, specifically to improve performance, although writes see the most benefit. Read disturb and data decay (stale data) are growing issues, even if not significant for consumer use, and pSLC is less impacted by them.

Agreed. It is beneficial to have greater read speed (Intel P5800X Optane SSD) but it is the slow direct-to-TLC write speed that bothers SSD makers and makes them want SLC caching solutions in their products (which ends up harming endurance in the case of dynamic SLC caching).
SLC mode has many benefits, including some listed earlier in this longer reply. For example, SLC is much less prone to in-flight data errors from power loss. It also protects data as it is moved to native flash, and improves performance because copyback (folding) can bypass ECC. Write amplification is reduced since copyback is sequential, which is one reason some patents divide workloads by type (e.g. random writes to SLC); ECC is also taken into consideration based on data type, for efficiency purposes if nothing else. Which is to say there are many reasons SSD makers use an SLC mode; it's not just write performance, and pSLC is faster in tR as well (and is therefore often used for metadata). There are reasons to prefer direct-to-TLC, such as heat generation, but it actually seems a lot of manufacturers artificially limit native flash speeds these days, for example with the new SN550 (or the TLC SN530), Samsung's 870 QVO, the P5 Plus, etc. That certainly indicates they are keen on being SLC-reliant. The endurance ramifications are more challenging to fully describe, although I don't think flash wear-out is a serious concern in any case here.
 