
Samsung 870 EVO - Beware, certain batches prone to failure!

SMART attribute 199 (CRC error count) = suspicious.

You shouldn't be having ATA resetting issues.

Looks like more faulty batches, just not the exact same symptoms.

A test for this:

Use the HD Tune trial and run the random seek and multiple-write tests to see if the drive hangs (they take an absurd amount of time to complete). If it hangs, then it's done for!
If it hangs, check the Event Log for repeated messages from Windows about storahci having a problem with SATA communication.

The same applies if the test hangs in CrystalDiskMark.
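
If you'd rather pull those storahci entries automatically instead of scrolling through Event Viewer, here's a minimal Python sketch that just wraps the built-in wevtutil tool (assumes Windows; the event count is an arbitrary placeholder you can tweak):

```python
# Minimal sketch: list recent storahci events from the Windows System log,
# which is where the SATA reset / communication errors show up.
# Assumes Windows with the built-in wevtutil tool; run from an elevated prompt if needed.
import subprocess

query = "*[System[Provider[@Name='storahci']]]"   # filter on the storahci driver

result = subprocess.run(
    [
        "wevtutil", "qe", "System",
        f"/q:{query}",
        "/c:20",        # last 20 matching events
        "/rd:true",     # newest first
        "/f:text",
    ],
    capture_output=True,
    text=True,
)
print(result.stdout or "No storahci events found (or wevtutil is unavailable).")
```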
 
I have an 870 EVO 2 TB (non-system drive) purchased in March 2021, with a January 2021 manufacturing date. A few days ago I noticed a bunch of bad blocks after running the full scan in the Magician tool. According to chkdsk it didn't affect many actual files, so I have no clue how long those bad blocks have been there. None of the SMART tests would run due to the existing errors. After tinkering with it for a few days I finally managed to get a backup off of it using Macrium Reflect. Today I decided to wipe it using Diskpart Clean, recreated the volume, and did the format. I re-ran all the Magician tests and none came back with errors. I am on the SVT02B6Q firmware. Is it still worthwhile to RMA the drive?

I also had my 2 TB 870 Evo fail on me a while ago with an LBA access error. It only had 4.1 TBW. All extended SMART tests would fail.

Drive info: Samsung SSD 870 EVO 2TB
Serial: S621NF0R****
FW: SVT01B6Q
TBW: 4.1

I upgraded the firmware to the fixed SVT02B6Q version. However, this did not immediately fix the issue with the failing SMART tests. I then did a secure erase to make sure the controller could start mapping blocks from scratch. After that, I did a full f3write/f3read cycle to make sure all blocks could be read properly. This was successful!

Since then, all extended SMART tests have been passing again.

So I am pretty sure that the firmware update actually fixes the issue. Reading a certain flash cell's state and/or determining its health might not have been implemented ideally in the initial firmware, maybe due to the new flash technology. So it could just have been a threshold (maybe something like cell charge?) that had not been implemented correctly in the old firmware. Who knows...

If you are affected, I'd definitely suggest issuing a secure erase command after upgrading the firmware.
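
If you can't run the real f3write/f3read tools, here is a rough Python sketch of the same idea: fill the freshly formatted, empty drive with pseudo-random files and read them back to verify. The mount point is an assumption (adjust it), and ideally drop the OS page cache or remount between the write and read passes so the verify pass actually reads from the flash rather than RAM:

```python
# Rough write-then-verify sketch in the spirit of f3write/f3read.
# Assumptions: the drive is freshly formatted, mounted at MOUNT_POINT, and
# dedicated to this test. Not a replacement for the real f3 tools.
import hashlib
import os

MOUNT_POINT = "/mnt/870evo"   # hypothetical mount point - adjust for your system
CHUNK = 1024 * 1024           # 1 MiB per write
FILE_SIZE = 1024 * CHUNK      # 1 GiB per test file

def fill_drive() -> dict[str, str]:
    """Write random files until the drive is full; return path -> sha256."""
    digests, index = {}, 0
    while True:
        path = os.path.join(MOUNT_POINT, f"fill_{index:05d}.bin")
        digest = hashlib.sha256()
        try:
            with open(path, "wb") as f:
                for _ in range(FILE_SIZE // CHUNK):
                    block = os.urandom(CHUNK)
                    digest.update(block)
                    f.write(block)
                f.flush()
                os.fsync(f.fileno())         # make sure data actually hits the drive
        except OSError:                      # disk full: drop the partial file and stop
            os.remove(path)
            break
        digests[path] = digest.hexdigest()
        index += 1
    return digests

def verify(digests: dict[str, str]) -> list[str]:
    """Re-read every file and return the ones whose checksum no longer matches."""
    bad = []
    for path, expected in digests.items():
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            while block := f.read(CHUNK):
                digest.update(block)
        if digest.hexdigest() != expected:
            bad.append(path)
    return bad

if __name__ == "__main__":
    corrupted = verify(fill_drive())
    print("corrupted files:", corrupted or "none")
```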
Same thing for me, except I couldn't get the secure erase to work, even after setting the BIOS to legacy instead of UEFI. It would start booting into Linux but quickly end up with a black screen with no activity. I finally wiped the drive using diskpart clean. After that I was able to run all the Magician tests without errors. I already initiated the RMA process yesterday but am now hesitant to return the drive.
 
If you saw the boot loader successfully load Linux, but the PC crashes shortly after, then I suspect a major compatibility bug with the motherboard, or an issue from the CPU having an IGP while you are using discrete graphics.
 
Asus Z97-E/USB 3.1 motherboard with an i7-4790K CPU (going on 10 years by now?) - I do have a discrete graphics card.
 
Are lots of Linux distros black-screen-crashing on you? Did you try Kubuntu, or any other Ubuntu flavor? I'm trying to first get you onto distros that are usually more compatible with a wide range of hardware (or at least not so broken that they crash).

With recent distros, you are more likely to get a system crash with GeForce (Nvidia) graphics cards on Intel CPUs with integrated graphics. I saw something similar with my Z490 motherboard and Core i5-10600K with my GeForce GTX 1660 Super. It was fine after I popped in my ASRock PG Arc A770 8 GB graphics card.

Note that I saw Kubuntu do very well on my socket 1366 rig (first-gen Core i7!), so I suspect the unexpected behavior on some newer systems is due to a bug.
 
I also have an older 500 GB 860 EVO - as you can see, ZERO CRC errors. Under normal circumstances these should not occur.
The 860 EVO is really ROCK SOLID, with very good, strong NAND; they don't make them like this anymore... :(

This one is in the M.2 form factor, but it is still SATA-connected.

View attachment 357004

And this is my older 860 EVO, the same as yours. Mine is only a few months younger... :)

The 860 EVO was better than the 870 EVO; I have never seen one that was broken. They don't make NAND like this anymore - at the time, it was the best quality you could get.
The best ones were always Made in Korea. I had the most problems with the Made in China ones.

View attachment 357005

The 860 EVO in M.2 format looks like this, but it's still SATA, not NVMe/PCIe.

View attachment 357009
I had this one fail suddenly after a few months of use. I can't even get to the BIOS when it's connected. Since then I've had this 870 that looks OK, even though it once gave me errors in a short SMART test.
 

Attachments

  • IMG20250112212308.jpg
  • Screenshot 2025-01-12 213037.png
I can't see anything wrong with it; it all looks okay. What I do see is a very high count of sudden power-offs, meaning the computer was shut down without a proper shutdown in Windows.

That can cause a lot of errors on the disk, because the drive must recover files and try to repair them. Your older SSD is a pre-Corona SSD; I've never had anything go wrong with those... Even now it is still rock solid with Windows 11 24H2 installed! You should look for bad contacts or a bad power/SATA cable, and clean the contacts. Also, make sure not to power off your computer by just cutting the power - if that happens many times, you can accumulate errors over time.

ID 235 - POR Recovery Count:

A count of the number of sudden power-off cases. If there is a sudden power-off, the firmware must recover all of the mapping and user data during the next power-on; this attribute counts how many times that has happened.
AFAICT, the POR Recovery Count attribute records the number of times the SSD has had to recover from an unsafe shutdown, which occurs after your OS crashes or your system loses power.

Also check your Reliability History in Windows, and run DISM and SFC to clean up system errors.

Furthermore, it is not uncommon for something to fail: GPUs and CPUs fail, motherboards fail, power supplies fail... Nothing is perfect. It just happens that you have a failed one in your hands now; that does not mean all of them are bad. But the 870 disease in the Corona period was a different story.
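
For anyone who wants to check the POR Recovery Count attribute mentioned above without Magician, here is a small sketch using smartctl's JSON output (smartmontools 7.x); /dev/sda is a placeholder device, and the name smartctl prints for the attribute may vary by drive:

```python
# Sketch: read the POR Recovery Count attribute (ID 235) via smartctl JSON output.
# Assumes smartmontools 7.x is installed; DEVICE is a placeholder.
import json
import subprocess

DEVICE = "/dev/sda"

out = subprocess.run(["smartctl", "-A", "-j", DEVICE],
                     capture_output=True, text=True).stdout
table = json.loads(out).get("ata_smart_attributes", {}).get("table", [])

for attr in table:
    if attr["id"] == 235:        # POR Recovery Count on Samsung SATA SSDs
        print(f"{attr['name']}: raw={attr['raw']['value']}, "
              f"normalized={attr['value']}, worst={attr['worst']}")
```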
 
Yeah, we have power outages, so I really should get a UPS. I think that's why, a long time ago, I had a Crucial SSD that wouldn't boot Windows anymore and needed a reinstall.
Anyway, I have a RAM issue now (or something else), so no wonder the test showed errors once. The old RAM works, but both sticks of the newer kit show errors in MemTest.
I started having bluescreens, so I was trying to find out what's wrong.
 
Crucial has already stopped making SATA SSDs; what you can get now is remaining stock. But there are other good ones still available out there. I now use Kingston or Silicon Power if needed. You can still buy the 870 EVO, but they are a bit expensive at the moment. SanDisk SSDs are also good quality and fast.
 
You can still buy the 870 EVO, but they are a bit expensive at the moment.
Are you saying it's now safe to buy them? I need a SATA SSD with DRAM, and I've read that MX500s also have their own issues, so the 870 Evo is my only option.
 
They have been safe to use for more than two years already - do you really think they would keep selling bad SSDs? If you get an old one, that's the supplier's fault, not Samsung's. Drives made from late 2022 onwards are safe; just do not accept any older drives from a supplier made before that date. There was a disaster in the Corona period, but Samsung learned fast. AFAIK, ALL drives made by Samsung are now safe to use again.

Just do NOT buy the 870 QVO series; QLC NAND is trash, only good for storage and light use.
 
I've started getting some FC errors in my SMART readings. Is that a sign this drive may start dying?

It's a 4TB June 2023 model that came with newer firmware than the affected models. Due to all the news about drive failures, I ensured I bought a model after the failed batches and that it had the latest firmware. I ran a full drive write and wipe twice initially to ensure every LBA could be written to without error.

Since then, I haven't had a single issue. I've written a couple more TB to it through general use. I check the SMART data once in a while, looking for any signs of bad blocks or errors. Everything looked good until a couple of weeks ago. After not checking the data for a few months, I opened it up and saw the FC - "Vendor Specific" line had a value of 10. This has always been 0. This raised some concern in me. In the last couple of weeks it has increased every few days, and is now at 15. There are no other bad blocks or errors in the SMART data, and I haven't had any system issues, though the drive is used for documents and infrequently accessed files. However, I have seen too many posts in this thread about failed drives having large FC counts.

Should I be opening an RMA with Samsung? I see no bad blocks when doing a full drive scan in Magician or other disk management programs.


I attached my current SMART report. Wear levelling count has not increased since I installed the drive and did two full wipes. I'm also aware POR should be lower but this is not something that can be helped when systems hang.

Thanks for any insight!
 

Attachments

  • Samsung.png
Vendor Specific effectively means that CrystalDiskInfo can't identify what the attribute is. The health of your drive appears perfectly fine.

Older Samsung SSDs didn't even have Vendor Specific.
 
"Vendor Specific" is also the tag that shows up for FC in Samsung Magician. They don't advertise what it is. However, in post 227 of this thread a definition was posted that FC relates to the number of discovered bad blocks, and the value gets exceedingly high on failed 870s.
Furthermore, this value was 0 for me for over a year, and only very recently has started to consistently increase by 1 every couple days. SMART values that are completely harmless don't typically act like this.

I'm wondering if anyone else has seen "good" 870 EVOs from the newer batches start developing errors over time.
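
One way to take the guesswork out of "it increases every few days" is to log the raw FC value (attribute ID 252) on a schedule and look at the trend. A minimal sketch, assuming smartmontools with JSON output; the device node and log file name are placeholders:

```python
# Sketch: append the raw value of attribute 252 ("FC") to a CSV so its trend
# over time is visible. Assumes smartmontools 7.x; DEVICE and LOG_FILE are placeholders.
# Run it from Task Scheduler / cron once a day.
import csv
import json
import subprocess
from datetime import datetime, timezone

DEVICE = "/dev/sda"
LOG_FILE = "fc_history.csv"
FC_ID = 252                      # 0xFC, shown as "Vendor Specific" in Magician

def read_fc_raw(device: str):
    out = subprocess.run(["smartctl", "-A", "-j", device],
                         capture_output=True, text=True).stdout
    for attr in json.loads(out).get("ata_smart_attributes", {}).get("table", []):
        if attr["id"] == FC_ID:
            return attr["raw"]["value"]
    return None                  # attribute not present on this drive

if __name__ == "__main__":
    value = read_fc_raw(DEVICE)
    with open(LOG_FILE, "a", newline="") as f:
        csv.writer(f).writerow([datetime.now(timezone.utc).isoformat(), value])
    print(f"FC (ID {FC_ID}) raw value: {value}")
```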
 

BTW, writing the drive twice doesn't really prove much (except that it basically works), as flash memory blocks have no fixed link to logical drive sectors; a particular drive sector will move all over the flash array over time as writes elsewhere take place. The details are a manufacturer-specific secret, but flash drives generally run some sort of log-structured layer at the lowest level and make it look like an HDD to the outside world by keeping track of which logical sectors map to which locations on the flash...
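
To illustrate that last point, here is a toy sketch of the idea (nothing like Samsung's actual, proprietary FTL): every write of a logical sector lands on a fresh physical page and just updates a mapping table, so the same LBA wanders across the flash over time.

```python
# Toy flash-translation-layer (FTL) sketch: logical sectors are remapped to a
# fresh physical page on every write, so a given LBA moves around the flash.
# Purely illustrative - real FTLs add garbage collection, wear leveling, etc.
class ToyFTL:
    def __init__(self, physical_pages: int):
        self.free_pages = list(range(physical_pages))  # blank pages available for writes
        self.pages = {}                                # physical page -> data
        self.l2p = {}                                  # logical LBA -> physical page
        self.invalid = set()                           # stale pages awaiting erase/GC

    def write(self, lba: int, data: bytes) -> int:
        if not self.free_pages:
            raise RuntimeError("no free pages - a real FTL would garbage-collect here")
        page = self.free_pages.pop(0)                  # always write to a fresh page
        old = self.l2p.get(lba)
        if old is not None:
            self.invalid.add(old)                      # old copy is now stale, not erased
        self.pages[page] = data
        self.l2p[lba] = page
        return page

    def read(self, lba: int) -> bytes:
        return self.pages[self.l2p[lba]]

ftl = ToyFTL(physical_pages=8)
print(ftl.write(0, b"v1"))   # LBA 0 -> physical page 0
print(ftl.write(0, b"v2"))   # rewriting LBA 0 -> physical page 1
print(ftl.read(0))           # b"v2"
print(ftl.invalid)           # {0}: the old copy is merely marked stale
```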
 
RMA for what? Your drive is perfectly fine.
Without knowing what FC measures, you can't actually know that. Harmless values don't stay at 0 for a year and then start increasing. A dying drive, particularly one from a line with a history of bad batches, always starts with some entry that used to read zero suddenly starting to increase. Combined with the fact that members of this forum have found that FC measures a type of bad-block readout, I think it is absolutely a cause for worry.
 
For sure, and I ran the test knowing that. Unfortunately that's the most verification I could run before putting the drive into service. I figured two full writes would be enough to hit nearly every block, as the internal wear levelling system should ensure even coverage.
 
FC doesn't seem to translate into a failure later, so I don't treat it as a warning sign. I have three 4 TB 870 EVOs: one that failed once (the one from the first post), which I still use heavily to this day without further issues or deterioration (after letting the drive remap the bad sectors and using the latest firmware), and two others which have never had a problem - one with FC at 1567, the other at 715, while the previously failed one has FC at 1357. I also have a bunch of 870 QVOs which don't even have the FC value available, but an otherwise identical SMART layout.
 

A full read is generally more telling, because on writes the drive has a chance to transparently reallocate the sector.

When I get a new drive, a full read is the first test I do; only then do I write to it and create filesystems.
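
For what it's worth, a full read of the raw device can be done with a few lines of Python as well - a rough sketch, assuming Linux, root, and that /dev/sdb really is the new drive (triple-check the device node with lsblk before running anything against a disk):

```python
# Rough sketch of a full-surface read: read the raw device sequentially and
# report any regions that fail to read. Assumes Linux + root; DEVICE is a
# placeholder - verify it with lsblk before running.
import os

DEVICE = "/dev/sdb"
CHUNK = 4 * 1024 * 1024      # 4 MiB per read

bad_offsets = []
offset = 0
fd = os.open(DEVICE, os.O_RDONLY)
try:
    while True:
        try:
            data = os.read(fd, CHUNK)
        except OSError:                        # unreadable region
            bad_offsets.append(offset)
            offset += CHUNK
            os.lseek(fd, offset, os.SEEK_SET)  # skip past it and keep going
            continue
        if not data:                           # end of device
            break
        offset += len(data)
finally:
    os.close(fd)

print(f"read {offset} bytes; unreadable regions at offsets: {bad_offsets or 'none'}")
```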
 
I have three 4 TB 870 EVOs: one that failed once (the one from the first post), which I still use heavily to this day without further issues or deterioration (after letting the drive remap the bad sectors and using the latest firmware).

Same for me. I had one 870 Evo on the old FW with a few reallocated sectors/blocks. After the FW update and a secure erase, everything has been fine ever since. People worry too much about this.

Yes, there clearly has been a bad FW for the drive. It has been fixed by a FW update and after that people should stop worrying.

And you should mostly ignore the raw values. Once the current or worst value falls below the threshold, you need to replace the drive. That's about it.
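
That rule is easy to automate, too - a small sketch that flags any attribute whose normalized current or worst value has dropped to its threshold, again assuming smartmontools with JSON output and /dev/sda as a placeholder:

```python
# Sketch of the "replace when current/worst hits the threshold" rule:
# flag any SMART attribute whose normalized value is at or below its threshold.
# Assumes smartmontools 7.x; DEVICE is a placeholder.
import json
import subprocess

DEVICE = "/dev/sda"

out = subprocess.run(["smartctl", "-A", "-j", DEVICE],
                     capture_output=True, text=True).stdout
table = json.loads(out).get("ata_smart_attributes", {}).get("table", [])

failing = [a for a in table
           if a["thresh"] > 0 and min(a["value"], a["worst"]) <= a["thresh"]]

if failing:
    for a in failing:
        print(f"REPLACE SOON: {a['id']} {a['name']} "
              f"(value={a['value']}, worst={a['worst']}, thresh={a['thresh']})")
else:
    print("All normalized SMART values are above their thresholds.")
```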
 
FC doesn't seem to translate into a failure later, so I don't treat it as a warning sign.
Okay sweet, that's exactly the sort of update I was hoping for. I still wonder what FC measures, but I won't worry about it further.
 
Ah yeah, I should have said: the full wipe was done with Hard Disk Sentinel's wipe+read, along with 'surface' read scans in other tools, which are my primary go-to. I only did the full wipe because, with 8 TBW across the two passes, I figured any issues with bad flash or firmware would start flagging a bunch of bad blocks, as seen on the earlier drives.
 
The FC parameter has absolutely nothing to do with bad sectors...

"FC" (ID 252) actually means, for Samsung SSDs: Read ECC Count.
Source: PM863a datasheet

So it's nothing to lose sleep over; it happens with many SSDs now and then. That counter is nothing to worry about - it tracks ECC read errors that could be corrected, the same feature normal HDDs have. Only when it is absurdly high is something surely wrong. It can be caused, for example, by a bad SATA cable, dirty contacts, sudden power-offs, a freezing system, hangs, BSODs, or a bad driver.

You have a very high number of sudden power-off cases - that's the reason for this FC count!

Yes, there clearly has been a bad FW for the drive. It has been fixed by a FW update and after that people should stop worrying.
NO, NO and NO: your bad blocks are not repaired at all, just replaced by spare blocks!! The defective blocks are still there inside and will NEVER be used again.

You should have RMA'd it while you could, but it's too late now... I would never accept an 870 EVO with bad blocks; they PROMISE a 5-year warranty, FREE of defects - why did you not take them up on it??? :banghead:
By doing a Secure Erase you have deleted the proof you needed for Samsung... I have RMA'd drives for less than this; I really don't understand some people.
 
NO, NO and NO: your bad blocks are not repaired at all, just replaced by spare blocks!!

Well, yes, those blocks are lost. But that is not a big issue because, just as you've said, SSDs have quite a bit of spare area. Normally these spare blocks are used transparently - many drives have an "SSD life left" metric for that. In the case of Samsung, they expose a "wear leveling count" metric and a "used reserved blocks count". Over time these counters on Samsung SSDs will increase.

So it's perfectly normal for any SSD to remap bad blocks. That's part of the wear leveling algorithms in the controller.

In our particular case of the early 870 Evo firmware, the controller failed to detect dying blocks, so it could not reassign them proactively/transparently. These blocks now show up in the "reallocated sectors count". That has caused actual issues for people (i.e. data loss in those blocks). But that is/was an issue within the controller firmware, which can be and has been fixed by an update.

And BTW: my drive is still under warranty and Samsung declines to swap it because it's not defective. It passes all tests and the bad block count has not increased since the FW update - I have written many terabytes to that drive since.
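
For reference, the counters mentioned above can be read outside of Magician as well - a small sketch, again assuming smartmontools with JSON output and /dev/sda as a placeholder; IDs 5, 177 and 179 are the usual ones on Samsung SATA SSDs, but double-check against your own drive's attribute list:

```python
# Sketch: print the spare-area / wear counters discussed above for a Samsung SATA SSD.
# Assumes smartmontools 7.x; DEVICE is a placeholder, IDs are the usual Samsung ones.
import json
import subprocess

DEVICE = "/dev/sda"
WATCHED = {5: "Reallocated Sector Count",
           177: "Wear Leveling Count",
           179: "Used Reserved Block Count"}

out = subprocess.run(["smartctl", "-A", "-j", DEVICE],
                     capture_output=True, text=True).stdout
for attr in json.loads(out).get("ata_smart_attributes", {}).get("table", []):
    if attr["id"] in WATCHED:
        print(f"{WATCHED[attr['id']]} (ID {attr['id']}): "
              f"raw={attr['raw']['value']}, normalized={attr['value']}")
```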
 