
Do I need to order a new drive?

The two top drives Backblaze has are the 16000NM001G (Seagate), with a failure rate of only 0.88%, and the MG07ACA14TA (Toshiba) at 1.42%.

View attachment 313160
There are other drives with a higher failure rate, but that almost always corresponds to a smaller sample size. Of the drives that do have both a high failure rate and a high sample count, you can see that it is often accompanied by a high number of drive hours. Western Digital has the fewest drives of any manufacturer in Backblaze's report, so it's impossible to say they are overall more reliable.

There are drives with higher failure rates, of course, but when you dig down into the data, the only thing you can draw conclusions about is the reliability of the high-sample-count SKUs with high drive hours. IMO, the number of SKUs for all brands other than Seagate is insufficient to determine average quality. There simply is not enough data to come to a conclusion on those manufacturers' drives in general.
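To put a number on why sample size matters, here is a rough sketch (hypothetical failure counts and drive-days, and assuming failures behave roughly like a Poisson process) of how much wider the uncertainty on an AFR estimate gets when the fleet is small:

```python
import math

def afr_with_ci(failures: int, drive_days: float, z: float = 1.96):
    """Annualized failure rate (%) with a crude 95% interval.

    Assumes the failure count is Poisson-distributed, so its standard
    error is roughly sqrt(failures) (normal approximation).
    """
    drive_years = drive_days / 365.0
    afr = failures / drive_years
    se = math.sqrt(max(failures, 1)) / drive_years  # avoid a zero-width interval
    return 100 * afr, 100 * max(afr - z * se, 0), 100 * (afr + z * se)

# Hypothetical fleets with the same observed AFR but very different certainty.
for name, failures, days in [("large fleet", 300, 12_000_000),
                             ("small fleet", 3, 120_000)]:
    afr, lo, hi = afr_with_ci(failures, days)
    print(f"{name}: AFR {afr:.2f}% (roughly {lo:.2f}%..{hi:.2f}%)")
```

Both fleets show about a 0.9% AFR, but the small fleet's interval is so wide that it cannot meaningfully be ranked against anything.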
Nice out-of-context cherry-pick. Now read the whole report, or at least the part that matters, the Lifetime drive failure rates.
Backblaze-Q2-2023-Lifetime-AFR.jpg

Hmm...

But the OP has expressed their perspective and that's that. Hopefully Seagate will honor their warranty...
 
Nice out-of-context cherry-pick. Now read the whole report, or at least the part that matters, the Lifetime drive failure rates.
View attachment 313206
Hmm...

But the OP has expressed their perspective and that's that. Hopefully Seagate will honor their warranty...

I think you meant to quote two different comments here, as the first and second half seem addressed to different people. If it was addressed to me, the provided chart does not change any of my previously drawn conclusions.
 
I think you meant to quote two different comments here as the first and second half seem addressed to different people.
It seemed like a dog-pile situation, so I responded to you and, by inference, the others offering the same or similar criticism.
If it was addressed to me, the provided chart does not change any of my previously drawn conclusions.
It should. Take a closer look. If we disregard the low-hanging fruit, Seagate drives, by a good margin, have a higher annualized failure rate (AFR). The averaged percentages are accurate within the context of averaged failure rates. The ratio of drive models per manufacturer is not important to the overall conclusion, as all drive models are averaged against themselves, in order.

With HGST, removing the outlier, the AFR is less than 1%
With Toshiba, removing the outlier, the AFR is just above 1%
With WDC, removing the outlier, the AFR is less than 0.5%
With Seagate, removing the outlier, the AFR is above 2%

Those differences may seem small; however, they are statistically significant. Comparatively, Seagate's drives are more likely to fail. The numbers prove this conclusion.
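For anyone wanting to reproduce that kind of comparison, here is a minimal sketch of the averaging described above, using made-up per-model AFRs rather than Backblaze's actual figures: drop each manufacturer's single worst model, then average the rest.

```python
# Hypothetical per-model lifetime AFRs in percent; not Backblaze's real numbers.
per_model_afr = {
    "BrandA": [0.5, 0.7, 0.9, 4.8],  # last entry is the "low-hanging fruit" outlier
    "BrandB": [1.0, 1.2, 1.4],
}

for brand, afrs in per_model_afr.items():
    trimmed = sorted(afrs)[:-1]  # discard the single worst model
    print(f"{brand}: plain mean {sum(afrs) / len(afrs):.2f}%, "
          f"outlier removed {sum(trimmed) / len(trimmed):.2f}%")
```

Note that this weights every model equally regardless of how many drives or drive-days it represents, which is exactly the weighting question raised later in the thread.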
 
All manufacturers have occasional bad eggs; I wouldn't throw the whole line under the bus over one sample.

It's a good read.
 
The two top drives Backblaze has are the 16000NM001G (Seagate), with a failure rate of only 0.88%, and the MG07ACA14TA (Toshiba) at 1.42%.

View attachment 313160
There are other drives with a higher failure rate, but that almost always corresponds to a smaller sample size. Of the drives that do have both a high failure rate and a high sample count, you can see that it is often accompanied by a high number of drive hours. Western Digital has the fewest drives of any manufacturer in Backblaze's report, so it's impossible to say they are overall more reliable.

There are drives with higher failure rates, of course, but when you dig down into the data, the only thing you can draw conclusions about is the reliability of the high-sample-count SKUs with high drive hours. IMO, the number of SKUs for all brands other than Seagate is insufficient to determine average quality. There simply is not enough data to come to a conclusion on those manufacturers' drives in general.
There's probably a good reason why Backblaze buys more Seagate and Toshiba models than any other brand; only Backblaze knows the exact weightings, but Seagate and Toshiba seem to strike the best balance of reliability and cost/TB in this data.

More reliable may not necessarily be better for Backblaze if it costs significantly more. For a drive with a 2% failure rate vs a 1% failure rate, the costs are 102 drives vs 101 drives per 100 active drives, so failure rates probably don't matter unless they are an order of magnitude worse.
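As a quick sanity check of that back-of-the-envelope math (the drive price is a made-up figure), the expected replacement cost difference between a 1% and a 2% AFR is small per 100 active drives:

```python
# Hypothetical cost comparison: extra replacements per 100 active drives per year.
price_per_drive = 300  # assumed $/drive, not a real quote

for afr in (0.01, 0.02):  # 1% vs 2% annualized failure rate
    expected_failures = 100 * afr
    print(f"AFR {afr:.0%}: ~{expected_failures:.0f} replacement(s)/year, "
          f"~${expected_failures * price_per_drive:.0f} in extra drives")
```

So a one-point AFR difference is roughly one extra drive per hundred per year; a per-TB price difference can easily dwarf that.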
 
Backblaze buys the cheapest drives they can. That's why you see so many Seagate drives this last quarter. I tend to agree with the bad apples comment.

As someone who has wasted a lot of time defending Backblaze's data, it really only tells you about a specific model. Go back a couple of years and Seagate had the lowest AFR, then Toshiba, then WD, then HGST.

Trying to determine which brand is better or worse is pointless, and Backblaze will tell you as much if you read their posts. It is especially pointless when you compare 14 models/100,000 drives of brand A vs 3 models/25,000 drives of brand B… that's just a failure to understand statistics. If you're going to normalize, that's fine, but even then it's better not to do so in sum, and instead compare drive sizes/platter density.
 
We use Exos almost exclusively for big data. I run a few myself. I see no reason why they wouldn't accept an RMA, but as others have mentioned "bad apples", I do want to point out how broad that statement is.

Remember the sample sizes we are dealing with. Backblaze's data is a great metric, but be careful comparing brands when you look at the number of disks used. An AFR over 2% means nothing, for example, when you have 30k more drives.

That said, I don't accept degradation of any kind, so I would return it. Remember, SMART data is delayed; it only detects bad sectors after a reallocation or a failed write attempt. Even then, the controller poll rate may be slower still.

By the same token, 8 bad sectors out of 16TB is a drop in the ocean, so its impact on "reliability" isn't even worth mentioning IMO. And I get these by the pallet.

My other big-boy brand is HGST, when the pricing competes with Exos.
 
It seemed like a dog-pile situation, so I responded to you and, by inference, the others offering the same or similar criticism.

It should. Take a closer look. If we disregard the low-hanging fruit, Seagate drives, by a good margin, have a higher annualized failure rate (AFR). The averaged percentages are accurate within the context of averaged failure rates. The ratio of drive models per manufacturer is not important to the overall conclusion, as all drive models are averaged against themselves, in order.

With HGST, removing the outlier, the AFR is less than 1%
With Toshiba, removing the outlier, the AFR is just above 1%
With WDC, removing the outlier, the AFR is less than 0.5%
With Seagate, removing the outlier, the AFR is above 2%

Those differences may seem small; however, they are statistically significant. Comparatively, Seagate's drives are more likely to fail. The numbers prove this conclusion.

I had some free time so I decided to punch the numbers into a spreadsheet for further investigation.

If you look right below the data provided by Backblaze, you can see I tabulated drive failures per 1,000,000 drive days. As you can see, Seagate is by far the highest here. The problem with this figure, and with annualized failure rate, is that it doesn't really tell you how old each drive is on average. This is important, as drive failure rate, at least in Backblaze's case, increases after 3 years and spikes after 5. I have provided a chart from Backblaze below demonstrating this. With this in mind, I thought it would be prudent to figure out on average where each drive would land on that failure curve. Please look at the bottom of my provided graphic and I will continue the discussion just below it.

numbers.png
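The table above can be reproduced with a calculation along these lines; this is a minimal sketch with made-up rows in the same column format Backblaze publishes (drive count, drive-days, failures), not the real data:

```python
# Hypothetical rows: (model, drive_count, drive_days, failures)
rows = [
    ("Model-A", 28_000, 38_000_000, 950),
    ("Model-B", 12_000, 7_500_000, 60),
]

for model, count, drive_days, failures in rows:
    per_million_days = failures / drive_days * 1_000_000
    afr = failures / (drive_days / 365) * 100      # Backblaze-style annualized failure rate
    avg_age_years = drive_days / count / 365       # rough average service time per drive
    print(f"{model}: {per_million_days:.1f} failures per million drive-days, "
          f"AFR {afr:.2f}%, average age {avg_age_years:.1f} years")
```

The average-age figure is only a rough proxy; it assumes the fleet of that model has been roughly stable in size, which is the weakness discussed further down.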

As you can see above, on average Backblaze's Seagate drives have been in service for longer than your typical 5-year enterprise drive warranty. If you look at the chart provided by Backblaze below, it's actually surprising that the failure rate isn't higher given the number of service hours these drives have seen. HGST looks the best here, as it has an even higher number of drives that have, on average, been in service longer than the warranty. The WDC drive numbers bear out what I said before: there simply isn't enough of a sample size here, and of the drives we do have, most haven't even seen 2 years of service yet. The only thing we could perhaps glean from the WDC numbers is that, at least with those SKUs, they do not appear to be shipping a lot of DOAs from the factory, assuming of course that Backblaze even counts those drives towards these numbers (if you know whether they do or not, LMK).


1694490785587.png


Ultimately, though, there is a problem with both Backblaze's numbers and mine: we don't know the age of each individual drive, or the age of each bulk group of drives that Backblaze purchases, which would all be the same age. This matters because large groups of drives could be failing in a single quarter simply because they have a high service life and are all the same age, and that alone can cause spikes in AFR. It's particularly problematic because the phenomenon is not always observable. It's possible 10,000 drives of a specific SKU are at EOL and set to fail within 0-2 years, yet the total drive service days can look small because Backblaze may have purchased 20,000 or more of the same SKU at a later date, which makes the total service hours look much lower than a batch of failing drives would otherwise suggest. If you assume the older drives are past 5 years of service and the newer drives only have 1-2 years, that could make it seem like the drives have a higher AFR simply because a large number of older drives are reaching EOL, and that isn't reflected in the data provided publicly.
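To make that concrete, here is a toy example with entirely made-up numbers showing how a later bulk purchase of the same SKU can hide an aging, failing cohort behind a low blended average age:

```python
# Two hypothetical purchase cohorts of the same SKU.
old = {"drives": 10_000, "age_years": 5.5, "quarterly_failures": 300}
new = {"drives": 20_000, "age_years": 1.5, "quarterly_failures": 40}

total_drives = old["drives"] + new["drives"]
quarter_drive_days = total_drives * 91                       # one quarter of service
failures = old["quarterly_failures"] + new["quarterly_failures"]

blended_age = (old["drives"] * old["age_years"]
               + new["drives"] * new["age_years"]) / total_drives
quarterly_afr = failures / (quarter_drive_days / 365) * 100

print(f"Blended average age: {blended_age:.1f} years "
      f"(the 5.5-year cohort is hidden behind the larger 1.5-year one)")
print(f"Quarterly AFR for the SKU: {quarterly_afr:.2f}%, "
      f"driven almost entirely by the old cohort")
```

From the published columns alone, this looks like a middle-aged fleet with a sudden AFR spike, even though the underlying cause is simply an old cohort reaching end of life.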

Of course the data is useful, but you have to take these factors into consideration. Given the above and the potential for spikes, I'd say that a 1-2% difference isn't that significant. Backblaze likely has far richer data than they provide externally, and that is what they base their purchasing decisions on.
 
I had some free time so I decided to punch the numbers into a spreadsheet for further investigation.

If you look right below the data provided by Backblaze, you can see I tabulated drive failures per 1,000,000 drive days. As you can see, Seagate is by far the highest here. The problem with this figure, and with annualized failure rate, is that it doesn't really tell you how old each drive is on average. This is important, as drive failure rate, at least in Backblaze's case, increases after 3 years and spikes after 5. I have provided a chart from Backblaze below demonstrating this. With this in mind, I thought it would be prudent to figure out on average where each drive would land on that failure curve. Please look at the bottom of my provided graphic and I will continue the discussion just below it.

View attachment 313222
As you can see above, on average Backblaze's Seagate drives have been in service for longer than your typical 5-year enterprise drive warranty. If you look at the chart provided by Backblaze below, it's actually surprising that the failure rate isn't higher given the number of service hours these drives have seen. HGST looks the best here, as it has an even higher number of drives that have, on average, been in service longer than the warranty. The WDC drive numbers bear out what I said before: there simply isn't enough of a sample size here, and of the drives we do have, most haven't even seen 2 years of service yet. The only thing we could perhaps glean from the WDC numbers is that, at least with those SKUs, they do not appear to be shipping a lot of DOAs from the factory, assuming of course that Backblaze even counts those drives towards these numbers (if you know whether they do or not, LMK).


View attachment 313232

Ultimately, though, there is a problem with both Backblaze's numbers and mine: we don't know the age of each individual drive, or the age of each bulk group of drives that Backblaze purchases, which would all be the same age. This matters because large groups of drives could be failing in a single quarter simply because they have a high service life and are all the same age, and that alone can cause spikes in AFR. It's particularly problematic because the phenomenon is not always observable. It's possible 10,000 drives of a specific SKU are at EOL and set to fail within 0-2 years, yet the total drive service days can look small because Backblaze may have purchased 20,000 or more of the same SKU at a later date, which makes the total service hours look much lower than a batch of failing drives would otherwise suggest. If you assume the older drives are past 5 years of service and the newer drives only have 1-2 years, that could make it seem like the drives have a higher AFR simply because a large number of older drives are reaching EOL, and that isn't reflected in the data provided publicly.

Of course the data is useful, but you have to take these factors into consideration. Given the above and the potential for spikes, I'd say that a 1-2% difference isn't that significant. Backblaze likely has far richer data than they provide externally, and that is what they base their purchasing decisions on.
You make some good points and I'll tip my hat to you for going full monty on the numbers. We more or less arrived at the same set of percentages, yet different conclusions. And while I respect your conclusion, we'll have to agree to disagree on what significance the percentages have.
 
Nice out-of-context cherry-pick. Now read the whole report, or at least the part that matters, the Lifetime drive failure rates.
View attachment 313206
Hmm...

But the OP has expressed their perspective and that's that. Hopefully Seagate will honor their warranty...
I managed to get a hold of Seagate and they have started the RMA, but I was unable to activate their Seagate Recovery Software Premium; it just said to try again later.

But during the night's manual Windows transfer, the "Current Pending Sector Count" and "Uncorrectable Sector Count" went from yellow to blue in CDI.
1694494675196.png
 
But during the night's manual Windows transfer, the "Current Pending Sector Count" and "Uncorrectable Sector Count" went from yellow to blue in CDI.
That shouldn't happen. Drives don't magically get better. I think your drive might be fine. CrystalDiskInfo might be having a glitch.

Have you tried any other utilities to check the SMART data? Something like HWiNFO (https://www.hwinfo.com/), for example?
 
That shouldn't happen. Drives don't magically get better. I think your drive might be fine. CrystalDiskInfo might be having a glitch.

Have you tried any other utilities to check the SMART data? Something like HWiNFO (https://www.hwinfo.com/), for example?
You mean:
1694495357672.png


I haven't run the full version before; I only use "Sensors only" for temperature measurements.

Plus, Seagate's tool activated now, like a day after I received it in an email; that's not really helpful :roll:
Yesterday:
14fc87cd-9e65-4d96-bd19-0392b268995b.jpg

Today:
Screenshot 2023-09-12 070503.jpg

But I did manually move all my data from my X16 to my X18 drive without error, no need to retry, only with the speed dipping from 250-270 MB/s to 10.x MB/s sometimes.
 
You mean:
1694495357672.png


I haven't run the full version before; I only use "Sensors only" for temperature measurements.
Yeah, that's the one. I retract my previous statement; your drive clearly has issues. RMA is the correct course of action.

But I did manually move all my data from my X16 to my X18 drive without error, no need to retry, only with the speed dipping from 250-270 MB/s to 10.x MB/s sometimes.
Consider yourself lucky! A lot of times when a drive fails (or starts to fail), the user is unable to recover their data.
 
Yeah, that's the one. I retract my previous statement; your drive clearly has issues. RMA is the correct course of action.
I wouldn't trust a drive that goes bad and then good again, because it's not a bad USB enclosure or SATA cable.

I just think the software activation is a joke; it took from one day to the next to be able to activate it.

Consider yourself lucky! A lot of times when a drive fails (or starts to fail), the user is unable to recover their data.
I am usually alright getting my data out in time. Like last time, when my 4TB drive was going bad, the only data I couldn't save I actually had on another drive, so that was fine.
 
I wouldn't trust a drive that goes bad and then good again, because it's not a bad USB enclosure or SATA cable.
I'd be wary of this; USB to SATA introduces another layer that we can't really predict. Idk of any data regarding these controllers and failure rates.

Might try one of these instead of the Windows tool.


Idk who said that SMART data is trivial with HDDs and therefore you can use any software, but IME that's not true. A lot of the data is vendor-bound.
 
Wholeheartedly agree with you. Drive utilities should be included with the drive and be DRM free. Seagate is very foolish about that...
WD does have their own version, which just requires a WD drive in order to work.

I can see why with Seagate: their tool supports other brands too, so maybe that's why.

But one question: do I inform Seagate about this new change, or let them discover it themselves?

I am gonna get on chat with them later today to say I have now moved all my data away, and if I need to zero it, it will take over a week; that's way too long o_O
 
That shouldn't happen. Drives don't magically get better. I think your drive might be fine. CrystalDiskInfo might be having a glitch.

Have you tried any other utilities to check the SMART data? Something like HWiNFO (https://www.hwinfo.com/), for example?
Pending sectors can. It just means sectors went from "being recovered because they're failing" to "fully recovered, remapped, and marked as bad".
 
That shouldn't happen. Drives don't magically get better.
It didn't, and nothing is wrong with CDI. All 3 fields changed: the total (that stat is still yellow), the current pending, and the total uncorrectable. The latter two went blue because the sectors were successfully remapped.

btw you can change the data readout so it’s more human readable.
 
WD does have their own version, which just requires a WD drive in order to work.

I can see why with Seagate: their tool supports other brands too, so maybe that's why.

But one question: do I inform Seagate about this new change, or let them discover it themselves?

I am gonna get on chat with them later today to say I have now moved all my data away, and if I need to zero it, it will take over a week; that's way too long o_O

It shouldn't make a difference to them. You can see your number of reallocated sectors increased from the first screenshot to the second, as the bad sectors shown in the first photo were successfully remapped.
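For anyone who wants to watch those exact counters outside of CDI, here is a rough sketch using smartctl from smartmontools (assuming a SATA/ATA drive that exposes the standard attribute table and reports these raw values as plain integers; the device path is only an example and the call usually needs root):

```python
import re
import subprocess

# The SMART attributes discussed in this thread.
WATCHED = {5: "Reallocated_Sector_Ct",
           197: "Current_Pending_Sector",
           198: "Offline_Uncorrectable"}

def read_watched_attributes(device: str = "/dev/sda") -> dict:
    """Return raw values of the watched attributes from `smartctl -A`."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=False).stdout
    values = {}
    for line in out.splitlines():
        # Attribute rows start with the numeric ID and end with the raw value.
        m = re.match(r"\s*(\d+)\s+(\S+)\s+.*\s(\d+)\s*$", line)
        if m and int(m.group(1)) in WATCHED:
            values[WATCHED[int(m.group(1))]] = int(m.group(3))
    return values

if __name__ == "__main__":
    # Example output: {'Reallocated_Sector_Ct': 8, 'Current_Pending_Sector': 0, ...}
    print(read_watched_attributes())
```

A growing Reallocated_Sector_Ct alongside a Current_Pending_Sector that drops back to zero is consistent with what happened here: the pending sectors were remapped, not healed.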
 
Pending sectors can. It just means sectors went from "being recovered because they're failing" to "fully recovered, remapped, and marked as bad".
Ah, that's right. I forgot about that functionality.

It didn't, and nothing is wrong with CDI. All 3 fields changed: the total (that stat is still yellow), the current pending, and the total uncorrectable. The latter two went blue because the sectors were successfully remapped.
Right, I did retract my statement. With HWiNFO showing similar data, it was a safe conclusion that the OP's drive is dying.
 
Anyway, whether it's still good or not doesn't matter; if you have these faults in the SMART table, it's a bad sign. It's a pre-failure mechanism to warn you that there is something wrong with the hardware. Ultimately it can keep working for some years, or even just some months. It's possible a drive like this just suddenly stops working, and everything is LOST.

Sure, you can ignore all warnings from reallocated sectors and so on; it's the same as removing the LED from your car's oil warning light. Then there's no problem any more. But Murphy's law will catch up with you, keep that in mind. I have been doing this business long enough, and I have seen people crying because all their wedding or baby pictures were suddenly lost. I told them to replace the drive, but they said: Hey, it's still working, yes? So what can I do, clients are king.
 
Drive letter P:

How many drives do you have on that machine?
Does it matter how many drives I have?
Plus you can look that up in my system specs if you really want to.

My X670 board has 6x SATA ports and 3x M.2, so there can only be 9 in total if I don't split the drives up into more than one partition.


Yesterday I was on the Seagate chat again, and the agent there got my CDI screenshots of the drive and was fine with them. I told him that was what I managed to capture, since SeaTools didn't work with the drive, and he said that CDI was totally fine.

Now I am just waiting on a label from Seagate to ship my drive in, and damn they are slow. Plus, the place where I purchased my X18 drive is looking into why I got a drive that was part of a system, since their shop listed it as a brand-new drive, not an OEM, whitelist, recertified, or refurbished drive, or one they had taken out of a system build.
 
Does it matter how many drives I have?
Not really, he was just curious. Shrek is an inquisitive type of person, he doesn't mean any offense or harm.

Yesterday I was on the Seagate chat again, and the agent there got my CDI screenshots of the drive and was fine with them. I told him that was what I managed to capture, since SeaTools didn't work with the drive, and he said that CDI was totally fine.
That's good!
 