HP3000-L Archives

June 2004, Week 2

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
David Kinsell <[log in to unmask]>
Reply To:
David Kinsell <[log in to unmask]>
Date:
Sun, 13 Jun 2004 15:35:22 -0500
Content-Type:
text/plain
Parts/Attachments:
text/plain (67 lines)
"Bruce Collins" <[log in to unmask]> wrote in message news:YNmdnejE5suy_1rdRVn-jw@fidnet.com...
> Bruce Senn wrote:
>
> So, a rough calculation gets me
>
> Hours in 3 months = 24*90 = 2160 = approx. 2000
>
> If I "run" 500 drives for 3 months and have 1 failure I can claim a
> 1,000,000 MTBF
>
> 2000*500/1 = 1000000
>
> ------------------------------------------------------------------------
>
> This is probably how the manufacturers come up with these numbers, which look
> good but are only loosely based in reality. The problem is that with only one
> failure the confidence limits aren't very good. (If they ran 1 drive for 1
> hour and had zero failures they could claim an infinite MTBF). There would
> be better confidence in this number if for example they ran 500 drives for
> 12 months and had 4 failures.
>
> At the URL below, the 95% confidence factors for one failure are 0.1795 and
> 39.4978, so all you can say is that the MTBF is between 179,500 and
> 39,497,800 hours.
>
> If the number were based on 4 failures, the 95% confidence interval would be
> between 390,600 and 3,670,200 hours.
>
> http://www.itl.nist.gov/div898/handbook/apr/section4/apr451.htm
>
> As you can see, the numbers are often ballpark figures at best, which has
> something to do with why designing for safety tends to have backups on
> backups, and why safety engineers wear both suspenders and a belt.
>
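
The arithmetic in the quoted message can be checked with a short Python sketch. It assumes the constant-failure-rate (exponential) model that the NIST page is about, and uses a standard chi-square formula for two-sided MTBF bounds that reproduces the quoted factors. The function names are hypothetical, and the bisection-based quantile is only a dependency-free stand-in for a statistics library.

```python
import math


def chi2_cdf_even(x, dof):
    """CDF of a chi-square variable whose degrees of freedom are even.

    For dof = 2k the CDF has the closed form
    1 - exp(-x/2) * sum_{i<k} (x/2)^i / i!
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = math.exp(-half)  # i = 0 term of the sum
    total = 0.0
    for i in range(dof // 2):
        total += term
        term *= half / (i + 1)
    return 1.0 - total


def chi2_quantile_even(p, dof):
    """Invert chi2_cdf_even by bisection (assumes 0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mtbf_factors(failures, confidence=0.95):
    """Multipliers for the point-estimate MTBF, two-sided bounds.

    Assumes at least one failure and a constant failure rate.
    """
    if failures < 1:
        raise ValueError("needs at least one observed failure")
    alpha = 1.0 - confidence
    r = failures
    lower = 2 * r / chi2_quantile_even(1 - alpha / 2, 2 * r + 2)
    upper = 2 * r / chi2_quantile_even(alpha / 2, 2 * r)
    return lower, upper


# Point estimate from the quoted example: 500 drives, about 2000 hours, 1 failure.
mtbf_estimate = 2000 * 500 / 1

for r in (1, 4):
    low, high = mtbf_factors(r)
    print(r, "failure(s):", round(mtbf_estimate * low), "to",
          round(mtbf_estimate * high), "hours")
```

Both calls scale the same 1,000,000-hour estimate, which is what the quoted bounds for 4 failures assume (roughly 500 drives for 12 months). The interval narrows only because more failures were observed.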


If you read the introduction on that web page, you'll notice that they
say the information applies to systems which exhibit a fairly constant
failure rate.  Disk drives, which are routinely used until they exhibit
well-known wearout modes, are not modeled at all well by the use of
these techniques.  Disk drive vendors provide MTBF figures because naive
customers demand to see outrageously big numbers, and make purchase
decisions based on them.  The figures have no meaning, and are not
predictive of average drive lifetimes, which is what our OP was interested in.
A drive with a claimed million hour MTBF may or may not be more
reliable than one that claims half a million.
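
As a rough illustration of why a million-hour MTBF cannot be read as an average lifetime, the Python sketch below works out what the constant-failure-rate model itself would imply for such a figure. The 8760-hour year (365 days) and the function names are assumptions made for the example. It describes the idealized model only, not real drives, which wear out.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year, drives powered continuously


def mtbf_in_years(mtbf_hours):
    """Express an MTBF figure in years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def fraction_failed(hours, mtbf_hours):
    """Fraction of units expected to have failed by `hours`,
    under a constant failure rate (exponential model)."""
    return 1.0 - math.exp(-hours / mtbf_hours)


claimed_mtbf = 1_000_000  # hours, the claimed figure discussed above
service_life = 5 * HOURS_PER_YEAR

print("MTBF in years:", mtbf_in_years(claimed_mtbf))
print("Fraction failed within 5 years:",
      fraction_failed(service_life, claimed_mtbf))
```

Taken literally, the claimed figure corresponds to more than a century of continuous running, far beyond the 5-year service life that vendors quote. Under this model the number says something about the failure rate of a large population during its useful life, not about how long any one drive lasts.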

Disk vendors do provide "service life" numbers as an attempt to
provide some guidance as to average lifetimes.  These are generally
quoted at 5 years.  If you look at a Seagate manual for a current SCSI
drive, they say "depot repair or replacement of major parts" is permitted
in order to reach the 5 years.  More importantly, they do warranty those
drives for 5 years.  The warranty on ATA drives was dropped recently
from 3 years down to 1 year.  Does that mean SCSI drives are more
reliable than ATA drives?  No, they just charge a lot more for SCSI
drives and are willing to eat more warranty cost on them.

So the bottom line is: cross your fingers and hope some of your drives
make it to 5 years.  If you still believe million hour MTBF numbers have
significance, send me email, because there's a bridge in Brooklyn I'd
like to talk to you about.

-Dave

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
