When availability is computed as the percentage of time a system is fully functional, the result is typically a number such as 99.987%. Systems are often coarsely classified by the number of leading nines in this figure (here, three nines). One sometimes hears informal descriptions such as “almost four nines”, which would be appropriate for this example.
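As a rough illustration of the classification, the leading nines can be counted digit by digit. This is only a sketch: the function name `leading_nines` is hypothetical, and it assumes the availability is supplied as a percentage string so that decimal arithmetic avoids floating-point rounding at exact boundaries such as 99.9%.

```python
from decimal import Decimal


def leading_nines(percent: str) -> int:
    """Count the leading nines in an availability given as a percentage string.

    Hypothetical helper for illustration; "99.987" means 99.987%.
    """
    fraction = Decimal(percent) / Decimal(100)
    if not (Decimal(0) <= fraction < Decimal(1)):
        raise ValueError("availability must be at least 0% and below 100%")

    count = 0
    while True:
        fraction *= 10
        digit = int(fraction)      # next decimal digit of the availability
        if digit != 9:
            break                  # first non-nine digit ends the run
        count += 1
        fraction -= digit          # drop the digit just examined
    return count


if __name__ == "__main__":
    for value in ("99", "99.9", "99.987", "99.9995"):
        print(value, "->", leading_nines(value), "nines")
```

With this sketch, 99.987% counts as three nines, matching the example in the text.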

Going from a system rated at, say, three nines to one rated at four nines usually involves considerable effort and expense, because each additional nine requires reducing the expected (average) downtime by a factor of 10. It is far easier, however, to go from two nines to three nines than from, say, four nines to five nines: each further factor of 10 is harder to achieve than the last. The following table for a 7-by-24 system (8,760 hours per year) indicates why:

Availability    Nines          Expected downtime per year
0.99            two nines      88 hours
0.999           three nines    8.8 hours
0.9999          four nines     53 minutes
0.99999         five nines     5.3 minutes
0.999999        six nines      32 seconds
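The figures in this table follow from multiplying the unavailable fraction (1 minus availability) by the hours in a year. A minimal sketch of that arithmetic, assuming a 365-day year of 8,760 hours; the names `HOURS_PER_YEAR_24X7` and `expected_downtime_hours` are illustrative, not from any library:

```python
HOURS_PER_YEAR_24X7 = 365 * 24  # 8760 hours, assuming a non-leap year


def expected_downtime_hours(availability: float,
                            service_hours: float = HOURS_PER_YEAR_24X7) -> float:
    """Expected (average) downtime in hours per year for a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * service_hours


# Print the downtime for each availability level in hours, minutes and seconds.
for availability in (0.99, 0.999, 0.9999, 0.99999, 0.999999):
    hours = expected_downtime_hours(availability)
    print(f"{availability:<9} {hours:10.4f} h {hours * 60:10.2f} min {hours * 3600:10.0f} s")
```

The table rounds these results, so 87.6 hours appears as 88 hours and 52.56 minutes as 53 minutes.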

If a business is open only 200 days per year because of holidays and weekends, and its systems are expected to operate only 8 hours per business day, the target is somewhat easier to meet. The system then needs to be fully functional for only 1,600 hours per year. Maintenance and repairs can, for example, be done during off hours without affecting the availability rating. The downtime table becomes:

Availability    Nines          Expected downtime per year
0.99            two nines      16 hours
0.999           three nines    1.6 hours
0.9999          four nines     9.6 minutes
0.99999         five nines     58 seconds
0.999999        six nines      5.8 seconds
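The same calculation can be restricted to business hours. The sketch below assumes the 200-day, 8-hour schedule described above and picks a readable unit for each result; `downtime_budget_hours` and `format_duration` are hypothetical helper names introduced only for this example.

```python
def downtime_budget_hours(availability: float,
                          open_days: int = 200,
                          hours_per_day: float = 8.0) -> float:
    """Expected downtime per year, counting only hours the business is open."""
    service_hours = open_days * hours_per_day  # 1600 hours for the defaults
    return (1.0 - availability) * service_hours


def format_duration(hours: float) -> str:
    """Render a duration in hours, minutes, or seconds, whichever reads best."""
    if hours >= 1.0:
        return f"{hours:.1f} hours"
    minutes = hours * 60.0
    if minutes >= 1.0:
        return f"{minutes:.1f} minutes"
    return f"{minutes * 60.0:.1f} seconds"


# Reproduce the business-hours table for two through six nines.
for availability in (0.99, 0.999, 0.9999, 0.99999, 0.999999):
    budget = downtime_budget_hours(availability)
    print(f"{availability:<9} {format_duration(budget)}")
```

Passing different `open_days` and `hours_per_day` values shows how quickly the downtime budget shrinks or grows with the service window.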

For a business operating 8 hours per day, 200 days per year, it is easier and less expensive to design a system with very low downtime DURING BUSINESS HOURS, since maintenance and many repairs can be done during off hours.

Note that “expected downtime” is an average taken over several years and over many similar systems. It is definitely NOT a maximum downtime! Typical “nines” ratings also tend to exclude disasters such as fires, earthquakes, and acts of war. Designing for these contingencies requires special consideration, with backup and failover to remote sites.

