• SpikesOtherDog@ani.social
    5 months ago

    Yes. A worldwide service provider should be able to achieve at least four nines of uptime. That’s 99.99% availability, or just under 53 minutes of downtime a year. That’s accomplished through best practices: redundancy, planned maintenance, and solid disaster recovery plans.
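
    To sanity-check the downtime figure, here is a quick sketch of the arithmetic in Python. The function name is my own, and it assumes a plain 365-day year:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Yearly downtime budget, in minutes, for an availability of N nines."""
    unavailability = 10 ** -nines  # 4 nines -> 0.0001, i.e. 99.99% available
    return MINUTES_PER_YEAR * unavailability


# Print the budget for two through five nines.
for n in range(2, 6):
    print(f"{n} nines: {downtime_minutes_per_year(n):8.1f} min/year")
```

    Each extra nine cuts the allowed downtime by a factor of ten, so four nines works out to roughly 52.6 minutes a year.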

    The ways to achieve a disaster of this magnitude include:

    • No hot spares
      • A security event has locked all redundant servers and they are now rebuilding servers from backup.
    • Lack of effective redundancy
      • A disaster has occurred at one data center, and the load shifted onto the remaining servers is making them unresponsive
        • This is unlikely because there would be intermittent reports of success
    • Poor patching management
      • Patches were sent to all servers without proper testing or a rollback strategy
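
    For the patching case, a staged rollout with a rollback path is the usual guard. Below is a minimal sketch in Python, not anyone's actual tooling: `staged_rollout`, its callbacks (`apply_patch`, `is_healthy`, `rollback`), and the batch sizes are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, Sequence


def staged_rollout(
    servers: Sequence[str],
    apply_patch: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    batch_sizes: Sequence[int] = (1, 5, 25),  # assumed sizes, tune to taste
) -> bool:
    """Patch servers in growing batches; undo everything on the first failed check."""
    patched: list[str] = []
    remaining = list(servers)
    sizes = list(batch_sizes)
    while remaining:
        # Once the listed sizes run out, take everything that is left.
        size = sizes.pop(0) if sizes else len(remaining)
        batch, remaining = remaining[:size], remaining[size:]
        for server in batch:
            apply_patch(server)
            patched.append(server)
        if not all(is_healthy(s) for s in batch):
            # Roll back in reverse order, newest patch first.
            for server in reversed(patched):
                rollback(server)
            return False
    return True
```

    The point of the small first batch is that a bad patch takes out one server instead of the whole fleet, and the servers that were never reached stay untouched.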