• Jackthelad@lemmy.world
    5 months ago

    This hardly ever happens anymore. I remember the days of the PS3 when it felt like a weekly occurrence.

    • Domi@lemmy.secnd.me
      5 months ago

      Weren’t they down for ~7 hours just last year?

      Not saying it happens often, but downtime that long is unprofessional for a company that size.

      • SpikesOtherDog@ani.social
        5 months ago

        Yes. A worldwide service provider should be able to achieve at least four nines of uptime. That’s 99.99% availability, or a little under 53 minutes of downtime a year. That’s accomplished through best practices: redundancy, planned maintenance, and solid disaster recovery plans.
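
        For anyone who wants to check the arithmetic, here is a minimal sketch of the downtime budget implied by "N nines". It assumes a 365-day year, and the function name is just illustrative.

```python
# Downtime budget implied by an availability target of "N nines".
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(nines: int) -> float:
    """Allowed downtime per year, in minutes, for N nines of availability."""
    unavailability = 10 ** -nines  # e.g. 4 nines -> 0.0001
    return MINUTES_PER_YEAR * unavailability


for n in (2, 3, 4, 5):
    print(f"{n} nines: {downtime_budget_minutes(n):.2f} min/year")

# How many years of four-nines budget a single 7-hour outage consumes.
outage_minutes = 7 * 60
years_of_budget = outage_minutes / downtime_budget_minutes(4)
print(f"7-hour outage = {years_of_budget:.1f} years of four-nines budget")
```

        Four nines works out to about 52.56 minutes a year, so a single ~7 hour outage burns roughly eight years of that budget.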

        The ways to achieve a disaster of this magnitude include:

        • No hot spares
          • A security event has locked all redundant servers and they are now rebuilding servers from backup.
        • Lack of effective redundancy
          • A disaster has occurred at one data center, and the load shifted onto the remaining servers is making them unresponsive
            • This is unlikely because there would be intermittent reports of success
        • Poor patching management
          • Patches were sent to all servers without proper testing or a rollback strategy
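
        As a rough illustration of the patching point, this is what a staged rollout with a rollback path can look like. It is a sketch, not anyone's actual deployment tooling: `apply_patch`, `is_healthy`, and `rollback` are hypothetical callbacks you would supply, and the stage sizes are arbitrary.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    servers: Sequence[str],
    apply_patch: Callable[[str], None],   # hypothetical: patches one server
    is_healthy: Callable[[str], bool],    # hypothetical: post-patch health check
    rollback: Callable[[str], None],      # hypothetical: reverts one server
    stage_sizes: Sequence[int] = (1, 5),  # canary first, then a small batch
) -> bool:
    """Patch servers in growing batches instead of all at once.

    If any server in a batch fails its health check, every server patched
    so far is rolled back and the untouched servers are left alone.
    Returns True only if all servers were patched and stayed healthy.
    """
    patched: List[str] = []
    remaining = list(servers)
    sizes = list(stage_sizes)

    while remaining:
        # Once the explicit stages are used up, take everything that is left.
        size = sizes.pop(0) if sizes else len(remaining)
        batch, remaining = remaining[:size], remaining[size:]

        for server in batch:
            apply_patch(server)
            patched.append(server)

        if not all(is_healthy(server) for server in batch):
            for server in reversed(patched):
                rollback(server)
            return False

    return True
```

        The point is only that a bad patch should stop at the canary stage and be reverted, rather than reaching every server at once.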