Keenan

Celebration outage - Postmortem


Around 11am CEST the Celebration server experienced an outage.

 

The cause:

MySQL 8 retains binary log files for 30 days by default, and in our case those files are substantial in size. They filled up the data drive that stores Celebration's game data, which caused errors in both map saving and the database. The server seems to have shut down cleanly as soon as this occurred.

 

The resolution:

We've reduced the binary log retention period from its default and increased the size of the drive. We do not need 30-day retention of binary log files because of our server backup method.
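For anyone curious what that change looks like, here is a minimal sketch, not our actual configuration. In MySQL 8 the retention is controlled by the `binlog_expire_logs_seconds` system variable, whose default of 2592000 seconds is the 30 days mentioned above. The helper name `retention_statement` and the 3-day value are assumptions made up for illustration; the value we actually chose is not stated here.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_statement(days: int) -> str:
    """Build a MySQL 8 statement that sets binary log retention to `days` days.

    Hypothetical helper for illustration only; it just does the
    days-to-seconds arithmetic and formats the statement.
    """
    if days < 0:
        raise ValueError("retention must not be negative")
    seconds = days * SECONDS_PER_DAY
    # SET PERSIST keeps the setting across server restarts.
    return f"SET PERSIST binlog_expire_logs_seconds = {seconds};"


# The MySQL 8 default of 30 days, expressed in seconds.
print(retention_statement(30))
# An assumed shorter retention of 3 days (illustrative value only).
print(retention_statement(3))
```

The printed statement would be run by an account with sufficient privileges on the MySQL server; how short the retention can safely be depends on how the backups are taken.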

 

Other notes:

Max was unable to restore services due to the nature of the outage, and I could not be reached before 8am Eastern US time. I will be working with Max on a direct communication method that will cut through my phone's Do Not Disturb hours so that I can be reached.

 

Due to the duration of the outage, we enabled sleep bonus upon the restart, for Celebration only.


I have to confess I tutted a few times. But as I had completely run out of sleep bonus and now have 5 hours, I withdraw all my earlier tuts.


Clearly @Keenan shouldn't sleep. Must be available at all hours of the day to attend to our needs.

 

Take away his sleep powders!!!

Edited by Angelklaine

Wouldn't we need to give him more "sleep powder" so he doesn't need to sleep?

