Have you ever been chatting or gaming on Discord when the server suddenly goes down? Stressful, am I right? I have been there, and as a mod it hits even harder. The good news is that you can handle downtime and still come across as a mod in complete control. Here is how I do it.
Look For Early Signs
Spotting problems early is the best approach. Servers usually show subtle warning signs before they go down. Keep an eye on:
• Bots behaving oddly – Sudden errors usually indicate a problem.
• Slow server performance – Messages that lag or members who struggle to connect are often the first sign of trouble.
• Maintenance notifications – Discord will sometimes announce planned downtime.
If you pay attention to early signs, you will have the chance to act before things get worse. Have you ever seen the “Server Status: Partial Outage” notification? That is a sign it is time to investigate.
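If you would rather not refresh the status page by hand, the check can be scripted. The Python below is a minimal sketch, not an official tool. It assumes that Discord's status page serves the standard Statuspage JSON at the URL shown, with a `status.indicator` field that reads `none` when everything is fine. Check both assumptions yourself before relying on it. The function names are made up for this example.

```python
import json
from urllib.request import urlopen

# Assumption: the public status page exposes a Statuspage-style summary here.
STATUS_URL = "https://discordstatus.com/api/v2/status.json"


def needs_attention(payload: dict) -> bool:
    """Return True unless the payload clearly reports that all is well."""
    # A missing field is treated as "unknown", which also counts as a warning.
    indicator = payload.get("status", {}).get("indicator", "unknown")
    return indicator != "none"


def fetch_status(url: str = STATUS_URL, timeout: float = 10.0) -> dict:
    """Download and decode the status JSON (this makes a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A mod could run `needs_attention(fetch_status())` every few minutes and start investigating whenever it returns `True`. Treating an unreadable payload as a warning is deliberate: during an outage, a false alarm costs less than a missed one.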
Tell Others ASAP
When there is downtime, silence is the worst response. Users will panic if they don't know what is going on. I always:
- Post a short update in a popular channel.
- Keep the message friendly and honest, e.g. "Hey everyone, we're having issues with the server. We are fixing it!"
- Use status bots or pinned messages for updates.
People value honesty. A simple message can go a long way.
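If your team posts updates through a webhook, the message can be prepared ahead of time. The sketch below is one possible approach, assuming you have created a webhook for your announcement channel. `build_update` and `post_update` are hypothetical helper names, and the wording of the message is only an example. Discord webhooks accept a JSON body with a `content` field. The custom User-Agent header is there because some services reject the default one sent by Python.

```python
import json
from urllib.request import Request, urlopen


def build_update(summary: str, next_update_minutes: int) -> dict:
    """Build a webhook payload carrying a short, honest status message."""
    content = (
        f"Hey everyone, {summary} "
        f"We are working on it and will post again in about "
        f"{next_update_minutes} minutes."
    )
    return {"content": content}


def post_update(webhook_url: str, payload: dict) -> int:
    """POST the payload to a webhook URL and return the HTTP status code."""
    request = Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Hypothetical client name; any descriptive value works.
            "User-Agent": "downtime-notice/1.0",
        },
        method="POST",
    )
    with urlopen(request, timeout=10) as response:
        return response.status
```

For example, `post_update(url, build_update("we're having issues with the bots.", 30))` would send the notice, where `url` is your own webhook address. This only helps while Discord itself still accepts messages. During a platform-wide outage you need a channel outside Discord.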
Stay Calm!
Mods, please stay calm! If you panic, things will get worse. Here's what I do:
• Delegate - Have one person check messages, another check the bots, and a third monitor server status.
• Have backups - Backup bots or a second admin account help when things go sideways more than once, as long as you are in the right headspace to use them.
• Note your incidents - Write what happened in the moderation log so that next time you don't have to figure it out from scratch.
You shouldn't have to do everything by yourself. It is really useful to have a team that helps you plan your moderation.
Learn For Next Time
After the server is back to normal, take time to decompress and review the incident:
- Was this issue preventable?
- Which tools or bots failed?
- How can we improve next time?
Keeping a log of your downtimes will help you respond better in the future.
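A downtime log does not need to be fancy. As a rough sketch, the review questions above could be stored as one JSON record per line in a plain text file. The field names, the function names, and the file layout here are all assumptions made for illustration, not a standard format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_incident(path, what_failed, preventable, improvement):
    """Append one incident record to the log file and return it."""
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "what_failed": what_failed,      # which tools or bots failed
        "preventable": preventable,      # was this issue preventable?
        "improvement": improvement,      # how can we improve next time?
    }
    with Path(path).open("a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")
    return record


def read_incidents(path):
    """Return all logged incidents, oldest first (empty if no log exists)."""
    log_path = Path(path)
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]
```

After an incident, a call such as `log_incident("incidents.jsonl", "music bot", True, "add a backup bot")` records the review, and `read_incidents("incidents.jsonl")` lets the team look back over past downtimes. Appending one line per incident keeps the file easy to read and hard to corrupt.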