Analysis of our recent downtime:
Posted: March 15th, 2013, 3:26 pm
Since the nature of Scioly.org is sharing information, I thought it would be right to explain the causes of our recent downtime, as well as the steps taken to fix the problems.
March 6th:
4 PM:
It started when I began a backup of the wiki and forum databases. The backup attempt ended with a MySQL error. Because of this (and the fact that it seemed I had a MySQL package sitting in the update queue), I ran a full server update. This was not my smartest move: the update caused MySQL to crash and it could not be revived, so the board and the wiki were both down.
Once I realized that MySQL had crashed, I backed up the databases and the necessary files (tests and images). With backups complete (they took the most time of this whole process), I started work on the MySQL database. Attempts at installation failed due to a package error in Ubuntu Server Edition. Attempts at installation via separate packages failed with out-of-disk-space errors. At this point I decided that the way to return scioly.org as quickly as possible was to wipe our server and start from scratch.
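For anyone running a similar setup, an out-of-disk-space failure like the one described here is the kind of thing a quick preflight check can catch before a backup or package install begins. The following is only a minimal sketch in Python, not what was actually run on the Scioly.org server: the function name has_enough_space, the safety factor, and the example path and size are all hypothetical.

```python
import shutil


def has_enough_space(path: str, required_bytes: int, safety_factor: float = 1.5) -> bool:
    """Return True if the filesystem holding `path` can fit the planned write.

    `required_bytes` is an estimate of the backup or install size; the
    hypothetical `safety_factor` leaves headroom for temporary files.
    """
    if required_bytes < 0:
        raise ValueError("required_bytes must be non-negative")
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes * safety_factor


if __name__ == "__main__":
    # Hypothetical numbers: a 2 GiB database dump written to the current directory.
    estimated_dump_size = 2 * 1024**3
    if has_enough_space(".", estimated_dump_size):
        print("Enough free space, safe to start the backup.")
    else:
        print("Not enough free space, free some up before starting.")
```

Running a check like this before the update would not have prevented the MySQL crash itself, but it would have flagged the disk-space problem before the reinstall attempts.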
At 9 PM CST I started that process (yes, backups took that long; halfway through I realized there was a more efficient method, but I was already committed). I started with restoration of the board. I installed phpBB from scratch, and then dealt with uploading content. By 12:03 AM I had restored the board to working order, and then focused on getting the wiki back in shape. I updated our MediaWiki install to the latest version (that will come in later) and had it back up with most of its content by 2:15 AM.
March 7th:
At 2:15 AM I had fixed the wiki, minus the theme and tests, and hit the sack in order to make it to school. At 7 AM I started uploading the backed-up tests to the wiki, which should have been complete by midday.
March 8th to March 15th:
The server was mostly up; however, due to a slight misconfiguration of Apache, the server would crash under enough load. Because I was on vacation, I had limited internet access, and I managed to fix the crashing on, I believe, Sunday the 10th or Monday the 11th.
However, I had to leave email down, as well as wiki login.
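The post does not record which Apache setting was misconfigured. One common cause of crashes under load is allowing more worker processes than the machine has memory for (the MaxClients directive in Apache 2.2, renamed MaxRequestWorkers in 2.4). Purely as an illustration, and assuming that was the kind of problem involved, a rough sizing calculation might look like this in Python. The function name and all memory figures are hypothetical, not measurements from the Scioly.org server.

```python
def max_workers(total_ram_mb: int, reserved_mb: int, per_worker_mb: int) -> int:
    """Estimate how many Apache worker processes fit in memory.

    total_ram_mb   -- physical memory on the server
    reserved_mb    -- memory set aside for MySQL, the OS, and everything else
    per_worker_mb  -- typical resident size of one Apache/PHP worker
    All three values are assumptions the administrator has to measure.
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return less than one worker, even on a badly overcommitted box.
    return max(1, usable_mb // per_worker_mb)


if __name__ == "__main__":
    # Hypothetical server: 2048 MB RAM, 512 MB reserved, 40 MB per worker.
    print(max_workers(2048, 512, 40))  # prints 38
```

A limit set well above such an estimate lets the server swap or run out of memory under load, which matches the "fine until busy" symptom, though other misconfigurations can produce the same behavior.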
March 16th:
I restored email functionality and set up the administrator emails again.
I fixed (and improved) wiki login; it is now a single sign-in for both wiki and forum.
I have been working on formatting issues with the wiki templates.
If you have any questions (or complaints), feel free to ask (or inform me).