Analysis of our recent downtime:

Posted: March 15th, 2013, 3:26 pm
by robotman
Since the nature of Scioly.org is sharing information, I thought it would be right to explain the causes of our recent downtime, as well as the steps taken to fix the problems.
March 6th:
4 PM:
It started when I began a backup of the wiki and forum databases. The backup attempt ended with a MySQL error. Because of this (and the fact that it seemed I had a MySQL update sitting in the update queue), I ran a full server update. This, however, was not my smartest move: the update caused MySQL to crash and it could not be revived, so the board and the wiki were both down.
Once I realized that MySQL had crashed, I backed up the databases and the necessary files (necessary being tests and images). With backups complete (they took the most time of this whole process), I started work on the MySQL database. Attempts at installation failed due to a package error in Ubuntu Server Edition. Attempts at installation via separate packages failed with out-of-disk-space errors. At this point I decided that the way to return scioly.org as quickly as possible was to wipe our server and start from scratch.
At 9 PM CST I started that process (yes, backups took that long; halfway through I realized there was a more efficient method, but I was already committed). I started with restoring the board. I installed phpBB from scratch, and then dealt with uploading content. By 12:03 AM I had restored the board to working order, and then focused on getting the wiki back in shape. I updated our MediaWiki install to the latest version (that will come in later) and had it back up with most of its content by 2:15 AM.

March 7th:
At 2:15 AM I had fixed the wiki, minus the theme and tests, and hit the sack in order to make it to school. At 7 AM I started uploading the backed-up tests to the wiki, which should have been complete by midday.

March 8th to March 15th:
The server was mostly up; however, due to a slight misconfiguration of Apache, the server would crash under enough load. Because I was on vacation, I had limited internet access, and I managed to fix the server crashing on, I believe, Sunday the 10th or Monday the 11th.
However, I had to leave email down, as well as wiki login.

March 16th:
I restored email functionality and set up the administrator emails again.
I fixed (and improved) wiki login, it is now a Single Sign In for both wiki and forum.
I have been working on formatting issues with the wiki templates.

If you have any questions (or complaints), feel free to ask (or inform me).

Re: Analysis of our recent downtime:

Posted: March 15th, 2013, 3:31 pm
by mnstrviola
robotman wrote: I fixed (and improved) wiki login, it is now a Single Sign In for both wiki and forum.
Yay! This is much more convenient :)

Re: Analysis of our recent downtime:

Posted: March 18th, 2013, 10:53 am
by twototwenty
The wiki is still giving me issues: the single sign-in is not working at all for me, and when I try to log in on the wiki, I get this:
No such special page
You have requested an invalid special page.

A list of valid special pages can be found at Special pages.

Return to Main Page.

Re: Analysis of our recent downtime:

Posted: March 18th, 2013, 11:33 am
by Luo
twototwenty wrote: The wiki is still giving me issues: the single sign-in is not working at all for me, and when I try to log in on the wiki, I get this:
No such special page
You have requested an invalid special page.

A list of valid special pages can be found at Special pages.

Return to Main Page.
This was confusing to me at first too (I encountered the same thing), but simply don't try to log in on the wiki at all. If you're logged in on the forums and you try to edit a wiki page, it should work. Let me know if it doesn't.

Re: Analysis of our recent downtime:

Posted: March 19th, 2013, 9:09 pm
by astro124
I'm sorry but I don't really understand a lot of computer stuff (especially programming).

What is 'MYSQL'?

Re: Analysis of our recent downtime:

Posted: March 19th, 2013, 10:07 pm
by iwonder
It's a database engine; basically, it's the program that stores all the posts, wiki pages, and things of that nature.

Re: Analysis of our recent downtime:

Posted: March 20th, 2013, 10:16 am
by Infinity Flat
Just FYI, some of the tests on the test exchange are still unavailable.

Re: Analysis of our recent downtime:

Posted: March 20th, 2013, 11:01 am
by foreverphysics
Which ones?

Re: Analysis of our recent downtime:

Posted: March 20th, 2013, 11:19 am
by Infinity Flat
foreverphysics wrote: Which ones?
This was just something someone pointed out to me last night. The only ones I'm aware of at the moment are for experimental design: the Conestoga 2012 and Wright State Invitational tests.