Files1 Hard Disk Maintenance -- 8-12 December 2013 -- Done -- The files1 server has a failing hard drive that needs to be replaced and is currently causing minor random lag issues on the wiki. Wiki uploads must be temporarily disabled while the drive is replaced. No other service interruption is expected. This maintenance has been successfully completed (both hard drives in the server were replaced). Note that there were some intermittent issues during the maintenance period for logged in users.
There are a variety of methods that can be used to monitor the status of the servers making up the UESP.
- monitor.uesp.net is the web interface for the Zabbix monitoring software, which shows the basic status of all servers.
- content1 Apache status, content2 Apache status, and content3 Apache status show the Apache status for the content servers.
- files1 Lighttpd status shows the Lighttpd status for the files1 server.
The Zabbix monitoring can e-mail warnings and notifications about possible issues with any of the UESP servers. If you would like your e-mail to be included in the notification list please contact Daveh. Text notifications can also be supplied if your cell phone provider has a free "e-mail to text" service.
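The Apache status pages listed above can also be read by a script. The following is a minimal sketch, assuming the content servers expose Apache's mod_status machine-readable output (the `?auto` query string), which returns plain `Key: value` lines. The function names, the sample text, and any URL passed to `fetch_status` are hypothetical and for illustration only; they are not taken from the UESP configuration.

```python
from urllib.request import urlopen


def parse_status_auto(text):
    """Parse 'Key: value' lines from a mod_status '?auto' response into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not in 'Key: value' form
        fields[key.strip()] = value.strip()
    return fields


def fetch_status(url, timeout=10):
    """Download a status page (URL supplied by the caller) and parse it."""
    with urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return parse_status_auto(body)


def worker_utilization(fields):
    """Return the fraction of Apache workers that are currently busy."""
    busy = int(fields["BusyWorkers"])
    idle = int(fields["IdleWorkers"])
    total = busy + idle
    return busy / total if total else 0.0


# Made-up sample in the shape of a '?auto' response, not real UESP data.
SAMPLE = """Total Accesses: 1200
Uptime: 3600
BusyWorkers: 5
IdleWorkers: 15
Scoreboard: __W_K...
"""

if __name__ == "__main__":
    fields = parse_status_auto(SAMPLE)
    print(f"Busy workers: {fields['BusyWorkers']}")
    print(f"Utilization: {worker_utilization(fields):.0%}")
```

A script like this is only a convenience for quick checks; Zabbix remains the place to look for alerts and history.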
In Case of Site Issues
In case of site issues, including page display problems, lag, and service outages:
- Minor issues can be brought up on the Administrator Noticeboard or directly with a Wiki Admin if appropriate.
- Daveh (email@example.com) is the main contact for most site hardware and software issues.
- Other Wiki Admins can also be contacted if Daveh is unavailable, but they have limited access to fix anything.
- Files1 Hard Disk Maintenance -- 7-8 November 2013 -- Done -- The files1 server has a failing hard drive that needs to be replaced. Wiki uploads must be temporarily disabled while the drive is replaced. No other service interruption is expected. This maintenance has been successfully completed. Note that there were some intermittent issues during the maintenance period for logged in users.
- 4 March 2012 -- Database Hard Drive Issue -- Done -- Another hard drive issue on db1 will result in a short period of read-only (<5 min) for the wiki and forums sometime today while we switch the primary database to db2. No service interruption is expected.
- Database Hardware Replacement -- 1 February 2013 -- Done -- A hard drive is failing on our master database server so the wiki/forums will be locked for a few minutes sometime today while we switch servers. No downtime or service interruption is expected. The hard drive has been replaced and there will be another short period of lock time when we switch back to db1 in the next day or two.
- Database Issue -- 7:00-8:30 EST 9 December 2012 -- Resolved -- More intermittent load issues due to a misbehaving backup process on db1.
- Database Issue -- 7:30-8:30 EST 2 December 2012 -- Resolved -- The drive on the primary database filled up after the last full backup this morning, which led to various site outages. Free space on the drive has been restored, which resolved the issue.
- Wiki Upgrade -- 7-10:00 EST 3 September 2012 -- Done -- The Wiki will be set to read-only (no edits permitted) during most of this time in order to perform a long overdue upgrade of MediaWiki from 1.14 to 1.19. No service interruption is expected. The upgrade was completed successfully and, barring a few minor display issues, everything is back to normal.
- Wiki Database Restore -- 18-20:00 21 August 2012 -- The wiki database had to be restored which resulted in 1-2 hours of downtime for logged in users, some intermittent errors, and a lost half-day of edits.
- 20:00 EST 31 January 2012 -- Upcoming Planned Maintenance (Completed) -- One drive on db2 needs replacing. Currently waiting for db1 to be ready for live use before switching back over to db1 for writes. There will be a brief period of read-only access to the forums and wiki when the switch is made.
- 20:00 EST 13 December 2011 -- Planned Maintenance Completed -- All apps will be switched to using db2 as a read/write database in order to perform repairs on db1. There will be a minimal period (~5 minutes) of things being in read-only mode. Everything should now be running completely off of db2.
- 17:00 EST 9 December 2011, Forums Resolved -- Forums were partially inaccessible due to a corrupt database table. Table has been repaired and forums should be mostly operable again.
- 16:05 EST 2 December 2011, db2 Resolved -- 5 minute downtime due to db2 becoming unavailable. Issue temporarily bypassed and being looked into at the moment. Motherboard on server failed and was replaced. Content servers switched back to using db2 for wiki reads.
- Db1: 20 November 2011 -- One drive on the RAID-1 array failed and is awaiting replacement.
- Firewall: 11-18 November 2011 -- The firewall was hitting its connection limit of 10,000 due to the increased traffic from Skyrim's release. Fixed by both upgrading the firewall and using a temporary squid2 server outside of the cluster.
- Files1: 5 November 2011 -- Drive failed in a RAID array which took the server down. Drive replaced and server back up.
- Content3: 8 November 2010 -- content3 taken off-line to fix a random data corruption issue. The only effect will be no UESP mail and no access to the minor dave.uesp.net website. This does not affect registration e-mails sent by the UESP Wiki/Forums.
- All Sites: 5-8 November 2010 -- Register.com had issues which caused intermittent DNS requests for the UESP domains to fail. The effect was small and overall traffic was interrupted by less than 10% over the weekend.
- All Sites: 3 November 2010 -- iWeb had a major power failure that took out several servers at midnight for around an hour. All servers except content3 came back up when power was restored.
- Squid1: 27 Feb 2010, 21:30 EST -- Hard drive replaced and service restored. Waiting for DNS entries to update.