
Greetings all,

As I'm sure all of you are aware, we had a severe network outage last night that impacted pretty much every technology service we have. While we are still piecing together what happened, what we know is this:

1) The first symptoms that we've been able to track down occurred shortly after 2 AM this morning.
2) The primary core switch/router that connects servers to the rest of campus had one of its two power supplies without power when we arrived at 7 AM this morning.
3) One of the groups of ports on that switch appeared to be powered down.
4) Our notification system, which is supposed to send us text messages in the event of a failure, didn't do so.

We think we have the system restored at this point. If folks in your area are still experiencing issues, please have them reboot their computer (if they haven't tried that since 8:30 AM) and then call the help desk if they are still having problems.

From what we can see, this appears to have been a cascading failure - a perfect storm, if you will. The power supply that was off in the primary switch has a redundant unit that should have handled the load but, for some reason, appears not to have. The primary switch has a secondary switch that is supposed to take over seamlessly, and that didn't happen either. The notification system, which is designed to be largely isolated from other systems so that it functions even during a severe issue, either had an independent problem at exactly the wrong time or was impacted by the big failure when it shouldn't have been.

Rest assured that we'll be poring over logs from many systems and engaging in forensic exercises to figure out exactly what happened so that we can design the system to be more resilient.

Thank you for your patience, and we're sorry for the interruption to your work day.

In other news, Happy Vernal Equinox!

JD
Posted on: Thu, 20 Mar 2014 13:53:11 +0000
