On September 26th 2011, ProsePoint Express suffered its first major unscheduled outage.
An intermittently faulty disk caused the service to go down for 30 minutes. In the scramble that followed, we quickly restored the service without really knowing the root cause. After more investigation, we concluded that we could not be confident ProsePoint Express would keep running without further interruption, so we decided to conduct emergency maintenance to replace the faulty disk (actually, we replaced the server containing the faulty disk, but the effect is the same). That resulted in an additional 50 minutes of unscheduled outage.
After the emergency maintenance, ProsePoint Express returned to normal operation.
This outage was a rude shock.