Is Office 365 down?
On Wednesday, January 27th, 2021, Microsoft Office 365 experienced an outage that affected a number of its services, with the most prolonged impact on Exchange Online.
An Exchange Online Outage That Also Impacted Other Services Like SharePoint Online
Various details could be found through Microsoft’s feeds and status notifications. For example, the @MSFT365Status Twitter feed, hosted in the CloudReady Dashboard alongside our own events, looked like this:
Here’s more detail from the resultant ticket in Service Health:
Title: Users may have been unable to access email in Exchange Online
User Impact: Users may have been unable to access email in Exchange Online.
More info: The problem occurred from all Exchange Online connection methods. Users may have been unable to utilize calendaring functionality within other Microsoft 365 services reliant on Exchange Online connectivity, such as Microsoft Teams. Further, affected users may have also been unable to search for SharePoint Online content.
Final status: We’ve rolled back the configuration change and our telemetry indicates the service availability has been restored. We understand the serious business impact caused when your email doesn’t work as expected and we will provide updates on our next steps in the Post Incident Report within five business days.
Scope of impact: Any user hosted on the affected infrastructure may have been unable to access email in Exchange Online.
Start time: Wednesday, January 27, 2021, 3:30 PM (8:30 PM UTC)
End time: Wednesday, January 27, 2021, 5:55 PM (10:55 PM UTC)
Root cause: A recently implemented configuration change intended to flush service cache under specific circumstances led to higher utilization of processing resources within the affected infrastructure, and caused the impact.
Next steps: We’re reviewing our deployment and provisioning procedures to determine why impact to Exchange Online wasn’t caught prior to deployment and to help prevent similar problems in the future. We’ll publish a post-incident report within five business days.
Just Exchange Online?
Although Microsoft indicated that only Exchange Online was affected during this outage, Exoprise’s Office 365 Monitoring detected that Azure Active Directory and dependent services like SharePoint and OneDrive were also affected at the time. The outage information described a rollout and a rollback, but we wouldn’t expect to see such a wide-scale outage and slowdown from a change affecting just some of the schema unless everything had to be taken offline.
Early Outage Detection of Office 365
Although Microsoft recorded the start of the Microsoft 365 outage event at approximately 3:30 PM (8:30 PM UTC), the Exoprise CloudReady platform started detecting poor Azure Active Directory (AAD) performance and related issues far earlier than that.
Exoprise detected the slowdown, Azure Active Directory errors, and other problems more than 2 hours before Microsoft reported the problem. This particular sensor was an Outlook Web App sensor, but you can also see the Exoprise Crowd-sourced Monitoring starting to spike globally at the same time. That crowd-sourced signal helps you tell when an outage is not just in your tenant but across Microsoft.
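To make the idea of an early indicator concrete, here is a minimal sketch of how a synthetic sensor’s response times could be checked against a recent baseline. This is not how CloudReady is implemented. The function name, the window size, the threshold factor, and the sample values are all assumptions chosen for illustration.

```python
from statistics import median


def detect_slowdown(samples, baseline_size=12, factor=3.0, consecutive=3):
    """Return the index where a sustained slowdown starts, or None.

    samples: probe response times in seconds, oldest first. A value of
    None marks a failed probe (for example, a sign-in error).
    All parameter defaults are illustrative assumptions, not vendor values.
    """
    if len(samples) <= baseline_size:
        return None  # not enough history to judge anything yet

    # Build the baseline from the successful probes in the first window.
    healthy = [s for s in samples[:baseline_size] if s is not None]
    if not healthy:
        return None
    threshold = factor * median(healthy)

    run = 0  # length of the current streak of bad probes
    for i in range(baseline_size, len(samples)):
        value = samples[i]
        if value is None or value > threshold:
            run += 1
            if run == consecutive:
                # Report the first probe of the streak, not the last.
                return i - consecutive + 1
        else:
            run = 0  # a single healthy probe resets the streak
    return None
```

Requiring several bad probes in a row is a deliberate trade-off: it delays the alert by a few probe intervals but avoids paging on a single slow request. Treating failed probes the same as slow ones means authentication errors count toward the alert even when no response time was recorded.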
Ask Your Vendors the Right Questions
Some monitoring vendors just post Microsoft status screenshots but don’t show you their own product’s evidence of outage detection. Make sure you ask the vendor whether they can really detect outages themselves or whether their product is just a glorified wrapper around Microsoft’s Twitter status feed. By the time Microsoft knows about the outage, it’s too late: your users have already been impacted. With Exoprise, you get early indicators and, just as importantly, you know when the outage has really been resolved.
Take a free 15-day trial. It sets up in minutes.