25-02-2013, 09:17 #1
MS drops the ball and Azure goes down for 10 hours
"Overall, this outage yet again underscores the importance of monitoring. No cloud provider will have a perfect up-time record. Thus, the faster you know that something is wrong, the faster you can react with a contingency plan."
They consider monitoring so important that it took them "only" 10 hours to "react" with a "contingency plan".
"Even if you do not require all the sophisticated features of AzureWatch, we suggest that you at the minimum use our simple but effective free monitoring utility AzurePing that can send you alerts when it is unable to access your Azure resources."
In other words: if it goes down, that's your problem.
Windows Azure Storage experienced a worldwide outage impacting HTTPS traffic due to an expired SSL certificate. HTTP traffic was unaffected but the event impacted a number of Windows Azure services that are dependent on Storage. We executed the repair steps to update the SSL certificate on the impacted clusters and availability was restored to >99% worldwide by 1:00 AM PST on February 23. At 8:00 PM PST on February 23, we completed the restoration effort and confirmed full availability worldwide. Given the scope of the outage, we will proactively provide credits to impacted customers in accordance with our SLA. The credit will be reflected on a subsequent invoice. Our teams are also working hard on a full root cause analysis (RCA), including steps to help prevent any future reoccurrence. The RCA will be posted on this blog as soon as it is available. We sincerely apologize for the interruption and any issues it has caused.
General Manager
As you are probably aware, Windows Azure suffered a world-wide Azure Storage outage on Friday 02/22/2013. This outage was caused by an expired Microsoft SSL certificate. The outage impacted Azure Storage, Azure Websites, Service Bus, Media Services, ACS, and Azure Management Portal and lasted for approximately ten hours.
AzureWatch monitors remained active during the outage. Customers who monitored their storage accounts with "Alert on Failure" option turned on, began receiving alerts at approximately 8:32PM UTC (well in advance of any outage notice on Microsoft's Azure Dashboard).
It is important to point out that AzureWatch's monitoring was impacted because key metrics located within our customers' Azure deployments were inaccessible. Furthermore, AzureWatch Management Portal was unavailable because it currently relies on Azure Storage. To help mitigate our outage, our engineers were available via recently implemented online chat interface to provide extra support to customers logging into our portal.
Overall, this outage yet again underscores the importance of monitoring. No cloud provider will have a perfect up-time record. Thus, the faster you know that something is wrong, the faster you can react with a contingency plan. Even if you do not require all the sophisticated features of AzureWatch, we suggest that you at the minimum use our simple but effective free monitoring utility AzurePing that can send you alerts when it is unable to access your Azure resources.
Thank you for your continued support and interest in our service!
Last edited by 5ms; 25-02-2013 at 09:25.
25-02-2013, 10:47 #2
If the problem was caused by an expired SSL certificate, I suppose the service must go offline for a few hours every year hahaha
25-02-2013, 10:54 #3
- Join Date
- Mar 2011
At least it's scheduled downtime, and with a year's notice nobody can say they weren't warned in time.
25-02-2013, 17:30 #4
The problem is the loss of trust that this kind of incident at a major provider causes across the market as a whole.
25-02-2013, 22:45 #5
- Join Date
- Oct 2010
- Rio de Janeiro
Certificates can be renewed for 10 years...
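Whatever the validity period, the expiry date is written in the certificate itself, so it can be checked automatically well ahead of time. Below is a minimal sketch in Python using only the standard library. The function names (`days_until_expiry`, `needs_renewal`, `check_host`) and the 30-day threshold are assumptions made for illustration; nothing here reflects how Microsoft or AzureWatch actually monitor their certificates.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format found in the dict returned by
    SSLSocket.getpeercert(), e.g. "Feb 22 12:00:00 2013 GMT".
    A negative result means the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """True when fewer than `warn_days` days remain (hypothetical threshold)."""
    return days_until_expiry(not_after, now) < warn_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the server certificate's notAfter field.

    With the default context, an already expired certificate makes the
    handshake fail with ssl.SSLCertVerificationError, which is itself a
    signal worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_host(host: str, warn_days: int = 30) -> bool:
    """Return True if the host's certificate should be renewed soon."""
    now = datetime.now(timezone.utc)
    return needs_renewal(fetch_not_after(host), now, warn_days)
```

Run from a daily scheduled job, something like `check_host("example.com")` (placeholder hostname) would flag a certificate weeks before it lapses instead of hours after.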