  1. #1
    WHT-BR Top Member
    Join Date
    Dec 2010

    "How do you build carrier-grade (99.999%) cloud infrastructure?"

    Short answer: You don’t.

    The only way to build a solution with more than 99.9% availability is to build an application-layer solution running in multiple availability zones.

    By Ivan Pepelnjak
    Tuesday, October 06, 2015

    During my recent SDN workshop one of the attendees asked me “How do you build carrier-grade (5 nines) cloud infrastructure [with VMware NSX]?”

    Before delving into details (aka disclaimer)

    This is not an NSX-related blog post. It just happened that the attendee tried to accomplish the Mission Impossible with NSX. He could have chosen Juniper Contrail or Nuage VSP or anything else while facing the same pointless task.

    The Problem

    In my day I encountered two compute infrastructure products that came close to what people call carrier-grade – IBM mainframes and Tandem minicomputers. Both were incredibly complex and expensive, and ran short user-written transactions on top of fully redundant software and hardware infrastructure.

    It’s impossible to reproduce the same feat in an Infrastructure-as-a-Service cloud environment because the workload isn’t composed of short ACID transactions but of servers of unknown quality. You might be able to build a cloud infrastructure with 5-nine reliability, but it would be a totally wasted effort if the workload running on top of it crashes (or is brought down for patching). See also High Availability Fallacies for more details.
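    The arithmetic behind this point is worth spelling out: when a service depends on both the infrastructure and the workload being up, the availabilities multiply, so the weaker layer dominates. A minimal sketch, with the 99% workload figure being an assumed illustration rather than a measured number:

    ```python
    # Serial dependency: the service is up only when BOTH the
    # infrastructure AND the workload running on it are up.
    # Assuming independent failures, availabilities multiply.
    infra = 0.99999      # five-nines infrastructure
    workload = 0.99      # a typical unhardened server (assumed figure)

    combined = infra * workload
    print(f"combined availability: {combined:.5f}")  # ~0.98999
    ```

    Five nines of infrastructure underneath a two-nines workload still yields barely two nines overall – which is why hardening the infrastructure alone is a wasted effort.
    
    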

    The only way to build a solution with more than 99.9% availability is (according to James Hamilton) to build an application-layer solution running in multiple availability zones, and once you do that, you don’t care that much about the availability of individual zones as long as it’s reasonably high.
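    The multi-zone argument is the flip side of the same arithmetic: redundant zones fail in parallel, so the service is down only when all of them are down at once. A sketch, assuming (optimistically) that zone failures are independent:

    ```python
    # Parallel redundancy: the service is down only when ALL
    # independent availability zones are down simultaneously.
    zone = 0.999                        # three-nines per zone
    for n in (1, 2, 3):
        avail = 1 - (1 - zone) ** n
        print(f"{n} zone(s): {avail:.9f}")
    # Two independent three-nines zones already give
    # 1 - 0.001**2 = 0.999999 (six nines) at the application layer.
    ```

    The caveat is the independence assumption: correlated failures (shared control plane, common software bug, regional outage) are exactly what erodes this number in practice.
    
    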

    Building Carrier-Grade Infrastructure

    Twenty-five years ago we had simple routers and switches, and we knew how to build resilient networks with redundant boxes and routing protocols. Then the traditional service providers learned how to spell IP and wanted to implement their existing operational practices in this brave new world… prompting the networking vendors to build increasingly complex infrastructure products like redundant supervisors, non-stop forwarding, and in-service software upgrade.

    Guess what – complex products tend to be expensive to build and operate. The carriers complaining about the high cost of networking gear while lustfully eyeing what Google, Facebook, Amazon and Azure are doing should stop yammering and admit that they got what they asked for.
    Randy Bush talked about this problem more than a decade ago, but of course nobody listened.

    Obviously some people never learn, and now that the carriers have turned their attention toward the new fad – Network Function Virtualization – they want to repeat the same mistake, asking cloud architects to build carrier-grade infrastructure on which they’ll run unreliable workloads.
    Insanity: doing the same thing over and over again and expecting different results.
    Definitely not Einstein

    The Way Forward

    The more I look at what various organizations are doing (and succeeding or failing along the way), the more I’m convinced that there’s only one way to reduce the overall costs of running your IT infrastructure:

    • Set realistic goals based on actual business needs;
    • Build good enough infrastructure that is easy to operate at reasonable costs;
    • Build the few applications that actually need very high availability (not everything needs five nines) using modern design-for-failure architectural principles. See also Cloud Native Applications for Dummies.

    Numerous large-scale companies have proven that this approach works, but of course it requires a major change in the way your company develops and deploys applications.

    You could also decide to ignore this trend and continue building ever more complex infrastructure, and get the results you deserve.
    Last edited by 5ms; 07-10-2015 at 21:19.

  2. #2
    WHT-BR Top Member
    Join Date
    Dec 2010

    What exactly makes something “mission critical”?

    Pete Welcher wrote an excellent Data Center L2 Interconnect and Failover article with a great analogy: he compares layer-2 data center interconnect to beer (one might be a good thing, but it rarely stops there). He also raised an extremely good point: while it makes sense to promote load balancers and scale-out architectures, many existing applications will never run on more than a single server (sometimes using embedded database like SQL Express).

    He’s right ... but then you have to ask your CIO what exactly makes something “mission critical”. Why would it be necessary to implement L2 DCI – risking the stability of a VLAN, or even a whole data center, depending on how bad your design is – just to increase the uptime of such a brittle kludge by transporting it between data centers intact, when it’s quite probable it will implode on its own without being touched or moved? It might make more sense to be pragmatic, acknowledge that some applications will never be highly reliable, and live with the consequences.

  3. #3
    Join Date
    Oct 2010
    Rio de Janeiro
    The problem is that being pragmatic isn’t in the vocabulary of today’s C-levels and managers. They’re sold a vision in which everything works wonderfully.
