23-01-2016, 06:52 #1
[EN] Why Google Won’t Support Docker’s Container Network Model
Google jumped into the container management game with the release of Kubernetes v1.0 last July. Kubernetes [koo-bur-net-eeze] is “an open source container cluster manager [that] aims to provide a platform for automating deployment, scaling, and operations of application containers across clusters of hosts.”
Along with the Kubernetes release, Google announced its partnership with the Linux Foundation to form the Cloud Native Computing Foundation (CNCF) and offered Kubernetes as a seed technology.
So why is Google’s Kubernetes shunning Docker’s Container Network Model (CNM), Docker’s approach to providing container networking with support for multiple network drivers?
“After all,” wrote Tim Hockin, Software Engineer at Google, “vendors will almost certainly be writing plugins for Docker; we would all be better off using the same drivers, right?”
Well, Google would, but it turns out that there are fundamental differences in the design of Docker’s network drivers that don’t play well with Kubernetes.
Hockin breaks down the multiple reasons in detail in a recent blog post. The short version: at their core, the two systems are different in structure, making it complicated to get them to talk to each other, which is the antithesis of the container philosophy of ease of use.
“Kubernetes supports multiple container runtimes, of which Docker is just one,” Hockin explained. “Configuring networking is a facet of each runtime, so when people ask ‘will Kubernetes support CNM?’ what they really mean is ‘will Kubernetes support CNM drivers with the Docker runtime?’”
Google engineers worked with the libkv interface used by Docker but found libkv wasn’t compatible with the structure of Kubernetes.
Hockin discusses several technical approaches to these problems, but none solved the issues in a satisfying way. The engineers even tried writing plugins for Docker’s plugins to get the platforms to work together, but the philosophical difference between the underlying structures of Kubernetes and Docker is too great to make this workable. At the end of the day, why shoehorn a multi-container runtime solution into a single runtime solution?
Google engineers have instead focused on the Container Network Interface (CNI) model, put forth by CoreOS as part of the App Container (appc) specification.
“There will be some unfortunate side-effects of this,” acknowledged Hockin. “In particular, containers started by docker run might not be able to communicate with containers started by Kubernetes, and network integrators will have to provide CNI drivers if they want to fully integrate with Kubernetes.”
On the upside, focusing on a solution that spans multiple container runtimes will let the engineers streamline Kubernetes, making it more flexible and simpler to use.
Are you frustrated by Google’s lack of compatibility with Docker? Google engineers are continually looking for ways to integrate and simplify Kubernetes.
“If you have thoughts on how we can do that,” wrote Hockin, “we really would like to hear [your ideas] — find us on slack or on our network SIG mailing list.”
23-01-2016, 06:55 #2
Why Kubernetes doesn’t use libnetwork
Tim Hockin | Google
Thursday, January 14, 2016
Kubernetes has had a very basic form of network plugins since before version 1.0 was released, around the same time as Docker's libnetwork and Container Network Model (CNM) were introduced. Unlike libnetwork, the Kubernetes plugin system still retains its "alpha" designation. Now that Docker's network plugin support is released and supported, an obvious question we get is why Kubernetes has not adopted it yet. After all, vendors will almost certainly be writing plugins for Docker; we would all be better off using the same drivers, right?
Before going further, it's important to remember that Kubernetes is a system that supports multiple container runtimes, of which Docker is just one. Configuring networking is a facet of each runtime, so when people ask "will Kubernetes support CNM?" what they really mean is "will Kubernetes support CNM drivers with the Docker runtime?" It would be great if we could achieve common network support across runtimes, but that’s not an explicit goal.
Indeed, Kubernetes has not adopted CNM/libnetwork for the Docker runtime. In fact, we’ve been investigating the alternative Container Network Interface (CNI) model put forth by CoreOS and part of the App Container (appc) specification. Why? There are a number of reasons, both technical and non-technical.
First and foremost, there are some fundamental assumptions in the design of Docker's network drivers that cause problems for us.
Docker has a concept of "local" and "global" drivers. Local drivers (such as "bridge") are machine-centric and don’t do any cross-node coordination. Global drivers (such as "overlay") rely on libkv (a key-value store abstraction) to coordinate across machines. This key-value store is another plugin interface, and is very low-level (keys and values, no semantic meaning). To run something like Docker's overlay driver in a Kubernetes cluster, we would either need cluster admins to run a whole different instance of consul, etcd or zookeeper (see multi-host networking), or else we would have to provide our own libkv implementation that was backed by Kubernetes.
The latter sounds attractive, and we tried to implement it, but the libkv interface is very low-level, and the schema is defined internally to Docker. We would have to either directly expose our underlying key-value store or else offer key-value semantics (on top of our structured API which is itself implemented on a key-value system). Neither of those is very attractive for performance, scalability and security reasons. The net result is that the whole system would be significantly more complicated, when the goal of using Docker networking is to simplify things.
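The layering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `StructuredAPI`, `KVShim`, the "OpaqueBlob" kind and the example key are all hypothetical names made up for illustration, not the real libkv or Kubernetes interfaces. It shows what offering flat key-value semantics on top of a structured API amounts to: every key the caller writes has to be stored as an uninterpreted blob that the structured layer cannot validate.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredAPI:
    """Toy stand-in for a typed, validated API (hypothetical)."""
    objects: dict = field(default_factory=dict)

    def create(self, kind: str, name: str, spec: dict) -> None:
        # A structured API knows its kinds and rejects anything else.
        if kind not in {"Service", "Endpoint", "OpaqueBlob"}:
            raise ValueError(f"unknown kind: {kind}")
        self.objects[(kind, name)] = spec

    def read(self, kind: str, name: str) -> dict:
        return self.objects[(kind, name)]


class KVShim:
    """Flat put/get interface layered on the structured API (hypothetical)."""

    def __init__(self, api: StructuredAPI) -> None:
        self.api = api

    def put(self, key: str, value: bytes) -> None:
        # The schema behind `key` is private to the caller, so the only
        # option is to store the value as an uninterpreted blob.
        self.api.create("OpaqueBlob", key, {"data": value})

    def get(self, key: str) -> bytes:
        return self.api.read("OpaqueBlob", key)["data"]


api = StructuredAPI()
kv = KVShim(api)
# The key below is invented for this example; it is not a real Docker key.
kv.put("example/network/endpoint/abc123", b"\x00\x01opaque")
```

In this model the structured layer ends up as a dumb container for someone else's schema, which is the extra complexity the post is objecting to.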
For users that are willing and able to run the requisite infrastructure to satisfy Docker global drivers and to configure Docker themselves, Docker networking should "just work." Kubernetes will not get in the way of such a setup, and no matter what direction the project goes, that option should be available. For default installations, though, the practical conclusion is that this is an undue burden on users and we therefore cannot use Docker's global drivers (including "overlay"), which eliminates a lot of the value of using Docker's plugins at all.
Docker's networking model makes a lot of assumptions that aren’t valid for Kubernetes. In Docker versions 1.8 and 1.9, it includes a fundamentally flawed implementation of "discovery" that results in corrupted /etc/hosts files in containers (docker #17190), and this cannot be easily turned off. In version 1.10 Docker is planning to bundle a new DNS server, and it’s unclear whether this will be able to be turned off. Container-level naming is not the right abstraction for Kubernetes: we already have our own concepts of service naming, discovery, and binding, and we already have our own DNS schema and server (based on the well-established SkyDNS). The bundled solutions are not sufficient for our needs but cannot be disabled.
Orthogonal to the local/global split, Docker has both in-process and out-of-process ("remote") plugins. We investigated whether we could bypass libnetwork (and thereby skip the issues above) and drive Docker remote plugins directly. Unfortunately, this would mean that we could not use any of the Docker in-process plugins, "bridge" and "overlay" in particular, which again eliminates much of the utility of libnetwork.
On the other hand, CNI is more philosophically aligned with Kubernetes. It's far simpler than CNM, doesn't require daemons, and is at least plausibly cross-platform (CoreOS’s rkt container runtime supports it). Being cross-platform means that there is a chance to enable network configurations which will work the same across runtimes (e.g. Docker, Rocket, Hyper). It follows the UNIX philosophy of doing one thing well.
Additionally, it's trivial to wrap a CNI plugin and produce a more customized CNI plugin — it can be done with a simple shell script. CNM is much more complex in this regard. This makes CNI an attractive option for rapid development and iteration. Early prototypes have proven that it's possible to eject almost 100% of the currently hard-coded network logic in kubelet into a plugin.
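As a rough illustration of the wrapping idea, here is a minimal sketch in Python rather than shell. It assumes the usual CNI conventions: the network configuration arrives as JSON on stdin, call parameters travel in `CNI_*` environment variables, and `CNI_PATH` lists the directories holding plugin executables. The delegate name "bridge", the `mtu` default and the function names are assumptions chosen for the example, not anything prescribed by Kubernetes.

```python
import json
import os
import subprocess
from typing import Optional, Tuple


def customize(conf: dict) -> dict:
    """Return a copy of the incoming network config with local tweaks."""
    out = dict(conf)
    out["type"] = "bridge"        # assumed name of the plugin being wrapped
    out.setdefault("mtu", 1400)   # hypothetical site-specific default
    return out


def find_plugin(name: str, cni_path: str) -> Optional[str]:
    """Search a CNI_PATH-style directory list for an executable plugin."""
    for directory in filter(None, cni_path.split(os.pathsep)):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def delegate(raw_conf: str, env: dict) -> Tuple[int, str]:
    """Run the wrapped plugin with the customized config on its stdin."""
    conf = customize(json.loads(raw_conf))
    plugin = find_plugin(conf["type"], env.get("CNI_PATH", ""))
    if plugin is None:
        return 1, "delegate plugin not found"
    # CNI_COMMAND, CNI_NETNS, CNI_IFNAME and friends pass through via env.
    done = subprocess.run([plugin], input=json.dumps(conf), env=env,
                          capture_output=True, text=True)
    return done.returncode, done.stdout
```

A real wrapper would read `sys.stdin`, pass `dict(os.environ)` as `env`, print the returned output and exit with the returned code. The customization lives entirely in `customize`, which is why wrapping stays cheap.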
We investigated writing a "bridge" CNM driver for Docker that ran CNI drivers. This turned out to be very complicated. First, the CNM and CNI models are very different, so none of the "methods" lined up. We still have the global vs. local and key-value issues discussed above. Assuming this driver would declare itself local, we have to get info about logical networks from Kubernetes.
Unfortunately, Docker drivers are hard to map to other control planes like Kubernetes. Specifically, drivers are not told the name of the network to which a container is being attached — just an ID that Docker allocates internally. This makes it hard for a driver to map back to any concept of network that exists in another system.
This and other issues have been brought up to Docker developers by network vendors, and are usually closed as "working as intended" (libnetwork #139, libnetwork #486, libnetwork #514, libnetwork #865, docker #18864), even though they make non-Docker third-party systems more difficult to integrate with. Throughout this investigation Docker has made it clear that they’re not very open to ideas that deviate from their current course or that delegate control. This is very worrisome to us, since Kubernetes complements Docker and adds so much functionality, but exists outside of Docker itself.
For all of these reasons we have chosen to invest in CNI as the Kubernetes plugin model. There will be some unfortunate side-effects of this. Most of them are relatively minor (for example, docker inspect will not show an IP address), but some are significant. In particular, containers started by docker run might not be able to communicate with containers started by Kubernetes, and network integrators will have to provide CNI drivers if they want to fully integrate with Kubernetes. On the other hand, Kubernetes will get simpler and more flexible, and a lot of the ugliness of early bootstrapping (such as configuring Docker to use our bridge) will go away.
As we proceed down this path, we’ll certainly keep our eyes and ears open for better ways to integrate and simplify. If you have thoughts on how we can do that, we really would like to hear them — find us on slack or on our network SIG mailing-list.
23-01-2016, 07:33 #3
Docker buys Unikernel
Wannabe a kernel developer? Well, soon you can be and rather easily.
21 Jan 2016
Linux container biz Docker has bought Unikernel Systems, a startup in Cambridge, UK, that's doing interesting things with roll-your-own operating systems.
Rather than build an application on top of an OS, with the unikernel approach, you build your own tiny operating system customized for your application.
It's quite a coup for San Francisco-based Docker, as Unikernel Systems is made up of former developers of the Xen hypervisor project – the software that's used all over the world to run virtual machines in public clouds and private systems.
If you check through Unikernel Systems' staff on LinkedIn, you'll find folks like CTO Anil Madhavapeddy, David Scott, Thomas Gazagnaire and Amir Chaudhry, who have worked on, or are closely linked to, the development of open-source Xen. The team knows a thing or two about running apps in little boxes.
Why is Unikernel's work interesting? Well, let's remember what Docker is: it's a surprisingly easy-to-use tool that lets developers and testers package applications, and the bits and pieces those apps need to run, into neat and tidy containers that are separate from other containers.
On Linux, it uses cgroups and namespaces, among other mechanisms, in the kernel to keep containers isolated. So if you want to stand up a web server with a particular configuration, build (or download) that container and start it. Likewise if you need a particular toolchain: build, or find an image of, the container that has the necessary compilers, libraries and other dependencies. You don't have to worry about the dependencies and processes in one container interfering with another's; each box is kept separate.
Boxed in ... Three containers running on one kernel, each containing their own apps and dependencies
All the containers on a machine share the same underlying Linux kernel – or the Windows kernel if you want to run Docker on Microsoft's operating system. Docker tries to be easier to use and more efficient than building, starting, and tearing down whole virtual machines that each have their own kernels and full-fat operating systems.
Each Docker container not only has its own process tree but its own file system built up in layers from a base. The container has just the software it needs to perform a particular task, and no more. Thus, these boxes are supposed to be far more lightweight than virtual machines.
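The layering idea can be modelled in a few lines. This is a toy sketch, not how Docker's storage drivers are implemented: each layer is a plain dictionary mapping paths to contents, the paths and contents are invented, and Python's `ChainMap` stands in for the union view, looking a path up in the topmost layer first and falling back to the layers beneath.

```python
from collections import ChainMap

# Read-only image layers, listed here from the base upwards (all made up).
base_layer = {"/bin/sh": "busybox shell", "/etc/hostname": "base"}
web_layer = {"/usr/sbin/nginx": "nginx binary"}
app_layer = {"/etc/hostname": "web-1", "/srv/index.html": "<h1>hello</h1>"}

# ChainMap searches left to right, so the topmost layer goes first.
image = ChainMap(app_layer, web_layer, base_layer)

# A running container gets a fresh writable layer on top of the image.
container = image.new_child()
container["/tmp/scratch"] = "runtime data"
```

Reading `container["/etc/hostname"]` finds the value from `app_layer`, which shadows the one in `base_layer`, while the write to `/tmp/scratch` lands only in the container's own top layer and leaves the shared image layers untouched.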
Unikernel Systems takes that streamlining one step further.
Heard of a kernel, but what's a unikernel?
Unikernels or library operating systems have been lurking in the corridors and labs of university computer science departments for roughly twenty years. Network hardware vendors have adopted them for their firmware to provide reliable and specialized services.
How do unikernels work? Take all that we've just said about a container – its processes, dependencies, file system, the underlying monolithic host kernel and its drivers – and compress it into one single space, as if it were a single program.
Confused? Let's fix that. Take your typical Linux machine. Open a terminal and run ps aux and get a list of all the processes running on the computer. Each of those processes, a running program, has its own virtual address space, in which its own code, its variables and other data, and its threads, exist. In this space, shared libraries and files can be mapped in. Right at the top of this virtual space, in an area inaccessible to the program, is the kernel, which appears at the top of all processes like God sitting on a cloud, peering down on the mere mortals below.
If a process wants the kernel to do anything for it, it has to make a system call, which switches the processor from executing code in the program to executing code in the privileged kernel. When the kernel has carried out the requested task, the processor switches back to the program.
Let's say a process wants to send some data over a network. It prepares the information to transmit, and makes the necessary system call. The processor switches to the kernel, which passes the data to its TCP/IP code, which funnels the data in packets to the Ethernet driver, which puts frames of the data out onto the physical wire. Eventually, the processor switches back to the program.
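For a feel of where those boundary crossings sit in ordinary code, here is a small sketch. It uses a local socket pair so it runs without a network, which means the kernel path is shorter than the TCP/IP and Ethernet route described above; the function name is made up for the example. The point is only that each socket operation is a request to the kernel rather than work done inside the program.

```python
import socket


def send_and_receive(payload: bytes) -> bytes:
    """Push a small payload through the kernel and read it back."""
    # Creating the pair is itself a request to the kernel.
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)            # user mode -> kernel -> user mode
        sender.shutdown(socket.SHUT_WR)    # tell the kernel we are done writing
        chunks = []
        while True:
            chunk = receiver.recv(4096)    # another switch into the kernel
            if not chunk:                  # empty read: the writer has finished
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
```

Keep the payload small when trying this: nothing reads from the other end until `sendall` returns, so a payload larger than the socket buffer would block.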
A unikernel smashes all of this together: the kernel just ends up being another library, or a set of libraries, compiled into the application. The resulting bundle sits in the same completely accessible address space – the kernel isn't sectioned off in its own protected bubble.
When the application wants to, say, send some data over the network, it doesn't fire off a system call to the kernel to do that. No context switch occurs between the process and the kernel. Instead, the program calls a networking library function, which does all the work necessary to prepare the data for transmission and makes a hypercall to an underlying hypervisor – which wrangles the network hardware into sending the data on the wire.
The unikernel model yanks parts of the kernel a program needs from the kernel's traditional protected space into the program's virtual address space, and shoves the hardware-specific work onto the hypervisor's desk. It is the ultimate conclusion of paravirtualization, a high-performance model of virtualization.
Old versus new ... Traditional monolithic kernel design and containers, left, and unikernel apps on a hypervisor or bare-metal
As illustrated above, the kernel's functionality – the green blocks – is moved from a monolithic base underlying each container to a sliver of libraries built into applications running on top of a hypervisor.
This means unikernel apps do not have to switch contexts between user and kernel mode to perform actions; context switching is a relatively expensive procedure in terms of CPU clock cycles, and unikernels do away with this overhead and are therefore expected to be lean and mean.
The apps are also extremely lightweight as only the kernel functionality needed by the software is compiled in, and everything else is discarded. Thus the apps can be started and stopped extremely quickly – servers can be up and running as soon as requests come in. This model may even reduce the number of security vulnerabilities present in the software: less code means fewer bugs.
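The "compile in only what is needed" step can be pictured as a reachability walk over a dependency graph. The sketch below is a toy model of that idea, not the build process of any particular unikernel project; the module names and the `LIBRARY_OS` table are hypothetical.

```python
def needed_modules(entry_points, depends_on):
    """Return every module reachable from the application's entry points."""
    needed, stack = set(), list(entry_points)
    while stack:
        module = stack.pop()
        if module in needed:
            continue
        needed.add(module)
        stack.extend(depends_on.get(module, ()))
    return needed


# Hypothetical library OS: each module maps to the modules it depends on.
LIBRARY_OS = {
    "http": ["tcp"],
    "tcp": ["ip"],
    "ip": ["netif"],
    "netif": [],
    "filesystem": ["block"],
    "block": [],
    "usb": [],
    "sound": [],
}

# A web server that only speaks HTTP pulls in the network stack and no more.
image = needed_modules(["http"], LIBRARY_OS)
```

Here the image ends up with four of the eight modules; the filesystem, block, USB and sound code never makes it into the bundle, which is where the size and attack-surface savings come from.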
If you want to run a traditional POSIX application, you can, using what's called a rump kernel. This provides just enough of a normal operating system to run a Unix-flavored application unmodified. One example is rumprun, which can run loads of packages from PHP to Nginx in a unikernel setting. Antti Kantee has been doing a lot of work on rump kernels, as has Justin Cormack, an engineer at Unikernel Systems.
One thing to note is that a hypervisor isn't always required: a rump kernel – think of it as a kernel-as-a-service – can provide drivers for hardware, allowing the unikernel app to run right on the bare metal.
Alternatively, driver libraries could be built into the apps so they can talk to the hardware directly, which is especially useful for embedded engineering projects, aka Internet-of-Things gadgets. Unikernel Systems' software can run on bare-metal ARM-compatible processors, and on systems without memory management units – the sorts of hardware you'll find in tiny IoT gear.
One tricky aspect to all of this, depending on your point of view, is the requirement to trust the unikernel applications. Rather than have an underlying kernel to keep individual processes, containers, and virtual machines in line, the unikernel apps have to share the machine with no one in overall control. There are ways to use the processor's built-in mechanisms – such as page tables and the memory management unit – to keep them isolated. Building a model to keep them in check will be something Unikernel Systems and Docker will be working on.
On the other hand, if you want to run untrusted code, you'd most likely want to do that in a virtual machine with complete software isolation; unikernel apps are supposed to be trusted services set up and run by administrators.
Speaking of which, this is another tricky aspect: managing the things. Unikernel apps are like microservices on steroids. Deploying them, getting them to work together, and so on, in an elegant and scalable manner is really tricky.
And that's where the Docker acquisition comes in.
23-01-2016, 07:37 #4
After Docker took existing and esoteric technology – Linux containers – and made it user friendly, Unikernel Systems hopes the Silicon Valley biz's magic will rub off on the library OS model. The Cambridge team hopes they will, together, be able to find a way to build, deploy and manage unikernel apps using simple commands just like the Docker suite manages containers, hiding away all the fiddly technology we've just spent the past 1,400 words explaining.
In return, Docker will get access to a team of operating system gurus, who will help build out the platform, allowing software to run on all sorts of machines and gadgets from Linux and Windows servers to bare-metal hardware and the Internet of Things. Terms of the acquihire – particularly how much was paid – were not disclosed.
Unikernel Systems certainly caught Docker's eye at DockerCon Europe last November. Back then, the UK team demonstrated using Docker's software to spin up a bunch of unikernels, each one in a virtual machine on the KVM hypervisor, to run an Nginx, MySQL and PHP stack: each program ran in a unikernel address space. Crucially, the team used Docker tools to configure the network of this nano-cluster, getting the interfaces talking to each other.
There's also a big community of developers surrounding Docker and its open-source software, and Unikernel Systems hopes at least some of those people will get stuck into the unikernel model and help grow it into something robust and ready for production.
There are plenty of unikernel designs out there, one of them being MirageOS, whose core team includes Unikernel Systems staffers and is led by CTO Anil Madhavapeddy. That particular software, written using OCaml, emerged in 2009.
Run the clock backwards to the late 1990s, and you have Nemesis – an experimental operating system developed by the University of Cambridge, UK, with help from the University of Twente, Netherlands; the University of Glasgow, UK; the Swedish Institute of Computer Science; and Citrix Systems. The OS tried to do as much as possible in user space, pushing a lot of kernel functionality into shared libraries, and leaving little in the privileged kernel space except the bare minimum to run the system.
In a chat this week with The Register, Madhavapeddy joked that the bottom half of Nemesis – the kernel space – became the widely used Xen hypervisor, and the top half – the kernel-as-a-library scheme – is only now emerging as a mainstream design.
"After 15 years, it's good to see some of this hard work in Nemesis appear," he said. "Now every developer can build a unikernel. Docker makes Linux containers so easy, and we want that for unikernels."
Madhavapeddy explained it should be possible to develop, build and test unikernel applications in a familiar development environment by linking against libraries to run the code on an underlying kernel – and then link with the unikernel libraries when it's time to test and deploy on a hypervisor or bare metal.
"With Docker integration, the developer will never have to understand all the details of technology," he added.
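The develop-on-a-normal-kernel, deploy-as-a-unikernel workflow Madhavapeddy describes boils down to writing the application against an interface and swapping the implementation behind it. The sketch below is a loose analogy in Python, with every class and function name invented for the example; in particular `LibraryStackNetwork` is only a stand-in that records what it would transmit, not a real unikernel network stack.

```python
import socket
from typing import List, Protocol


class Network(Protocol):
    """What the application needs from a network, and nothing more."""
    def send(self, data: bytes) -> int: ...


class HostKernelNetwork:
    """Development backend: hands data to the host kernel via a socket."""

    def __init__(self) -> None:
        self._tx, self.peer = socket.socketpair()

    def send(self, data: bytes) -> int:
        self._tx.sendall(data)
        return len(data)

    def close(self) -> None:
        self._tx.close()
        self.peer.close()


class LibraryStackNetwork:
    """Stand-in for a unikernel network library (hypothetical)."""

    def __init__(self) -> None:
        self.frames: List[bytes] = []

    def send(self, data: bytes) -> int:
        # A real library would build packets and hand them to a hypervisor
        # or driver here; this stand-in just records them.
        self.frames.append(data)
        return len(data)


def greet(net: Network) -> int:
    """Application code: identical whichever backend is linked in."""
    return net.send(b"hello from the app\n")
```

The application function `greet` never changes; only the object passed to it does. That is the property that would let a developer iterate with familiar tools and switch backends at deployment time.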
David Messina, veep of marketing at Docker, told us: "We hope this collaboration will drive activity around unikernels, and make them more mainstream. This is the early stages, and we are bringing the technology into the Docker project and the Docker engine.
"We hope people will get a feel for it. So much of what we do is community based; we're expecting enthusiasm from developers, and a bunch of new ones from the community to join the effort."
We're told the Unikernel Systems team will continue to contribute to open-source projects, including MirageOS and rumprun. You can find out more about unikernels and OS design here.
23-01-2016, 07:39 #5
Randy Bias @randybias 10 hours ago
I don’t think that will end well. Unikernels aren’t new or an advantage over existing type 1 VMMs.
23-01-2016, 08:56 #6
Randy Bias @randybias 12 hours ago
Unikernels have all the drawbacks of regular hypervisors and none of the advantages of containers.