15 Jun 2015
This post is a summary of the ideas I presented in my talk “The rise of containers” for the Dev Talks conference in Bucharest.
Before talking about containers, it’s good to take a look back in history to know where we come from and the problems we’ve tried to solve.
Virtualization started during the sixties, when a lot of code was written for specific architectures mostly in assembler. When new systems were being developed, the industry realized that rewriting lots of software was not an easy task and that software was much harder to replace than hardware. This motivated a search to find a solution to the problem of how to run existing code on new hardware.
The idea of virtualization was then to “emulate” the architecture of the old hardware on top of the new one so code could be run without modification. And so the concept of hardware virtualization was born. We could run a virtual machine on top of our operating system, running the entire stack, operating system, binaries and application as they were originally written.
Of course, faking the whole machine in software has a big performance impact, but because machines kept getting faster, the virtualized system still ran at an acceptable level compared to the speed of previous generations.
Since then, a lot of work has gone into minimizing this performance overhead, and in recent years processors have incorporated special instruction sets to better support virtual machines. Nevertheless, these costs remain, and with the ever-increasing demand for processing power in the era of cloud computing, they add up in volume and are definitely not negligible.
So what if, instead of running an entire virtual machine on top of our host operating system, we could run isolated environments side by side on the same kernel? This is the concept of OS-level virtualization.
It all started with chroot. Later, FreeBSD jails evolved the concept further, as did Solaris with its zones, and more recently we have projects like OpenVZ and LXC.
Because containers run as native processes, they can run at full speed: there is no hypervisor overhead, and the host operating system can better share resources between them. A container can request certain amounts of memory and CPU, but when these are not in use they can be shared with other containers or used by the OS itself, for instance for caching. Containers can also start extremely fast, since there is no guest OS to boot.
Of course, we’re now limited to running the same OS in the container as on the host machine, which makes this method of virtualization unsuitable for some cases. It is still very powerful, though: in many cases we run many instances of the same application on the same OS, and when we can accept this constraint there is a lot to be gained.
All of these projects have proven the concept and have been used by many people but have not gained real widespread adoption.
This is where Docker comes into play and the real revolution it has brought is making this technology accessible to users. Getting started with Docker takes just a matter of minutes, making it very compelling to try and just play with the technology.
Besides being a container technology, Docker is also a platform for developers and sysadmins and it allows for easy sharing and distribution of containers.
This last point is one that I found particularly amazing. Installing a distributed application then becomes a matter of installing a few containers and linking them together. It’s similar to a package manager but on a bigger scale, like apt but for applications. This is an incredibly powerful concept and as the industry continues to move towards service oriented architectures this becomes even more attractive.
For developers, Docker has some very compelling use cases. Installing external services on the local machine becomes a thing of the past; forget about installing and compiling libraries locally. Keeping a consistent environment between development and production becomes a reality: you can, for instance, run on your machine the exact same version of your database as in production, avoiding the small discrepancies that tend to surface at the worst possible moment, when the deployment happens.
For operations it’s also very attractive. For one, it allows for standardized environments, running the same software in development, staging and production. It provides great flexibility in where things run: apps no longer need to be tied to a specific virtual machine configured for a specific stack. Since it provides better resource utilization, higher server density can help reduce operating costs. Finally, containers that can sometimes be started in a few milliseconds make it easy to scale up and down according to the workload and business needs.
To make our apps work with containers, there are some architectural constraints they need to respect. The Twelve-Factor App defines a methodology to help us deploy our apps to the cloud. While all of its factors are important, I’ll focus on the ones that I think are a must when working with containers.
Store configuration in the environment. Code needs to be separated from configuration: config changes between environments but code does not. Environment variables are easy to set, are language and OS agnostic and, unlike config files, are hard to commit to version control accidentally.
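As a rough illustration of this factor, here is a minimal Ruby sketch that reads every setting from the environment. The variable names (DATABASE_URL, REDIS_URL, WORKERS) and the load_config helper are assumptions made up for this example, not something any particular framework requires.

```ruby
# Hypothetical helper: builds the app configuration from environment variables.
# The setting names below are illustrative assumptions.
def load_config(env = ENV)
  {
    # Required setting: fetch raises KeyError at boot if it is missing.
    database_url: env.fetch("DATABASE_URL"),
    # Optional settings fall back to a default value.
    redis_url: env.fetch("REDIS_URL", "redis://localhost:6379"),
    workers: Integer(env.fetch("WORKERS", "2"))
  }
end
```

Failing at boot on a missing required variable is a deliberate choice here: a container with incomplete configuration stops immediately instead of misbehaving later.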
Treat backing services as attached resources. Any service your app consumes over the network (mysql, redis, etc.) should be treated the same whether it runs locally, in your datacenter or at a third party. Backing services should be loosely coupled to your app, and an admin should be able to replace one database with another without code changes.
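One common way to get this loose coupling is to describe each backing service with a single URL, so swapping it means changing one value rather than code. The sketch below is only an illustration under that assumption; the backing_service helper and the example URL are hypothetical.

```ruby
require "uri"

# Hypothetical helper: turns a service URL into connection settings.
# The app only sees these settings, never where the service really lives.
def backing_service(url)
  uri = URI.parse(url)
  {
    adapter: uri.scheme,
    host: uri.host,
    port: uri.port,
    user: uri.user,
    password: uri.password,
    database: uri.path.sub(%r{\A/}, "") # drop the leading slash of the path
  }
end
```

Calling backing_service with the value of an environment variable means a local database and a third-party one differ only in that value.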
Execute the app as one or more stateless processes. An app should never assume that anything in memory or on disk will be available for a future request or background job. The local file system can be used as a cache for a transaction, but any results must be committed to a stateful service like a database. Sessions should be stored in an external datastore, like redis or memcached.
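To make the session part concrete, here is a small sketch of keeping session data outside the process. FakeStore is an in-memory stand-in I made up so the example runs on its own; in practice its place would be taken by a client for an external store such as redis or memcached. SessionRepo and the key format are likewise assumptions.

```ruby
require "json"

# In-memory stand-in for an external datastore, for illustration only.
class FakeStore
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end
end

# Hypothetical session repository: all state goes through the store,
# so any process holding the same store can serve the next request.
class SessionRepo
  def initialize(store)
    @store = store
  end

  def save(session_id, attrs)
    @store.set("session:#{session_id}", JSON.generate(attrs))
  end

  def load(session_id)
    raw = @store.get("session:#{session_id}")
    raw ? JSON.parse(raw) : {}
  end
end
```

Two SessionRepo instances sharing one store behave like two app processes: a session saved by one can be loaded by the other.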
Maximize robustness with fast startup and graceful shutdown. App processes should minimize the time from boot until they’re ready to process work, and they should clean up after themselves when they receive a SIGTERM signal: finish processing the current requests without taking any more, release any locks and return background jobs to their queues. The app should also handle unexpected termination, for instance in case of hardware failure.
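The graceful shutdown described above can be sketched roughly like this. The Worker class is hypothetical and uses a plain array as a stand-in for a real job queue; the only real API involved is Ruby's Signal.trap.

```ruby
# Hypothetical worker: finishes the job in progress when asked to stop
# and leaves the remaining jobs untouched so another process can take them.
class Worker
  attr_reader :done

  def initialize(jobs)
    @jobs = jobs      # stand-in for a real queue
    @done = []
    @stopping = false
  end

  # Asks the worker to stop after the current job.
  def stop!
    @stopping = true
  end

  # Routes SIGTERM to stop!; the handler only flips a flag.
  def install_signal_handler
    Signal.trap("TERM") { stop! }
  end

  # Processes jobs until the queue is empty or a stop was requested.
  # Returns the jobs that were never started.
  def run
    until @stopping || @jobs.empty?
      job = @jobs.shift
      yield job
      @done << job
    end
    @jobs
  end
end
```

In a real process you would call install_signal_handler once at boot and then run. The handler does nothing but set a flag, so the job in progress always completes before the loop exits.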
Treat logs as event streams. The app should not manage log files; it should just write to stdout. It’s the responsibility of the platform to route the logs to the appropriate backend so they can be easily retrieved and analyzed.
This is just a brief summary of some of the most important factors. Nevertheless, it’s a good idea to look at all twelve and try to incorporate them.
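A minimal sketch of this idea with Ruby's standard Logger, writing one event per line to stdout. The build_logger helper and the JSON line format are my own assumptions for the example; the factor itself only asks that the app write to stdout and leave routing to the platform.

```ruby
require "logger"
require "json"
require "time"

# Hypothetical helper: a logger that emits one JSON event per line.
# It writes to stdout by default and never opens a log file.
def build_logger(io = $stdout)
  io.sync = true if io.respond_to?(:sync=) # flush each event right away
  logger = Logger.new(io)
  logger.formatter = proc do |severity, time, _progname, msg|
    JSON.generate(level: severity, time: time.utc.iso8601, message: msg) + "\n"
  end
  logger
end
```

Calling build_logger.info with a message prints a single line that the platform can collect and forward wherever it wants.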
I want to bring up some of the challenges regarding containers for the future. As with any new technology it solves some problems but creates new ones as well. There’s a lot of activity in this space right now and new tools appear every week and new insights are gained.
There’s service discovery and registration: how do apps know where other services are, and how do they announce themselves to the world? We also have networking challenges; for example, how containers talk to each other across multiple hosts is not a solved problem yet. Persistence presents its own challenges too: how can we run a database in containers? Once the data is written to some host, the container is bound to that host, and that could mean a single point of failure. Last but not least, there are some security concerns, like the fact that the docker daemon needs to run as root on the host machine, although they’re working on changing this requirement. We’ll hopefully see many of these issues addressed in the upcoming months.
Docker containers have burst into the IT landscape; they’re changing how we think about our infrastructure and they’ll continue to do so in the upcoming years. There are many challenges lying ahead for their massive adoption in production systems, and we’ll probably see these concerns diminish in the future. I don’t think they’ll completely replace virtual machines: even though there’s some overlap between them, they also address different use cases. We’ll probably see mixed environments with bare metal, VMs and containers all playing together.
Containers are here to stay, get ready for the revolution.
19 Dec 2014
During the talk I presented a tool that is helping me focus on developing the different containers and make the changes to the Dockerfile and the apps themselves and leave out the details of using fig.
09 Dec 2014
It’s been over 3 months since my change to the site operations team at Xing and time has really been flying by.
Everything is new and I really don’t know what to do many times and that I find very delightful. The investigation, learning, figuring out how to achieve something, reading forums or stack overflow and trying to fit a puzzle into your head. I think this is what I love the most about computers, that moment when you understand the machine, what it does and why it does it. It’s the hack, making the computer do what you want. Getting out of my comfort zone is helping me appreciate the basics more. [...] read more
03 Jul 2014
A couple months ago I started growing a vegetable garden at home with my girlfriend. It’s been a great experience. She’s always been into plants, we have lots of them around the house but never had we tried to grow food. I think the fact that I can get something out of the experience that is more than just aesthetic but also functional (I can eat the vegetables) has made it more appealing for me.
There have been some experiences so far that I somehow can relate with developing a software project. [...] read more
12 Oct 2013
I was invited with my friend Jean Carlos Meninno to give a presentation at the GDG DevFest Barcelona 2013. It was a great opportunity to talk about the work we’ve been doing recently for XING and the things we’ve been learning about developing large scale backbone applications.
Here you can find the slides: http://diasjorge.github.io/google-dev-fest-slides/
Hope you like them. [...] read more
10 Sep 2012
During my time working at XING I believe my single biggest contribution to the company is a side project I’ve developed called Xing scripts. This project started with a personal need for working with our development environment in a more automated way. I’m a big proponent of automating everything you can and so when I started working I realized that there were these tasks that I would do over and over again. Since I couldn’t bear doing all this manual work I started writing my own scripts. [...] read more
11 Jul 2012
Yesterday I was at work with a colleague and we wanted to merge a long-running branch we had. This branch was full of useless commits so we wanted to clean it up. We tried an interactive rebase but we got a lot of conflicts since git doesn’t know how to resolve merge conflicts that we had previously fixed. As you probably know this is no fun, so we did what any sane person would do and found a nice solution for this called git-rerere. [...] read more
11 May 2012
Recently I had to reinstall my computer at work since I had to update to Lion and I could only do a fresh install, so I decided to try to automate the installation process since some of my colleagues are going through the same and it seems like every time we have to waste many hours or days to solve the same issues over and over. [...] read more
21 Dec 2011
It is perhaps my experience but I’ve hardly had the opportunity to work on green field projects, but rather worked on legacy ones where most of the original developers were no longer part of the team or even none of them. Projects with little to no documentation and in some cases no tests at all. You probably know this feeling, it sucks, you want to do things but everything you touch breaks something else, where you obviously see that there was lack of care. [...] read more
29 Aug 2011
It’s been some agitated months lately for me, I quit working at JustLanded after almost two years there and then went working for some consulting, the experience was not so great, actually it was really bad, the kind of experience that has made me learn to choose very carefully my future career moves and never again believe in marketing people. Fortunately I got an offer to work at XING offices in Barcelona, so I packed my stuff and moved there. Now this is a really good place to work, everything was as we talked, they’ve been very helpful with my relocation and the environment is great, lots of smart people that want to do a good job, so nowadays I’m very happy and enjoying my new city. [...] read more
13 Jan 2011
17 Nov 2010
When using passenger with rvm I’ve had some issues with project specific gemsets, where bundler was unable to find the gems. After searching a lot I found out about using the “config/setup_load_paths.rb” file to tell passenger where to locate your gems, but then I had a new issue with rvm trying to use the system ruby instead of the ruby version of my .rvmrc file.
After going to the irc channel, I got some help that helped me fix my problem. The culprit was my rvmrc file. [...] read more
04 Nov 2010
As promised here are the slides for the “Conferencia Rails” workshop on process automation. Thanks to all the people that were there. I’m also releasing the redmine CLI I’ve created along with the CLI twitter client.
The presentation was created using the slideshow gem which generates an html document for you.
Hope you enjoy it.[...] read more
25 May 2010
Today I spent several hours with my friend Gleb trying to find a weird bug we were having importing some rss feeds.
We have a rake task that will grab an xml feed and import it to our system. When we call this rake task from the command line it would run fine, but if we run it from inside our application, we would get some wrong characters (you know, the usual ???) in the imported items. [...] read more
11 Apr 2010
If you’re using emacs to write your jekyll blog posts, there is a mode to help you with some common tasks. It is originally from metajack. Recently I thought it could be a nice addition to have syntax highlight support for jekyll posts, so I got my hands dirty and after some hours of lisp hacking (this was my first attempt at lisp programming) it was a reality. It is based on nxhtml so you need it to work. [...] read more
If you’re using capistrano-ext to deploy to a different server, using a custom environment, you’ve probably noticed that it always tries to run the migrations for the production environment, like this:
cd path_to_app/deploy/releases/20100309152738; rake RAILS_ENV=production db:migrate
Digging through capistrano’s source I found the solution is really simple, just set the rails_env variable to the environment you want, in this example staging. So inside config/deploy/staging.rb
set :rails_env, "staging"
Then when migrations get executed they’ll have RAILS_ENV=staging.[...] read more
08 Mar 2010
Recently I moved my blog to Jekyll. While I enjoy being able to write directly in my favorite editor, EMACS, there was some functionality I was missing from my previous custom blog engine, such as archives. Looking at how I could achieve this, I found Raoul Felix’s approach to the problem. Instead of patching jekyll, he wrote a small library that wraps around it, called jekyll_ext. Using it was really easy, and based on some of the extensions he created, I was able to provide this functionality in my site.
Although I had archives generated for me, I was still missing a way to display this information on my site, so I decided to create my own extension. [...] read more
02 Mar 2010
23 Feb 2010
If you ever run into the situation where one migration doesn’t complete successfully, and you’re stuck with a column in a table or a new table, so you can’t drop the migration or execute the migration again, you can always call the migration methods from the console like this: [...] read more
16 Dec 2008
I recently had to implement some ajax pagination for a site. After googling for a while I found a solution, but I couldn’t customize the pagination URLs or I had to specify the paginator to use (will paginate’s default or mine for ajax), so I came up with this solution which fulfils all my needs. [...] read more