How We Used Docker to Compile and Run Untrusted Code

by OsmanAli

TL;DR: We developed an open-source sandbox system that lets users run code snippets from their browsers while keeping the server unharmed.

While building remote interview, we faced an interesting challenge: we had to let our users compile their code snippets and see the output, without them having to install anything on their systems.

A lot of our competitors must have faced a similar problem, but none of them have disclosed anything, as if this were their secret sauce.

We decided to do just that and release all our findings for anyone to use, since we strongly believe in open source and plan to keep contributing back.

 

The Actual Idea

In an ideal world, a simple solution would be a server that compiles the code on the user’s behalf and returns the output. We also had to support multiple languages, so the server needs all the relevant compilers installed and “API-fied”, letting the client send over the code and get the results back.
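One way to picture the “API-fied” compilers is a small lookup table that maps each supported language to the command that builds and runs a submission. The sketch below is only an illustration: the `Language` class, the `LANGUAGES` table, the file names and the two example entries are assumptions, not the project's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    source_file: str   # file name the submitted code is saved under (assumed)
    compile_cmd: str   # empty string for interpreted languages
    run_cmd: str       # command that executes the program


# Hypothetical entries; a real table would list every installed compiler.
LANGUAGES = {
    "c": Language("main.c", "gcc main.c -o main", "./main"),
    "python": Language("main.py", "", "python3 main.py"),
}


def inner_command(language_key):
    """Return the shell command that compiles (if needed) and runs a submission."""
    lang = LANGUAGES[language_key]  # raises KeyError for unsupported languages
    if lang.compile_cmd:
        # Only run the program when compilation succeeded.
        return f"{lang.compile_cmd} && {lang.run_cmd}"
    return lang.run_cmd
```

With a table like this, supporting another language means adding one entry rather than writing new server logic.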

But there is one problem: you can’t just trust any code your users submit. The main problem is (wait for it!) security. What if the not-so-malicious-looking submitted code is actually evil and tries to damage or take control of the server? We also had to allow native binaries to run.

Even if you could somehow resolve the security issue using intricate commands to restrict resources and execution time, there was still the issue of efficiency. With the phrase “complexity is your enemy” in mind, we had to keep efficiency in consideration too! In short, we were adamant about making life difficult for ourselves.

The untrusted program needs some restrictions: it should not have access to server files, should not be able to open ports, should not consume a lot of resources, and should terminate after some time. The effort needed to turn this code into a ‘good boy’ was interesting and challenging. But, as developers usually do, we managed to come up with a solution, and we are happy with it.
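Of the restrictions above, the time limit is the easiest to show in isolation. The following is a minimal sketch, assuming a plain child process on the server; the function name `run_with_timeout` is made up for illustration. It does nothing about file access, ports or memory, which is exactly why a time limit alone is not enough.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run a command and give up if it exceeds the wall-clock limit.

    Returns (return_code, stdout), or (None, "timed out") when the limit is hit.
    subprocess.run kills the child process itself when the timeout expires.
    """
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None, "timed out"
    return done.returncode, done.stdout


if __name__ == "__main__":
    # A well-behaved program finishes; an endless one is cut off.
    print(run_with_timeout([sys.executable, "-c", "print('hi')"], 5))
    print(run_with_timeout([sys.executable, "-c", "while True: pass"], 1))
```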

Early Brainstorming

- Chroot jail: What was the first thing that occurred to us? No points for guessing: it was none other than the chroot jail. But before we could proceed any further, we realised this may not be the best way to go, as stated by Josh Bressers on SecurityBlog. Seeing tutorials that teach how to break out of a chroot jail proved to be the final nail in the coffin, and we decided to try something else.

- Ideone: Does exactly what we wanted. But then, things that do exactly what you want don’t really come for free (our solution does, though). Ideone, though ideal, turned out to be rather costly in our initial usage. Not to mention that Sphere Engine, which powers Ideone, takes a significant amount of time to compile even a small piece of source code. Needless to say, we decided to move on.

- Using VMware/VirtualBox: As safe as you could expect, and it does the job pretty well. But if the virtual machine were compromised, detecting that and recreating it would not be time-efficient. VirtualBox is also best handled through a GUI, which is a constraint given that servers are mostly managed through a terminal. So we packed our bags once more and moved on to the next possible solution.

 

The Breakthrough

We had to be patient, and eventually we found a way to get the job done. Thanks to Docker.io, we built an Ubuntu-based Docker image and decided to use its instances (the containers) to run the untrusted code. Using Docker gave us the following benefits:

- If malicious code attempts to destroy the system, its effects should remain inside the container it is running in.

- Containers don’t have to run all the time; they are created when needed and destroyed once they have completed their job.

- Each container is a running instance of one image, and we had all our compilers installed inside that image.

- Once the output file is created, we send its contents back to the client.

Tada! It sounded simple yet effective, so we decided to give it a go.

EDIT: As many of you have (rightly) pointed out, our claims about the “security” side of Docker were somewhat misleading: Docker is good for achieving isolation, but not so much for security. So far, however, Docker has done the job pretty well for us. Pull requests are always welcome. We are also looking for ways to further improve the security of the compilation step, and are looking into integrating SELinux and AppArmor into our code.

Working Our Way Through

Our task at hand was divided into two phases.

1. Create a Dockerfile which installs all the relevant compilers in the image.

2. Make an API/Supervisor which receives code from the user, runs it in a Docker container, and gets back the output.

Creating an image with Docker is easy, provided your internet connection is having a good day. You write a Dockerfile and build it with Docker using the “build” command. See this link for a better idea.

Once we completed the Docker layer, we moved on to writing a supervisor using a shell script, with Node.js for the API. The supervisor creates a new container and compiles the given program using the selected compiler. Both compilation and execution are carried out inside the Docker container. The output is redirected to a file, from which the supervisor reads the contents and returns them to the client app.

Execution is quick, container usage is sleek, and the image itself is pretty lightweight.
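The supervisor flow described above can be sketched roughly as follows. The real supervisor is a shell script behind a Node.js API; Python is used here only for illustration. The function names, the `/sandbox` mount point, the `output.txt` file name and the specific limits are all assumptions, not the project's actual values. The `docker run` flags themselves (`--rm`, `--network none`, `--memory`, `--cpus`, `--pids-limit`, `-v`, `-w`) are standard Docker CLI options.

```python
import subprocess
import uuid
from pathlib import Path


def build_docker_command(image, host_dir, inner_command, name,
                         memory="256m", cpus="1", pids_limit=64):
    """Build the argv for one throwaway container (limit values are illustrative)."""
    return [
        "docker", "run",
        "--rm",                          # remove the container once it exits
        "--name", name,                  # lets the supervisor kill it on timeout
        "--network", "none",             # no network access from inside
        "--memory", memory,              # cap memory usage
        "--cpus", cpus,                  # cap CPU usage
        "--pids-limit", str(pids_limit), # cap the number of processes
        "-v", f"{host_dir}:/sandbox",    # only this directory is shared
        "-w", "/sandbox",
        image,
        # Redirect everything the submission prints into a file in the shared directory.
        "sh", "-c", f"( {inner_command} ) > output.txt 2>&1",
    ]


def run_submission(image, host_dir, source_file, source_code, inner_command, timeout_s=20):
    """Write the code to disk, run it in a fresh container, return the captured output."""
    work = Path(host_dir).resolve()
    work.mkdir(parents=True, exist_ok=True)
    (work / source_file).write_text(source_code)

    name = f"sandbox-{uuid.uuid4().hex}"
    cmd = build_docker_command(image, str(work), inner_command, name)
    try:
        subprocess.run(cmd, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        # The timeout only stops the docker client, so kill the container explicitly.
        subprocess.run(["docker", "kill", name], check=False)
        return "Execution timed out"

    output = work / "output.txt"
    return output.read_text() if output.exists() else ""
```

A call such as `run_submission("my-sandbox-image", "/tmp/job1", "main.c", code, "gcc main.c -o main && ./main")` would then return whatever the program printed, where `my-sandbox-image` stands for whatever image was built from the Dockerfile.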

Here is a visual for the whole system.

Arch.png

 

Saying Goodbye

With everything said and done, all that is left is for you to try the finished product and see it working here and here. Alternatively, you can find the source code on GitHub and let us know what you think. Thanks in advance for any bug reports!

That would be that,

Cheers!

keep-calm-and-compile-the-code.png