How to Manage your Docker Image

This is detailed documentation about managing your Docker image for your Native Application in BaseSpace. This document assumes that the developer has already set up the local development environment for Native Apps. Please ensure that you have the Native Apps virtual machine running locally before proceeding.


Getting Familiar with Docker

Native apps are composed of three essential components: the input form, the docker image, and the output report. A Docker image is essentially a lightweight Virtual Machine with only the filesystem and dependencies for the app installed, without any system level configuration. The Docker image is stored in a Docker Registry where BaseSpace can access it.

Docker is a relatively new project, and many developers are not yet familiar with it. We strongly encourage developers to spend some time learning how Docker works. The Docker website has extensive documentation that is comprehensive and easy to follow, with good examples. This guide highlights many common Docker use cases, but for more advanced use please refer to the Docker Documentation.

Docker offers a great Interactive Tutorial on their website that will take you through the fundamentals of Docker. We encourage developers to try out this interactive tutorial if they are not familiar with Docker. In a few quick steps, you'll learn the basics of Docker!

There is also a great Docker tutorial available at http://www.coolgarif.com/brain-food/using-docker-as-a-development-environment which will familiarize new developers with the concept of Docker and how to use it as a development environment.


Quick Docker Crash Course

Think of your Docker image as a shipping container for your application. That shipping container can hold all different types of code, files, and dependencies that are needed to run your script. The contents of the container can vary, but the Docker images (containers) for BaseSpace apps can only be run in Linux environments.

When you are working with Docker locally, here are a few things to note:

  1. Docker is installed and running inside of your local Native Apps Virtual Machine in Virtual Box

  2. You have to SSH into your VM, and then you can run a Docker image within that VM

  3. When you run a Docker image locally, it is an interactive environment, so you can add files, mount folders, and much more for your application

  4. Once you are satisfied with your Docker application and have tested it with BaseSpace, you can package the Docker image and "push" it to the Docker registry (similar to the github architecture)

  5. BaseSpace can then access the Docker image for your application and run it in the BaseSpace environment on an AWS machine with the following computational stats:

Native Apps AWS Instance Type(s)

  • Instance Family: General Purpose
  • Instance Type: cc2.8xlarge
  • Processor Arch: 64-bit
  • vCPU: 32
  • ECU: 88
  • Memory (GiB): 60.5
  • Instance Storage (GB): 4x840
  • EBS-optimized Available: -
  • Network Performance: 10 Gigabit

Note: For now, only one instance type is available, but in the future the developer will be able to specify their compute requirements via the callbacks.js script in Forms builder.

The following slides on the Docker website provide an overview of Docker:

http://www.docker.io/learn_more/


Docker Term Cheat Sheet

This is a cheat sheet of the most common Docker terms that will be used throughout the documentation.

Docker Service

The Docker service is the Docker daemon that is running on the Native Apps Virtual Machine. While this service is running, it will allow you to make all of the Docker commands listed below on that machine. The Docker service is also installed on the AMI that BaseSpace will spin up when the app is launched in the BaseSpace/Amazon cloud infrastructure.

Docker Image

A Docker image is a file system that wraps up your software or analysis pipeline. The image is a collection of all of the files that make up the app, and each change to the original image is stored as a separate layer. Each time you commit to a docker image, you are creating a new layer on the docker image, but the original image and each layer remain unchanged.

Docker images are described in more detail in the Docker Documentation.

Docker Registry

A Docker Registry is where Docker Images can be stored, and eventually accessed by BaseSpace on a user's behalf when running an app. Using the docker push command, you can send your docker image to the Registry to be stored and saved. A Docker Image is stored within a Repository in the Docker Registry. Each Repository is unique for each user or account.

There are two registries available for BaseSpace apps, one is the Public Docker Registry and the other is the BaseSpace Private Docker Registry.

Docker Repository

A Docker Repository is a namespace that is used to store a Docker Image. For instance, if your app is named helloworld and your username or namespace for the Registry is test, the Docker Repository where this image would be stored in the Docker Registry would be named test/helloworld.
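
As a small illustration of the naming scheme described above, here is a sketch of a shell helper that assembles a repository name from its parts. The function name repo_name is hypothetical and not part of Docker; the optional third argument is a registry host such as docker.illumina.com.

```shell
# Hypothetical helper (not a Docker command): build a repository name.
# Usage: repo_name NAMESPACE IMAGE [REGISTRY_HOST]
repo_name() {
  if [ -n "${3:-}" ]; then
    # A registry host was given, so prefix it to the repository name
    printf '%s/%s/%s\n' "$3" "$1" "$2"
  else
    printf '%s/%s\n' "$1" "$2"
  fi
}

repo_name test helloworld                      # prints test/helloworld
repo_name test helloworld docker.illumina.com  # prints docker.illumina.com/test/helloworld
```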


Docker Command Cheat Sheet

This is a cheat sheet of the most common Docker commands that you will use while developing your Native App. For more detailed documentation of Docker commands, please refer to the Docker Documentation.

Docker Pull

  • sudo docker pull repo_name : download a Docker image from the Docker registry

Docker Images (List Images)

  • sudo docker images : list all of the locally downloaded images
  • sudo docker images --digests=true : list all of the locally downloaded Docker images with their digests.

Docker Run

  • sudo docker run -i -t repo_name command_to_run : run a Docker image interactively with a certain command and jump into the running container.

Repo_name = [docker_username]/[docker_image_name]. Command_to_run = the command to be run in the interactive Docker container; we commonly use /bin/bash.

Docker ps -a (List Containers)

  • sudo docker ps -a : list all of the local Docker containers (that were previously executed with a Run command)

Docker Commit (Save Changes Locally)

  • sudo docker commit container_id image_name : commit the changes made in a local Docker container so that they are saved to a local image.

The container_id can be found with the sudo docker ps -a command above. The image_name is the name of the local image where this change should be applied; it uses the same format as the repo_name above.

Docker Push (Save Updated Image in Docker Registry)

  • sudo docker push repo_name : push all locally committed changes to the docker registry and update the docker image specified by the repo_name with the changes.

Repo_name = [docker_username]/[docker_image_name]

Docker Stop

  • sudo docker stop $(sudo docker ps -a -q) : stop all locally running Docker containers

Docker rm (Remove/Delete Docker Containers or Images)

  • sudo docker rm $(sudo docker ps -a -q) : remove/delete all local Docker containers
  • sudo docker rmi $(sudo docker images -q) : remove all Docker images from your local machine (images are removed with rmi, not rm)

Prerequisites

Before you begin using the following, you must have the local development environment set up for Native Applications. In addition, the Native Apps Virtual Machine needs to be installed and running. When you have the terminal where you are logged into the virtual machine via SSH, you will be able to interact with Docker.

If you are not yet at this state, please refer to Setting Up Your Native Dev Environment in the documentation on the developer portal.


Docker Registries

The Docker Registry is where Docker Images are stored. For BaseSpace Native Apps, both the Public Docker Registry and a Private Docker Registry are available for use.

Public Docker Registry

The Public Docker Registry is hosted by Docker at index.docker.io. All images stored in the Public Docker Registry can be discovered by other Docker users via index.docker.io.

Privacy

These images are all public; any other user may pull down your image(s). Other users do not have the ability to modify any images stored within your Docker repository.

How to Push to the Public Docker Registry

Once you are ready to store your docker image in the Public Docker Registry, you can simply type

sudo docker push repository_name

Where repository_name is the name specified locally when the image is committed. When using the Public Docker Registry, the repository_name has the following structure: [username]/[image_name].

Note: If this is your first time pushing to the Public Registry, you will be asked to create a username, password, and to register an email address. Before you can push, you will have to confirm your email address.

How to Pull from the Public Docker Registry

To pull an image from the Public Docker Registry, you can simply type

sudo docker pull repository_name

Where repository_name is the name under which the image was committed and pushed. When using the Public Docker Registry, the repository_name has the following structure: [username]/[image_name].

The pull request will begin pulling all of the layers of the image locally. Some images are larger than others so this request may take some time to complete.

BaseSpace Private Docker Registry

The BaseSpace Private Docker Registry is hosted and maintained by the BaseSpace team. It is hosted at docker.illumina.com.

Privacy

All images stored in the Private Docker Registry are accessible only by the developer. A repository is created in BaseSpace via the developer portal where the developer can then push and store their images.

Creating Your Docker Repository

  1. Go to the BaseSpace Developer Portal
  2. Click on My Apps
  3. Click on Docker Repositories

  4. Enter a new namespace. You will store all of your Docker images under this namespace once it is created. This is analogous to the Docker username for the Public Registry but is not used to log in to the Private Registry. Namespaces must be unique across all namespaces in the Private Registry, and some are restricted from use (e.g. basespace).

  5. Click Create Namespace

  6. The namespace is now created and you may store your images in the BaseSpace Private Docker Registry

How to Log in to the Private Docker Registry

Logging in to the Private Registry is slightly different than logging in to the Public Registry. When asked to log in to the Private Registry, you will be required to provide the following:

  • Username: This is your BaseSpace username. In all cases this is the email address that you use to log in to BaseSpace.
  • Password: This is your BaseSpace password, the one you use to log in to your BaseSpace account.
  • Email: This is your BaseSpace email. In most cases this will be the same as your Username.

With the Public Registry, the Username and Email are different and the Username is the namespace. With the Private Registry, the Username and Email are the same and the Username is not the namespace.

How to Push to the Private Docker Registry

Once you are ready to store your docker image in the Private Docker Registry, you can simply type

sudo docker push docker.illumina.com/repository_name

Where docker.illumina.com/repository_name is the name specified locally when the image is committed. When using the Private Docker Registry, the repository_name has the following structure: [namespace]/[image_name]. The namespace is created in the Developer Portal.

Alternatively, you can also tag the images with docker.illumina.com and push them with just the repository_name. This can be used to easily push between the Public and Private Docker Registries.

Note: You will be asked to log in with a username, password, and an email address.
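
The tagging approach mentioned above could look roughly like the following sketch. The helper name push_private, the DRY_RUN variable, and the run wrapper are assumptions made for this example and are not part of Docker or BaseSpace. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (assumption for this sketch)
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: log in, re-tag a local image for the private registry, and push it.
# Usage: push_private NAMESPACE/IMAGE_NAME
push_private() {
  run sudo docker login docker.illumina.com
  run sudo docker tag "$1" "docker.illumina.com/$1"
  run sudo docker push "docker.illumina.com/$1"
}

push_private test/helloworld
```

The same local image keeps its original name, so it can still be pushed to the Public Registry under test/helloworld.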

How to Pull from the Private Docker Registry

To pull an image from the Private Docker Registry, you can simply type

sudo docker pull docker.illumina.com/repository_name

Where docker.illumina.com/repository_name is the name under which the image was committed and pushed. When using the Private Docker Registry, the repository_name has the following structure: [namespace]/[image_name]. The namespace is created in the Developer Portal.

The pull request will begin pulling all of the layers of the image locally. Some images are larger than others so this request may take some time to complete.


Modify an Existing Docker Image to Create Your Own

To begin creating your own Docker container, you can use the existing hello world Docker container as a base image or you can use a different existing container in the Docker registry.

Running Hello World on the base Ubuntu Image

There are many base images in the public Docker registry. Two good starting points are the ubuntu and busybox images.

When you are logged in to the Virtual Machine terminal, type the following:

sudo docker pull ubuntu

This will download (pull) the ubuntu Docker image to your locally running Native App Virtual Machine. This image has a command called echo in its bin folder; we can run this command using the following:

sudo docker run ubuntu /bin/echo hello world

Definitions:

  • sudo - means to execute the following command as the root user of the Virtual Machine

  • docker run - means to run the command in a new container

  • ubuntu - the name of the image that we want to run the command inside of

  • /bin/echo - the command that we want to run in the container

  • hello world - is the input for the echo command

The response should echo back hello world. This means that you have the basic image downloaded.

For more detailed information, please refer to the Docker Hello World Documentation.

Pull another Docker Image

To pull a different Docker image from the public Docker registry, simply type in the following:

sudo docker pull [username]/[image_name]

The base Docker images created by the Docker team are listed here: https://index.docker.io/u/library/. Any of these images can be used as a base image for a BaseSpace Native App.

Example

If we want to pull down the basespace/fastqc Docker image, basespace is the Docker username and fastqc is the name of the Docker image:

sudo docker pull basespace/fastqc

This will download the fastqc Docker image to your local Virtual Machine. Many other Docker images can be found in the public Docker registry.

Here is an example of a Python web app using an existing Docker image as the base:

http://docs.docker.io/en/latest/examples/python_web_app/

Create your Own Docker Image

Once you are logged into the Native App Virtual Machine via SSH and have downloaded the base Docker IMAGE from the public repository that you would like to use as your starting point, you are ready to begin using the interactive Docker environment to build your new image!

The steps to create your new Docker image for your application are easy:

  1. Run your base Docker image in interactive mode using -i in the run command

    • Use the Virtual Box Shared Folders feature to first mount a folder on the Native VM, then make that folder available inside your Docker container by adding -v /host_folder:/image_folder, where host_folder is the folder in the VM and image_folder is the folder inside the container where the files should appear.
  2. Attach to the Docker run process

  3. Now you are in the Docker container, you can do a number of actions

    • Install dependencies by using apt-get
    • Install and run VIM to edit files within your new image
    • Use an FTP server to pull necessary files into your Docker image
  4. Once you are satisfied with the changes you have made to the container and have tested your application locally through the Send to Local Agent feature in the Formbuilder tool, you can commit your image changes locally and then push those changes to a new or existing image in the public Docker registry or the BaseSpace Private Docker Registry

Start an interactive Docker container from a downloaded image and attach to the process

Provided that the name of the Docker image that you wish to work with interactively is IMAGE, you can use the following commands to run an interactive container for that Docker image:

ID=$(sudo docker run -i -t -d IMAGE /bin/bash)

sudo docker attach $ID

Note: If there is a folder mounted on the Native Virtual Machine that you wish to make available in your interactive Docker container, simply add -v /host_folder:/image_folder to the run command, where host_folder is the folder in the VM and image_folder is the folder inside the container where the files should appear.

Example with mounted folder:

ID=$(sudo docker run -i -t -d -v /host_folder:/image_folder IMAGE /bin/bash)

sudo docker attach $ID  

Your terminal prompt will now change to the root user and you will be inside the Docker container. To make the mounted folder read-only, use -v /host_folder:/image_folder:ro instead. Now you can use the following common commands to interact with your Docker image and prepare it to be a BaseSpace application.
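
As a rough sketch of the mount options described above, the following hypothetical helper builds the run command with or without the :ro suffix. The names start_container, run, and DRY_RUN are assumptions made for this example and are not part of Docker. The script only prints the command unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (assumption for this sketch)
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: start an interactive container with a mounted folder.
# Usage: start_container IMAGE HOST_FOLDER IMAGE_FOLDER [ro]
start_container() {
  if [ "${4:-}" = "ro" ]; then
    # The :ro suffix makes the mounted folder read-only inside the container
    run sudo docker run -i -t -v "$2:$3:ro" "$1" /bin/bash
  else
    run sudo docker run -i -t -v "$2:$3" "$1" /bin/bash
  fi
}

start_container IMAGE /host_folder /image_folder ro
```

Without the -d flag, the run command stays in the foreground and drops you directly into the container shell, so no separate attach step is needed.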


Useful Ubuntu Commands

All Ubuntu commands are supported in the terminal. Packages can be installed in the Docker container using apt-get; here is an example where the VIM editor is installed:

apt-get install vim
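
On a freshly pulled base image the package index may be empty or out of date, so it is often necessary to refresh it before installing anything. The sketch below shows one way to do that from inside the container, where you are already root. The names install_packages, run, and DRY_RUN are assumptions made for this example; the script only prints the commands unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (assumption for this sketch)
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: refresh the package index, then install the given packages.
# Usage: install_packages PACKAGE [PACKAGE...]
install_packages() {
  run apt-get update
  # -y answers "yes" to the confirmation prompt
  run apt-get install -y "$@"
}

install_packages vim
```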

You can also install ftp clients or the scp command to transfer files onto your Docker container:

scp mtyagi@sd-qmaster:/home/mtyagi/somefile.txt .

Install anything your application needs in order to run, including SDKs, files, reference genomes, and more.


Adding files to your Image

Use the Shared Folders feature in Virtual Box to share a folder between your host machine and the Native App Virtual Machine.

Once the share is created, create the folder on the Virtual Machine where you wish to mount the shared files.

After the folder is created, use

sudo mount -t vboxsf -o uid=$(id -u),gid=$(id -g) share [destination_folder_name]

to mount that volume to the destination folder on the Virtual Machine.

Now, add the mounted folder to the run command above when creating an interactive Docker container as shown in that portion of this guide. You will now be able to access this data and copy it into your Docker image for the app's use!
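
The folder creation and mount steps above can be sketched as a small script. The names mount_share, run, and DRY_RUN are assumptions made for this example, and share stands for whatever share name you chose in Virtual Box. The sketch uses id -u and id -g to look up the user and group IDs, because bash does not set a GID variable by default. It only prints the commands unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (assumption for this sketch)
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: create the destination folder and mount a Virtual Box share on it.
# Usage: mount_share SHARE_NAME DESTINATION_FOLDER
mount_share() {
  run mkdir -p "$2"
  # Mount with the current user's IDs so the files are writable without sudo
  run sudo mount -t vboxsf -o "uid=$(id -u),gid=$(id -g)" "$1" "$2"
}

mount_share share "$HOME/shared"
```

The destination folder can then be passed as the host_folder part of the -v option when starting the interactive container.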


Genomes Folder

In your Native Apps Virtual Machine, you will find the genomes folder. On your local Virtual Machine, this folder contains only the PhiX reference genome. To reduce the size of the Virtual Machine, only a limited set of genomes is available for local testing. However, we are working on integrating all of iGenomes and many other references into our genomes folder.

In addition, the genomes folder in the Virtual Machine that runs in BaseSpace (instead of locally) has a more robust collection of reference genomes. We will post a list of what we have available shortly.

The genomes folder can be duplicated on your Docker image by following the steps in the Adding files to your Image section above.


Commit Changes to your Image Locally

Once you have finished creating and testing your Native App locally and would like to commit your changes, first exit the Docker container by typing:

exit

You should now be back in the Virtual Machine's terminal. Type the following:

sudo docker commit $ID [docker_username]/[image_name]

For example, to commit changes as the user basespace for the fastqc_demo app:

ID=$(sudo docker run -i -t -d basespace/fastqc_demo /bin/bash)

sudo docker commit $ID basespace/image_name

Where image_name can be fastqc_demo or a new name.

Note: Ensure that unwanted files are removed from your Docker image before committing or pushing these changes because all of those files will be bundled into the Docker image.
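
If the $ID variable is no longer set (for example, because you opened a new terminal), one possible way to find the container is to ask Docker for the most recently created one with sudo docker ps -l -q. The helper below is a sketch under that assumption. The names commit_latest and DRY_RUN are hypothetical, and the script only prints the command unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (assumption for this sketch)
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: commit the most recently created container to an image.
# Caution: "docker ps -l" picks the latest container, which is only the right
# one if you have not created another container since.
# Usage: commit_latest DOCKER_USERNAME/IMAGE_NAME
commit_latest() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "sudo docker commit \$(sudo docker ps -l -q) $1"
  else
    sudo docker commit "$(sudo docker ps -l -q)" "$1"
  fi
}

commit_latest basespace/fastqc_demo
```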


Push changes to your Docker Image to the Docker Registry

Once you are satisfied with your committed changes, it is time to push those changes to the Docker registry so that you can point BaseSpace to your application.

To send your Docker image to the public Docker registry, simply type the following:

sudo docker push [docker_username]/[image_name]

For example, to push changes for the basespace/fastqc_demo app:

sudo docker push basespace/fastqc_demo

Known Docker Limitations

  1. At the moment, in Docker, there is a limit on the number of aufs layers that can exist in the Docker image. This number is 42, and essentially a new layer is created on each commit. However, there are a few workarounds that have been implemented by the community that are discussed here: https://github.com/dotcloud/docker/issues/1171.

    • Exporting and importing the image is one solution, but it results in the loss of the image's commit history, so it is not ideal. It is discussed by solomonstre in the thread linked above.
    • Here is the general export/import workflow in Docker:
      1. sudo docker export $CONTAINER_ID > image.tar
      2. cat image.tar | sudo docker import - image_flat
  2. Other Docker developers have reported issues when trying to push large commit layers to the Docker registry; this sometimes results in a failure to push the new Docker image to the registry. Docker is aware of this and actively working on a solution.

  3. At the moment, we do not allow outgoing requests or connections from your Docker image. We are working on creating a white list of web domains that Native Apps can access but this is not yet in place.
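
The export/import workaround from the first limitation can also be done in a single pipeline without an intermediate tar file. The following is a minimal sketch: the names flatten_image and DRY_RUN, the container ID abc123, and the image name test/helloworld_flat are all hypothetical. The script only prints the command unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (assumption for this sketch)
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: flatten a container's filesystem into a new single-layer image.
# Note that the commit history of the original image is lost.
# Usage: flatten_image CONTAINER_ID NEW_IMAGE_NAME
flatten_image() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "sudo docker export $1 | sudo docker import - $2"
  else
    # Stream the exported filesystem straight into import ("-" reads from stdin)
    sudo docker export "$1" | sudo docker import - "$2"
  fi
}

flatten_image abc123 test/helloworld_flat
```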