This is detailed documentation about managing your Docker image for your Native Application in BaseSpace. This document assumes that the developer has already set up the local development environment for Native Apps. Please ensure that you have the Native Apps virtual machine running locally before proceeding.
Native apps are composed of three essential components: the input form, the docker image, and the output report. A Docker image is essentially a lightweight Virtual Machine with only the filesystem and dependencies for the app installed, without any system level configuration. The Docker image is stored in a Docker Registry where BaseSpace can access it.
Docker is a relatively new project, and many developers are not yet familiar with it. We encourage developers to spend some time learning how Docker works. The Docker website has extensive documentation with clear examples. This guide highlights many common Docker use cases; for more advanced use of Docker, please refer to the Docker Documentation.
Docker offers a great Interactive Tutorial on their website that will take you through the fundamentals of Docker. We encourage developers to try out this interactive tutorial if they are not familiar with Docker. In a few quick steps, you'll learn the basics of Docker!
There is also a great Docker tutorial available at http://www.coolgarif.com/brain-food/using-docker-as-a-development-environment which will familiarize new developers with the concept of Docker and how to use it as a development environment.
Think of your Docker image as a shipping container for your application. That shipping container can hold all different types of code, files, and dependencies that are needed to run your script. The contents of the container can vary, but the Docker images (containers) for BaseSpace apps can only be run in Linux environments.
When you are working with Docker locally, here are a few things to note:
Docker is installed and running inside of your local Native Apps Virtual Machine in VirtualBox
You have to SSH into your VM, and then you can run a Docker image within that VM
When you run a Docker image locally, it is an interactive environment, so you can add files, mount folders, and much more for your application
Once you are satisfied with your Docker application and have tested it with BaseSpace, you can package the Docker image and "push" it to the Docker registry (similar to the GitHub workflow)
BaseSpace can then access the Docker image for your application and run it in the BaseSpace environment on an AWS machine with the following computational stats:
Native Apps AWS Instance Type(s)
Instance Family | Instance Type | Processor Arch | vCPU | ECU | Memory (GiB) | Instance Storage (GB) | EBS-optimized Available | Network Performance |
---|---|---|---|---|---|---|---|---|
General Purpose | cc2.8xlarge | 64-bit | 32 | 88 | 60.5 | 4x840 | - | 10 Gigabit |
Note: For now, only one instance type is available, but in the future the developer will be able to specify their compute requirements via the callbacks.js script in Forms builder.
The following slides on the Docker website provide an overview of Docker:
http://www.docker.io/learn_more/
This is a cheat sheet of the most common Docker terms that will be used throughout the documentation.
The Docker service is the Docker daemon that is running on the Native Apps Virtual Machine. While this service is running, it will allow you to make all of the Docker commands listed below on that machine. The Docker service is also installed on the AMI that BaseSpace will spin up when the app is launched in the BaseSpace/Amazon cloud infrastructure.
A Docker image is a file system that wraps up your software or analysis pipeline. The image is a collection of all of the files that make up the app, and each change to the original image is stored as a separate layer. Each time you commit to a docker image, you are creating a new layer on the docker image, but the original image and each layer remain unchanged.
Docker images are described in more detail in the Docker Documentation.
A Docker Registry is where Docker Images can be stored, and eventually accessed by BaseSpace on a user's behalf when running an app. Using the docker push command, you can send your Docker image to the Registry to be stored. A Docker Image is stored within a Repository in the Docker Registry. Each Repository is unique for each user or account.
There are two registries available for BaseSpace apps: the Public Docker Registry and the BaseSpace Private Docker Registry.
A Docker Repository is a namespace that is used to store a Docker Image. For instance, if your app is named helloworld and your username or namespace for the Registry is test, the Docker Repository where this image would be stored in the Docker Registry would be named test/helloworld.
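The naming rule above can be sketched as a small shell helper. This is only an illustration: the function name repo_name is hypothetical, and the optional registry prefix assumes the docker.illumina.com host used for the Private Registry later in this guide.

```shell
# Build a repository name from a namespace (or username) and an image name.
# An optional third argument is a registry host to put in front of the name.
repo_name() {
    namespace=$1
    image=$2
    registry=${3:-}
    if [ -n "$registry" ]; then
        printf '%s/%s/%s\n' "$registry" "$namespace" "$image"
    else
        printf '%s/%s\n' "$namespace" "$image"
    fi
}

repo_name test helloworld                       # prints test/helloworld
repo_name test helloworld docker.illumina.com   # prints docker.illumina.com/test/helloworld
```

The helper only formats strings; it does not contact any registry.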
This is a cheat sheet of the most common Docker commands that you will use while developing your Native App. For more detailed documentation of Docker commands, please refer to the Docker Documentation.
sudo docker pull: download a Docker image from the Docker registry
sudo docker images: list all of the locally downloaded images
sudo docker images --digests=true: list all of the locally downloaded Docker images with their digests. This is needed for application publishing in BaseSpace.
sudo docker run -i -t repo_name command_to_run: run a Docker image interactively with a certain command and jump into the running container. repo_name = [docker_username]/[docker_image_name]. command_to_run = the command to be run in the interactive Docker container; commonly we use the /bin/bash command.
sudo docker ps -a: list all of the local Docker containers (that were previously executed with a run command)
sudo docker commit container_id image_name: commit the changes made to a local Docker container so that they are saved. The container_id can be found from the above sudo docker ps -a command. The image_name is the name of the local image where this change should be applied; it has the same format as the repo_name above.
sudo docker push repo_name: push all locally committed changes to the Docker registry and update the Docker image specified by the repo_name with the changes. repo_name = [docker_username]/[docker_image_name]
sudo docker stop $(sudo docker ps -a -q): stop all locally running Docker containers
sudo docker rm $(sudo docker ps -a -q): remove/delete all local Docker containers
sudo docker rmi $(sudo docker images -q): remove all Docker images from your local machine
Before you begin using the following, you must have the local development environment set up for Native Applications. In addition, the Native Apps Virtual Machine needs to be installed and running. Once you have a terminal that is logged in to the virtual machine via SSH, you will be able to interact with Docker.
If you are not yet at this state, please refer to Setting Up Your Native Dev Environment in the documentation on the developer portal.
The Docker Registry is where Docker Images are stored. For BaseSpace Native Apps, both the Public Docker Registry and a Private Docker Registry are available for use.
The Public Docker Registry is hosted by Docker at index.docker.io. All images stored in the Public Docker Registry can be discovered by other Docker users via index.docker.io.
Privacy
These images are all public; any other user may pull down your image(s). Other users do not have the ability to modify any images stored within your Docker repository.
How to Push to the Public Docker Registry
Once you are ready to store your docker image in the Public Docker Registry, you can simply type
sudo docker push repository_name
Where repository_name is the name specified locally when the image is committed. When using the Public Docker Registry, repository_name has the following structure: [username]/[image_name].
Note: If this is your first time pushing to the Public Registry, you will be asked to create a username and password and to register an email address. Before you can push, you will have to confirm your email address.
How to Pull from the Public Docker Registry
To pull an image from the Public Docker Registry, you can simply type
sudo docker pull repository_name
Where repository_name is the name specified locally when the image was committed. When using the Public Docker Registry, repository_name has the following structure: [username]/[image_name].
The pull request will begin pulling all of the layers of the image locally. Some images are larger than others so this request may take some time to complete.
The BaseSpace Private Docker Registry is hosted and maintained by the BaseSpace team. It is hosted at docker.illumina.com.
Privacy
All images stored in the Private Docker Registry are accessible only by the developer. A repository is created in BaseSpace via the developer portal where the developer can then push and store their images.
Creating Your Docker Repository
Click on Docker Repositories
Enter a new namespace. You will store all of your Docker images under this namespace once it is created. This is analogous to the Docker username for the Public Registry but is not used to log in to the Private Registry. Namespaces must be unique across the Private Registry, and some are restricted from use (e.g. basespace).
Click Create Namespace
The namespace is now created and you may store your images in the BaseSpace Private Docker Registry
How to Log in to the Private Docker Registry
Logging in to the Private Registry is slightly different than logging in to the Public Registry. When asked to log in to the Private Registry, you will be required to provide the following:
With the Public Registry, the Username and Email are different and the Username is the namespace. With the Private Registry, the Username and Email are the same and the Username is not the namespace.
How to Push to the Private Docker Registry
Once you are ready to store your docker image in the Private Docker Registry, you can simply type
sudo docker push docker.illumina.com/repository_name
Where docker.illumina.com/repository_name is the name specified locally when the image is committed. When using the Private Docker Registry, repository_name has the following structure: [namespace]/[image_name]. The namespace is created in the Developer Portal.
Alternatively, you can also tag the images with docker.illumina.com and push them with just the repository_name. This can be used to easily push between the Public and Private Docker Registries.
Note: You will be asked to login with a username, password, and an email address.
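One reading of the tagging approach described above can be sketched as follows. It assumes the standard docker tag SOURCE TARGET command; the helper name private_push_commands and the repository test/helloworld are placeholders. The helper prints the commands rather than running them, so they can be reviewed first.

```shell
# Print the commands that tag a local image for the Private Registry and push it.
# Nothing is executed here; paste the printed lines into the VM terminal to run them.
private_push_commands() {
    repository=$1                 # [namespace]/[image_name]
    registry=docker.illumina.com  # Private Registry host named in this guide
    echo "sudo docker tag $repository $registry/$repository"
    echo "sudo docker push $registry/$repository"
}

private_push_commands test/helloworld
```

Keeping the untagged name around means the same local image can still be pushed to the Public Registry under [username]/[image_name].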
How to Pull from the Private Docker Registry
To pull an image from the Private Docker Registry, you can simply type
sudo docker pull docker.illumina.com/repository_name
Where docker.illumina.com/repository_name is the name specified locally when the image was committed. When using the Private Docker Registry, repository_name has the following structure: [namespace]/[image_name]. The namespace is created in the Developer Portal.
The pull request will begin pulling all of the layers of the image locally. Some images are larger than others so this request may take some time to complete.
To begin creating your own Docker container, you can use the existing hello world Docker container as a base image or you can use a different existing container in the Docker registry.
There are many base containers in the public Docker registry. Two good starting points are the ubuntu and busybox containers.
When you are logged in to the Virtual Machine terminal, type the following:
sudo docker pull ubuntu
This will download (pull) the ubuntu Docker container to your locally running Native App Virtual Machine. This container has a command called echo in its bin folder, which we can run using the following:
sudo docker run ubuntu /bin/echo hello world
Definitions:
sudo - means to execute the following command as the root user of the Virtual Machine
docker run - means to run the command in a new container
ubuntu - the name of the image that we want to run the command inside of
/bin/echo - the command that we want to run in the container
hello world - is the input for the echo command
The response should echo back hello world. This means that you have the basic image downloaded.
For more detailed information, please refer to the Docker Hello World Documentation.
To pull a different Docker image from the public Docker registry, simply type in the following:
sudo docker pull [username]/[image_name]
Base Docker images created by the Docker team are listed here: https://index.docker.io/u/library/. Any of these images can be used as a base image for a BaseSpace Native App.
Example
If we want to pull down the basespace/fastqc Docker image, basespace is the Docker username and fastqc is the name of the Docker image:
sudo docker pull basespace/fastqc
This will download the fastqc Docker image to your local Virtual Machine. Many other Docker images can be found in the public Docker registry.
Here is an example of a Python web app using an existing Docker image as the base:
http://docs.docker.io/en/latest/examples/python_web_app/
Once you are logged in to the Virtual Machine via SSH and have downloaded the base Docker image (IMAGE) from the public repository that you would like to use as your starting point, you are ready to begin using the interactive Docker environment to build your new image!
The steps to create your new Docker image for your application are easy:
Run your base Docker image in interactive mode using -i in the run command
Optionally mount a folder by adding -v /host_folder:/image_folder, where host_folder is the name of the folder in the VM and image_folder is the name of the folder in the container where the files should appear
attach to the Docker run process
Now that you are in the Docker container, you can perform a number of actions, such as installing packages with apt-get
Once you are satisfied with the changes you have made to the container and have tested your application locally through the Send to Local Agent feature in the Formbuilder tool, you can commit your image changes locally and then push those changes to a new or existing image in the public Docker repository (and in the future, the private BaseSpace Docker registry)
Now the image is accessible by BaseSpace and, once you are ready to have it reviewed, we can review and publish the application for all users in BaseSpace
Provided that the name of the Docker image that you wish to work with interactively is IMAGE, you can use the following commands to run an interactive container for that Docker image:
ID=$(sudo docker run -i -t -d IMAGE /bin/bash)
sudo docker attach $ID
Note: If there is a folder on the Native Apps Virtual Machine that you wish to mount into your Docker container in the interactive state, simply add -v /host_folder:/image_folder to the run command, where host_folder is the name of the folder in the VM and image_folder is the name of the folder in the container where the files will appear.
Example with mounted folder:
ID=$(sudo docker run -i -t -d -v /host_folder:/image_folder IMAGE /bin/bash)
sudo docker attach $ID
Your terminal prompt will now change to the root user and you will be inside the Docker container. Now you can use the following common commands to interact with your Docker image and prepare it to be a BaseSpace application. To make the mounted folder read-only, use -v /host_folder:/image_folder:ro in the run command instead.
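As a rough sketch of how the mount options above fit together, the helper below assembles the run command with or without the read-only suffix. The function name interactive_run_command is hypothetical, and the command is only printed so that it can be checked before it is run on the Virtual Machine.

```shell
# Print a docker run command for an interactive container with a mounted folder.
# Pass "ro" as the fourth argument to mount the folder read-only.
interactive_run_command() {
    image=$1
    host_folder=$2
    image_folder=$3
    mode=${4:-}
    volume="$host_folder:$image_folder"
    if [ "$mode" = "ro" ]; then
        volume="$volume:ro"
    fi
    echo "sudo docker run -i -t -d -v $volume $image /bin/bash"
}

interactive_run_command IMAGE /host_folder /image_folder ro
```

To start the container, wrap the printed command in ID=$( ... ) and follow it with sudo docker attach $ID, as in the examples above.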
All Ubuntu commands are supported in the terminal. Packages can be installed in the Docker container using apt-get. Here is an example where the Vim editor is installed:
apt-get install vim
You can also install ftp clients or the scp command to transfer files onto your Docker container:
scp mtyagi@sd-qmaster:/home/mtyagi/somefile.txt .
Install anything your application needs in order to run, including SDKs, files, reference genomes, and more.
Use the Shared Folders feature in VirtualBox to share a folder between your host machine and the Native App Virtual Machine.
Once the share is created, create the folder on the Virtual Machine where you wish to mount the shared files.
After the folder is created, use
sudo mount -t vboxsf -o uid=$(id -u),gid=$(id -g) share [destination_folder_name]
to mount that volume to the destination folder on the Virtual Machine.
Now, add the mounted folder to the run command above when creating an interactive Docker container as shown in that portion of this guide. You will now be able to access this data and copy it into your Docker image for the app's use!
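The shared-folder steps above can be strung together as in the sketch below. The share name share comes from the mount command in this guide; the helper name and the folder paths are placeholders. The sketch uses id -u and id -g for the owner IDs because bash sets UID but does not set GID by default. The commands are printed, not executed.

```shell
# Print the steps that expose a VirtualBox share inside an interactive container.
shared_folder_commands() {
    destination=$1     # folder on the Virtual Machine to mount the share on
    image_folder=$2    # folder inside the container
    image=$3           # Docker image to run
    echo "mkdir -p $destination"
    echo "sudo mount -t vboxsf -o uid=\$(id -u),gid=\$(id -g) share $destination"
    echo "sudo docker run -i -t -d -v $destination:$image_folder $image /bin/bash"
}

shared_folder_commands /home/user/shared /data IMAGE
```

Files copied from the mounted folder into another location in the container become part of the image once the container is committed; the mount itself does not.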
In your Native Apps Virtual Machine, you will find the genomes folder. On your local Virtual Machine, this folder contains only the PhiX reference genome. To reduce the size of the Virtual Machine, only a limited set of genomes is available for local testing. However, we are working on integrating all of iGenomes and many other references into our genomes folder.
In addition, the genomes folder in the Virtual Machine that runs in BaseSpace (instead of locally) has a more robust collection of reference genomes. We will post a list of what we have available shortly.
The genomes folder can be duplicated on your Docker image by following the Adding Files to your Image steps above.
Once you have finished creating and testing your Native App locally and would like to commit your changes, first exit the Docker container by typing:
exit
You should now be in the Virtual Machine's terminal. Type the following:
sudo docker commit $ID [docker_username]/[image_name]
For example, to commit changes as the user basespace for the fastqc_demo app:
ID=$(sudo docker run -i -t -d basespace/fastqc_demo /bin/bash)
sudo docker commit $ID basespace/image_name
Where image_name can be fastqc_demo or a new name.
Note: Ensure that unwanted files are removed from your Docker image before committing or pushing these changes because all of those files will be bundled into the Docker image.
Once you are satisfied with your committed changes, it is time to push those changes to the Docker registry so that you can point BaseSpace to your application, publish the app, and let BaseSpace manage the rest.
To send your Docker image to the public Docker registry, simply type the following:
sudo docker push [docker_username]/[image_name]
For example, to push changes for the basespace/fastqc_demo app:
sudo docker push basespace/fastqc_demo
At the moment, Docker limits the number of aufs layers that can exist in a Docker image. This limit is 42, and a new layer is created on each commit. However, the community has implemented a few workarounds, which are discussed here: https://github.com/dotcloud/docker/issues/1171. One workaround is to flatten the image by exporting a container and importing the result as a new image:
sudo docker export $CONTAINER_ID > image.tar
cat image.tar | sudo docker import - image_flat
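To judge when flattening is worth doing, it can help to count how many layers an image already has. The sketch below assumes that sudo docker history -q IMAGE prints one layer ID per line; the helper name layers_remaining is hypothetical, and the limit of 42 is the figure quoted in this guide.

```shell
# Maximum number of layers, as stated in this guide.
LAYER_LIMIT=42

# Read one layer ID per line on stdin and print how many more layers fit.
layers_remaining() {
    count=$(wc -l | tr -d ' ')
    echo $(( $LAYER_LIMIT - $count ))
}

# Example with three stand-in layer IDs. On the Virtual Machine, pipe in the
# output of: sudo docker history -q IMAGE
printf 'aaa\nbbb\nccc\n' | layers_remaining   # prints 39
```

A result close to zero suggests flattening the image before making further commits.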
Other Docker developers have reported issues when trying to push large commit layers to the Docker registry; this sometimes results in a failure to push the new Docker image. Docker is aware of this and is actively working on a solution.
At the moment, we do not allow outgoing requests or connections from your Docker image. We are working on creating a whitelist of web domains that Native Apps can access, but this is not yet in place.