Motivation
Many bioinformatics workflows involve large datasets that require high-performance computing. Cloud computing lets researchers run computations on a practically unlimited pool of virtual machines, on platforms such as Amazon EC2, Eucalyptus or VirtualBox. CloudBioLinux uses these resources to provide instant access to biological software, programming libraries and data.
CloudBioLinux is a community project and we welcome contributors and feedback. A fully automated infrastructure installs software and data, with packages specified in simple configuration files. Please fork our code on GitHub and suggest improvements and additions.
These resources are designed for biologists as well as programmers. With the help of the NEBC Bio-Linux development team, images include biological software and libraries available in local installations along with a FreeNX desktop environment designed to ease the transition to remote computational analysis.
Resources
- Amazon EC2 (us-east-1): ami-46d4792f -- Ubuntu 12.04 64-bit (27 June 2012)
  - These images work with micro instances, a cost-effective way to explore Amazon resources.
- Cloud BioLinux can be run on a desktop computer without the need for a cloud:
  - Cloud BioLinux 32-bit VirtualBox appliance. Information for end-users on installing VirtualBox and importing Cloud BioLinux images to run within VirtualBox.
- Cloud BioLinux images are available to run on a private Eucalyptus cloud:
  - Cloud BioLinux 32-bit Eucalyptus .img. The .img file can be uploaded to your Eucalyptus cloud by running "uec-publish-image file.img" from the Ubuntu cloud-utils package.
- Indexed genome builds for aligners such as Bowtie, BWA and Novoalign are available in the biodata S3 bucket. An automated script pulls the requested genomes and aligner indexes to an Amazon machine or your local computer; integration with Galaxy is also provided.
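
As a rough illustration of how the Amazon EC2 image listed above might be started from a script, the following Python sketch uses the third-party boto3 library. It is not part of CloudBioLinux. The AMI ID and region come from the list above; the `t1.micro` instance type, the key pair name and the helper names `build_launch_params` and `launch` are assumptions made for this example. An image published in 2012 may no longer be available, so treat this as a template and not a tested recipe.

```python
"""Minimal sketch: launch the CloudBioLinux AMI on a micro instance.

Assumes boto3 is installed and AWS credentials are configured.
Helper names and the key pair name are hypothetical.
"""

AMI_ID = "ami-46d4792f"   # from the Resources list (Ubuntu 12.04 64-bit)
REGION = "us-east-1"      # region given alongside the AMI


def build_launch_params(key_name, instance_type="t1.micro"):
    """Return keyword arguments for the EC2 RunInstances call.

    "t1.micro" is an assumption: it was the micro instance type
    contemporary with this image and may not be offered to every account.
    """
    return {
        "ImageId": AMI_ID,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "KeyName": key_name,
    }


def launch(key_name):
    """Start one instance and return its instance ID (requires network access)."""
    import boto3  # imported here so the parameters can be inspected without boto3

    ec2 = boto3.client("ec2", region_name=REGION)
    response = ec2.run_instances(**build_launch_params(key_name))
    return response["Instances"][0]["InstanceId"]


if __name__ == "__main__":
    # Print the request that would be sent; call launch("my-key") to send it.
    print(build_launch_params("my-key"))
```

Keeping the request parameters in a separate function makes it possible to review or adjust them (for example, to choose a larger instance type) before any instance is started and billed.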
Documentation
- BioCloudCentral: Easily launch CloudBioLinux and CloudMan clusters
- Using Cloud Computing Infrastructure with CloudBioLinux, CloudMan, and Galaxy: Detailed how-tos
- Getting Started with CloudBioLinux: a gentle guide for new users.
- Accessing 1000 Human Genomes data using CloudBioLinux: Part 1: Starting, Part 2: Accessing data.
- Deploying production Galaxy instances using CloudBioLinux and CloudMan: documentation, slides
- Building analysis pipelines with CloudBioLinux and CloudMan
- Build framework: a high level overview of the automated build environment.
- Developer build documentation: details on using and contributing to the framework.