Galaxy Community Conference 2015

UK Conference on the Galaxy Bioinformatics platform.

It is being held at the John Innes Conference Centre on 6-8 July 2015 and is organised by The Sainsbury Laboratory.
Early Bird registration (until May 23rd): £47 – £335, depending upon selected options and status
Normal registration: £65 – £455, depending upon selected options and status

For more information on the conference and training, visit the GCC website.

The Conference features TWO training days with topics including:

Introduction to Galaxy

New to Galaxy? This session will introduce you to the Galaxy Project and the Galaxy Community, and walk you through a simple use case demonstrating what Galaxy can do. It is recommended for anyone who has not used, or only rarely uses, Galaxy.


  • Little or no knowledge of Galaxy

Finding causative mutations in genomes with a Candidate SNP approach

Mapping mutations by position, either using classical methods or whole genome high-throughput sequencing (HTS), largely relies on the analysis of genome-wide polymorphisms in F2 recombinant populations.

We will study high-throughput genomic sequence from back- and out-crossed bulks of plants to identify a genetic mutation caused by EMS mutagenesis of bulk segregants. The workflow, demonstrated and then implemented by the attendees, will QC paired Illumina reads, align them against the Arabidopsis reference genome using BWA, generate a BAM file, identify SNPs using SAMtools, and separate SNPs by allele frequency. We will then use SnpEff to annotate SNPs by their effect and location in genes, and generate plots that allow us to compare the relative densities of SNP classes across the genome and reveal candidate positions of the causative mutation.
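The "separate SNPs by allele frequency" step can be pictured with a small Python sketch. This is only an illustration, not the workshop's workflow: the thresholds, the function names, and the tuple layout of a SNP record are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# A SNP record for this sketch: (chromosome, position, ref_reads, alt_reads).
Snp = Tuple[str, int, int, int]

# Hypothetical cut-offs, chosen for illustration only.
HIGH_MIN = 0.8
LOW_MAX = 0.2


def allele_frequency(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads supporting the alternate allele (0.0 if no coverage)."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else 0.0


def split_by_frequency(snps: List[Snp]) -> Dict[str, List[Snp]]:
    """Sort SNPs into high / intermediate / low allele-frequency classes."""
    bins: Dict[str, List[Snp]] = {"high": [], "intermediate": [], "low": []}
    for snp in snps:
        _chrom, _pos, ref_reads, alt_reads = snp
        af = allele_frequency(ref_reads, alt_reads)
        if af >= HIGH_MIN:
            bins["high"].append(snp)
        elif af <= LOW_MAX:
            bins["low"].append(snp)
        else:
            bins["intermediate"].append(snp)
    return bins
```

In a bulk-segregant design, positions where the alternate allele approaches fixation in the mutant bulk are the natural candidates, which is why the high-frequency class is the one inspected first.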


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.
  2. Basic understanding of genetics.

RNA-Seq Analysis with Galaxy

This hands-on workshop will demonstrate basic RNA-Seq analysis pipelines including quality control, alignment, and differential expression analysis in Galaxy.

Sample datasets small enough to be successfully processed during the course of the seminar will be provided. Participants will perform the analyses themselves on the provided cloud instance of Galaxy.


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.

Advanced Workflows and Variables

This workshop will teach participants what they need to know to create their own publication- and/or production-quality Galaxy Workflows.

  1. Basic and Advanced Workflow Editor functions.
  2. Demystifying the magic variables defined by the Workflow engine, with a special emphasis on how to track data inputs and outputs: using labels inherited from existing datasets, prompting for user-defined labels, and/or creating custom labels (or portions of labels) within the Workflow itself.
  3. Hands-on examples for batch processing, including how to execute using multiple input streams or Dataset Collections.
  4. Tips for preparing a Workflow so it may be used effectively by others: annotation options, run-time parameter changes, and proper input selection.
  5. Best Practices for Sharing or Publishing a Workflow on a Galaxy instance, be it stand-alone or embedded within a Page.


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.
  2. A wi-fi enabled laptop with a modern web browser. Google Chrome, Firefox and Safari will work best.

Visualisation of NGS Data

This workshop will cover visualisation of both primary NGS analyses (alignments, variants, annotations) and downstream options such as heat maps, charts, and graphs.


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.
  2. A wi-fi enabled laptop with a modern web browser. Google Chrome, Firefox and Safari will work best.

Metagenomic Analysis with Galaxy

Have a metagenomic sample and want to analyze it? This hands-on workshop will demonstrate how to use Galaxy to analyze metagenomic samples. Participants will perform the analyses themselves on the provided instance of Galaxy. The tools covered will include Mothur for the annotation of 16S community profiling datasets, tools from the Huttenhower Lab for annotating WGS data, MGTAXAtools, and R tools for the differential abundance analysis of metagenomic datasets.


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.
  2. Background knowledge of metagenomic sequencing goals, approaches and analysis techniques (amounts to reading a couple of review papers that will be recommended in advance of the conference)
  3. Basic knowledge of statistics (for deeper understanding of some of the tools)

Mass Spectrometry-based Proteomics Data Analysis using Galaxy-P

This hands-on workshop will take participants through the essential steps for using Galaxy for the analysis of mass spectrometry (MS)-based proteomics data, focusing on protein identification from large-scale datasets. After a short introduction to the basics of MS-based proteomics data types and the concepts that underlie protein identification from these data, the workshop will be organized around three integrated modules, presented in this order:

  1. Basic proteomic workflows for protein identification
    • Attendees will be taken on a tour of the MS-based proteomics tools available in the Tool Shed. Using some of these tools, attendees will learn methods for protein sequence database construction and manipulation, the Galaxy-based tools available for sequence database searching, the data types they output, and tools for collating results.
  2. Advanced proteomic workflows
    • Building on knowledge gained in module 1, attendees will learn about advanced applications in protein identification, focusing on applications that integrate genomic/transcriptomic data with proteomics data. Attendees will learn methods to construct protein databases from RNA-seq data, and downstream tools designed to evaluate the quality of protein identifications matching to genomic/transcriptomic-derived protein sequences.
  3. Visualization and interpretation of results
    • Attendees will gain exposure to the mechanics of visualization in Galaxy and to a variety of tools for visualizing the protein identifications output by upstream workflows. These include tools for data quality control. Visualization tools for interpreting results from proteogenomics applications, via mapping of identified peptides to reference genomes, will also be demonstrated.

At the end of the workshop, attendees will have a working knowledge of the MS-based proteomics tools in the Tool Shed and experience in setting up basic workflows for protein identification, as well as more advanced applications in proteogenomics. They will also gain an understanding of the tools available for visualizing and interpreting results. Participants will be given temporary accounts on a local Galaxy instance at the University of Minnesota to take part in the hands-on workshop activities.


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.

Scripting Galaxy using the API and BioBlend

Galaxy has a growing API that allows external programs to upload and download data, manage histories and datasets, run tools and workflows, and even perform admin tasks. This session will cover programmatic access to the API, either by direct REST web requests or by using the BioBlend Python library.
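As a rough idea of what a direct REST request looks like, the Python sketch below builds a URL for the `histories` endpoint and fetches it using only the standard library. The server address and API key in the usage note are placeholders, the helper names are invented for this example, and passing the key as a `key` query parameter is just one of the ways Galaxy accepts it.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def build_api_url(base_url: str, endpoint: str, api_key: str) -> str:
    """Return the URL for a Galaxy API endpoint, with the key as a query parameter."""
    base = base_url.rstrip("/") + "/"
    path = "api/" + endpoint.strip("/")
    return urljoin(base, path) + "?" + urlencode({"key": api_key})


def list_histories(base_url: str, api_key: str) -> list:
    """Fetch the current user's histories as parsed JSON (needs a reachable server)."""
    with urlopen(build_api_url(base_url, "histories", api_key)) as response:
        return json.load(response)
```

Calling `list_histories("https://galaxy.example.org", "YOUR_API_KEY")` against a real server would return a list of history records. With BioBlend the equivalent is roughly `GalaxyInstance(url=..., key=...)` followed by `gi.histories.get_histories()`, which hides the URL handling shown here.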


  1. Basic understanding of Galaxy from a developer point of view.
  2. Python programming.

Setting up a Galaxy instance as a service

The premise: You are given the task to set up a Galaxy instance for others (i.e. as a core service in your institute) and you are not really familiar with Galaxy.

In this workshop, you will learn what is important when you set up a Galaxy server from scratch, which pitfalls you might run into, how to interact with the potential users of the service you are going to offer, and how to make sure the Galaxy instance you have set up is actually used in the end. After a general introduction, several Galaxy installations will be presented. The session will finish with a panel discussion of questions from the workshop participants.


  1. Basic knowledge of the Unix/Linux command line interface
  2. Familiarity with the bioinformatics problems (and their solutions) that wet lab scientists run into.

Galaxy Interactive Environments

In this session you will get an introduction to Interactive Environments (IEs) as an easy and powerful way to integrate arbitrary interactive web services into Galaxy. We will demonstrate the IPython Galaxy project and the general concept of IEs. Moreover, we will create an IE on the fly to get you started!


  1. Basic understanding of Galaxy from a developer point of view.

Running Galaxy on Docker and StarCluster

Two different methods of running Galaxy will be covered:

  1. As a Docker container: here we will cover the fundamentals of Docker containers and why you would want to use them for running your pipeline. After the overview we will have a hands-on session running the Galaxy Docker image and the deepTools pipeline.
  2. Managing Galaxy using StarCluster: The STAR (Software Tools for Academics and Researchers) program at MIT provides a command-line tool called StarCluster. This tool has a number of subcommands, which can be used to create, manage, log in to, stop, and destroy clusters of one or more VM instances on EC2. Although StarCluster does not natively support Galaxy (yet), it provides a convenient command-line toolchain for managing EC2 AMIs (which could be the CloudMan instances running Galaxy servers). The real utility of StarCluster comes when doing development on the Galaxy ToolShed, a workflow we will demonstrate as part of the hands-on session.


  1. Python
  2. Linux Shell Scripting

Introduction to Writing Galaxy Tools and Publishing in Galaxy ToolShed

This tutorial will teach developers and bioinformaticians how to take a working script or application and turn it into a Galaxy tool. It will cover the basics of wrapping, common parameters, tool linting, best practices, loading tools into Galaxy, adding citations, and publishing tools to GitHub and the Galaxy Tool Shed. Common tips and tricks will be discussed, as well as insights from some of the best tool developers out there.


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.
  2. Familiarity with Unix command line and text editors

Test-Driven Development of Galaxy Tools with Planemo & Advanced Topics in Tool Creation

This tutorial is aimed at people with some experience developing tools and will cover more advanced topics in tool development, more complex tools, and recent enhancements to the Galaxy tool development process including:

  • Using Planemo, a new command-line application to aid Galaxy tool development, to develop Galaxy tools using a test driven development methodology.
  • Designing tools for use with dataset collections.
  • Publishing complex tools to the Galaxy Tool Shed.
  • Maintaining Galaxy Tools.


  1. Basic knowledge of Galaxy Tools, or attendance at the “Introduction to Writing Galaxy Tools and Publishing in Galaxy ToolShed” session.

The Galaxy Database Schema

When running a production Galaxy server, you sometimes end up in a situation where you need to interact with the database manually, e.g. to change the state of a job to ‘error’. This is always a very risky adventure. Or, in a not-at-all risky situation, you want to extract usage information that cannot be gathered using the existing report tools. In both cases, you need a good understanding of the Galaxy database schema.

Learn some of the design concepts of the database, which parts of the schema are stable, and which will be changing in the foreseeable future.
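The two situations described above, a risky manual state change and a harmless read-only usage query, can be sketched in Python against an in-memory SQLite database. The `job` table below is a deliberately simplified stand-in invented for this example; it is not the actual Galaxy schema, and a production server would normally use a different database engine.

```python
import sqlite3


def demo_job_queries():
    """Run a state change and a usage query on a toy, hypothetical `job` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job (id INTEGER PRIMARY KEY, tool_id TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO job (id, tool_id, state) VALUES (?, ?, ?)",
        [(1, "bwa", "ok"), (2, "bwa", "running"), (3, "samtools", "ok")],
    )

    # Risky case: manually mark one job as failed. The extra state check and
    # the transaction (commit on success, rollback on error) limit the damage.
    with conn:
        conn.execute(
            "UPDATE job SET state = 'error' WHERE id = ? AND state = 'running'",
            (2,),
        )

    # Safe case: a read-only usage report, counting jobs per tool.
    usage = conn.execute(
        "SELECT tool_id, COUNT(*) FROM job GROUP BY tool_id ORDER BY tool_id"
    ).fetchall()
    state = conn.execute("SELECT state FROM job WHERE id = 2").fetchone()[0]
    conn.close()
    return state, usage
```

On a real server, taking a database backup and testing the statement on a copy before running any update is the sensible precaution.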


  1. Experience maintaining a production Galaxy server (recommended)
  2. Basic knowledge of relational databases and SQL statements

Galaxy Architecture

Want to know the big picture about what is going on inside Galaxy? This workshop will introduce participants to the high-level architecture of Galaxy internals, and to the project’s coding practices and standards.


  1. General knowledge of Galaxy, or attendance at the “Introduction to Galaxy” session.
  2. Knowledge of programming or a scripting language.