Overview

DNA sequence analysis generates large volumes of data that present challenging bioinformatic and statistical problems. This tutorial introduces established and new Bioconductor packages and workflows for analyzing sequence data. The Bioconductor project is a widely used collection of nearly 1000 R packages for high-throughput genomic analysis. The tutorial covers approaches for efficiently manipulating sequences and alignments and other common workflows, along with the unique statistical challenges associated with RNA-seq, variant annotation, and other experiments. The emphasis is on exploratory analysis and the analysis of designed experiments. The workshop will touch on the Biostrings, ShortRead, GenomicRanges, DESeq2, VariantAnnotation, and other packages, with short exercises to illustrate the functionality of each package.

Goals

1. Gain overall familiarity with Bioconductor packages for high-throughput sequence analysis, including Bioconductor vignettes and classes.
2. Obtain experience running bioinformatic workflows for data quality assessment, RNA-seq differential expression, and manipulation of variant call format (VCF) files.
3. Appreciate the importance of ranges and range-based manipulation for modern genomic analysis.
4. Learn 'best practices' for working with large data.

Outline

Prerequisites

The workshop assumes an intermediate level of familiarity with R and a basic understanding of the biological and technological aspects of high-throughput sequence analysis. Participants should bring a modern wireless-enabled laptop with a web browser installed.

Intended Audience

This workshop is for professional bioinformaticians and statisticians intending to use R/Bioconductor for analysis and comprehension of high-throughput sequence data.

Reference

Huber et al. (2015) Orchestrating high-throughput genomic analysis with Bioconductor. Nature Methods 12(2):115-121.

Program

 9:35am - 10:00am  Registration to enter building
10:00am - 10:15am  Welcome (Course Overview & AMI set-up)
10:20am - 11:00am  Packages and classes
11:05am - 12:00pm  Exploring sequences and alignments
12:00pm -  1:00pm  Lunch, Divercity (15-minute walk from AIST)
 1:00pm -  2:00pm  RNA-seq: a high-level tour
 2:05pm -  3:00pm  Annotating variants
 3:00pm -  3:30pm  Coffee break, Espresso Americano (Telecom Center, 5-minute walk from AIST)
 3:30pm -  4:00pm  Bring your own data
 4:05pm -  4:15pm  Feedback

Speaker Biography

Dr. Morgan is the current leader of the Bioconductor software project for the analysis of high-throughput genomic data. Diverse topics addressed by Dr. Morgan's group include the design and analysis of sequence and microarray experiments, quality metrics for sequence-related data, efficient representation and manipulation of sequence data, and forward-looking approaches to the representation, analysis, and use of whole-genome sequences.

Registration

Registration is handled as part of the GIW/InCoB 2015 registration process (https://perdana.apbionet.org/giw-incob-2015/).
Alternatively, email gustin.s AT wehi.edu.au if you will be attending.