Segway semi-automated genomic annotation

Hoffman MM, Buske OJ, Wang J, Weng Z, Bilmes J, Noble WS. 2012. Unsupervised pattern discovery in human chromatin structure through genomic segmentation. Nat Methods 9:473–476. doi:10.1038/nmeth.1937. PubMed Central (free version): PMC3340533
Hoffman MM*, Ernst J*, Wilder SP, Kundaje A, Harris RS, Libbrecht M, Giardine B, Ellenbogen PM, Bilmes JA, Birney E, Hardison RC, Dunham I, Kellis M, Noble WS. 2012. Integrative annotation of chromatin elements from ENCODE data. Nucleic Acids Res 41:827–841.

The free Segway software package contains a novel method for analyzing multiple tracks of functional genomics data. Our method uses a dynamic Bayesian network (DBN) model, which enables it to analyze the entire genome at 1-bp resolution even in the face of heterogeneous patterns of missing data. This method is the first application of DBN techniques to genome-scale data and the first genomic segmentation method designed for use with the maximum resolution data available from ChIP-seq experiments without downsampling. Our software has extensive documentation and was designed from the outset with external users in mind. Researchers at other universities and institutes have already installed and used Segway for their own projects.

Segmentations

Human chromatin structure

View the segmentation from our Nature Methods paper, "Unsupervised pattern discovery in human chromatin structure through genomic segmentation," in the UCSC Genome Browser or in Ensembl.

UCSC Genome Browser

The Ensembl Regulatory Build for GRCh38 (hg38) can be loaded through the Track Data Hub interface. "Ensembl Regulatory Build" is listed in the Public Hubs directory. After loading the track hub, you can show the "Cell Type Segmentations" supertrack, which contains a Segway track for each of 18 cell types.

Segmentations for older assemblies can be browsed below:

Here is a brief description of the various classes of segment labels.

Download the segmentation for further analysis: NCBI36 (hg18) or GRCh37 (hg19) (~165 MB, gzipped BED). Here are the mnemonic assignments (tab-delimited).
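As an illustration of how the downloaded files might be combined, the following minimal Python sketch reads a gzipped BED segmentation and replaces each segment label with its mnemonic. It assumes the mnemonic file has a header row with "old" and "new" columns and that the segment label is the fourth BED column; the function names and the tiny in-memory example data are hypothetical stand-ins, not part of Segway.

```python
import csv
import gzip
import io


def read_mnemonics(handle):
    """Map original label -> mnemonic.

    Assumes a tab-delimited file with a header row containing
    'old' and 'new' columns (an assumption about the file layout).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["old"]: row["new"] for row in reader}


def relabel_bed(handle, mnemonics):
    """Yield (chrom, start, end, label) with labels replaced by mnemonics.

    Skips blank lines, comments, and 'track'/'browser' header lines.
    Labels without a mnemonic are passed through unchanged.
    """
    for line in handle:
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end, label = fields[0], int(fields[1]), int(fields[2]), fields[3]
        yield chrom, start, end, mnemonics.get(label, label)


# Made-up example data standing in for the real downloads.
mnemonic_text = "old\tnew\n0\tTSS\n1\tD\n"
bed_bytes = gzip.compress(
    b"track name=demo\nchr1\t0\t100\t0\nchr1\t100\t250\t1\n"
)

mnemonics = read_mnemonics(io.StringIO(mnemonic_text))
# gzip.open accepts a path or an existing file object; "rt" decodes to text.
with gzip.open(io.BytesIO(bed_bytes), "rt") as bed:
    segments = list(relabel_bed(bed, mnemonics))
```

With real files, the two in-memory objects would be replaced by the paths of the downloaded segmentation and mnemonic assignment files.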

Ensembl

The segmentation can be displayed by clicking the "Configure this page" option on the left navigation bar. The segmentations for each cell line can be selected under "Regulatory Features" and under the heading of "Enable/disable all Segmentation features". As an example you can try viewing the segmentations for BRCA2 in hg38.

For more details and instructions see the description of Regulatory Segmentation.

Integrative annotation of chromatin elements

View the segmentation from our Nucleic Acids Research paper, "Integrative annotation of chromatin elements from ENCODE data," in the UCSC Genome Browser: hg19 only. These segmentations are already relabeled so it is not necessary to use a mnemonic assignment file.

Segmentation downloads (hg19)

Documentation

Read the documentation, which begins with a quick start. The documentation is also available as a PDF.

Installation

For installation instructions, see the "Quick Start" section of the documentation here. For more detailed installation instructions, read the guide here.

Segway requires the use of a cluster management system. Currently, we support Sun Grid Engine/Oracle Grid Engine/Open Grid Scheduler and Platform LSF. If you would like to use Segway on another system, please open a ticket in the issue tracker. You can also run Segway on SGE via the Amazon EC2 compute cloud.

Segway is supported only on Linux. Other operating systems, such as Mac OS X, are not supported.

Support

For support with Segway, please write to the segway-users mailing list rather than writing to the authors directly. Using the mailing list will get your question answered more quickly. It also allows us to pool knowledge and avoid answering the same inquiries repeatedly. Questions sent to the mailing list will receive a higher priority than those sent to us individually.

If you want to report a bug or request a feature, please do so using the Segway issue tracker. We are interested in all comments on the package, including the ease of installation and the quality of the documentation.

If you do not want to read discussions about other people's use of Segway, but would like to hear about new releases and other important information, please subscribe to the segway-announce mailing list. Announcements of this nature are sent to both segway-users and segway-announce.

Useful links

Running Segway in the Amazon Compute Cloud by Jay Hesselberth, University of Colorado Denver

Source code

Version 1.2

Notes on the segmentation

The underlying signal data for the segmentation presented above is available in bedGraph and bigWig formats (NCBI36/hg18). Use this browser file to load all the bigWigs. We produced these signal files using Wiggler from original data available from the ENCODE DCC.

We produced the original segmentations for NCBI36. We used liftOver (minMatch=0.99) to convert segmentations to GRCh37, and then filtered out any overlapping regions.
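The overlap filtering step can be sketched in Python as follows. This is one plausible reading of "filtered out any overlapping regions", not the authors' actual procedure: it assumes half-open BED coordinates and drops every segment that overlaps another segment on the same chromosome. The function name `drop_overlapping` and the example coordinates are hypothetical.

```python
def drop_overlapping(segments):
    """Return the segments that overlap no other segment on the same chromosome.

    `segments` is an iterable of (chrom, start, end, label) tuples with
    half-open BED coordinates, so segments that merely touch
    (end == next start) are not considered overlapping.
    """
    ordered = sorted(segments, key=lambda s: (s[0], s[1], s[2]))
    overlapping = [False] * len(ordered)

    prev_chrom = None
    max_end = -1  # furthest end among earlier segments on this chromosome
    max_idx = -1  # index of the segment that reaches max_end

    for i, (chrom, start, end, *_rest) in enumerate(ordered):
        if chrom != prev_chrom:
            # First segment on a new chromosome: reset the sweep state.
            prev_chrom, max_end, max_idx = chrom, end, i
            continue
        if start < max_end:
            # This segment starts before an earlier one has ended.
            overlapping[i] = True
            overlapping[max_idx] = True
        if end > max_end:
            max_end, max_idx = end, i

    return [seg for seg, bad in zip(ordered, overlapping) if not bad]


# Made-up lifted-over segments: the first two collide after conversion.
lifted = [
    ("chr1", 0, 10, "a"),
    ("chr1", 5, 15, "b"),
    ("chr1", 15, 20, "c"),
    ("chr2", 0, 5, "d"),
]
kept = drop_overlapping(lifted)
```

Dropping both members of an overlapping pair, rather than keeping one, avoids having to choose arbitrarily between conflicting labels; whether the original pipeline made the same choice is not stated above.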
