ENCODE2 Paper Collection

Methodologies in ENCODE 2

 * Chromatin segmentation
 * HMM -- ChromHMM
 * Methodology Article
 * ChromHMM: automating chromatin-state discovery and characterization
 * Application Article
 * Discovery and characterization of chromatin states for systematic annotation of the human genome
 * Mapping and analysis of chromatin state dynamics in nine human cell types
 * DBN -- SegWay
 * Methodology Article
 * Unsupervised pattern discovery in human chromatin structure through genomic segmentation
 * Application Article
 * Integrative annotation of chromatin elements from ENCODE data
 * Self-organizing Map -- SOM
 * Methodology Article
 * Section H in main paper supplement (online resource)
 * Section 6.2 in DNase paper supplement
 * Application Article
 * Main paper: An integrated encyclopedia of DNA elements in the human genome
 * DNase paper: The accessible chromatin landscape of the human genome
 * Regression
 * Support-vector Regression -- SVR


 * Network
 * Architecture of the human regulatory network derived from ENCODE data.
 * Circuitry and dynamics of human transcription factor regulatory networks.
 * Superfamilies of evolved and designed networks.
 * Systematic Localization of Common Disease-Associated Variation in Regulatory DNA.


 * Motif Discovery and related
 * An expansive human regulatory lexicon encoded in transcription factor footprints.
 * Annotation of functional variation in personal genomes using RegulomeDB.
 * CENTIPEDE: Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data.
 * Analysis of variation at transcription factor binding sites in Drosophila and humans.
 * Functional analysis of transcription factor binding sites in human promoters.
 * Predicting cell-type–specific gene expression from regions of open chromatin.
 * STAMP: A web tool for exploring DNA-binding motif similarities.
 * <MEME> http://en.wikipedia.org/wiki/Multiple_EM_for_Motif_Elicitation
 * A highly integrated and complex PPARGC1A transcription factor binding network in HepG2 cells.
 *  Quantifying similarity between motifs.
 *  Predicting transcription factor binding sites using local over-representation and comparative genomics.
 * Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors.
 * MEME-ChIP: Motif analysis of large DNA datasets.
 *  MEME SUITE: Tools for motif discovery and searching.
 * Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements.
 * CAGT: ENCODE portal for Clustered Aggregation plots of functional marks at regulatory elements.
 * Spark: A navigational paradigm for genomic data exploration.
 * <ChIPMotifs> http://motif.bmi.ohio-state.edu/ChIPMotifs/
 * Uncovering transcription factor modules using one-dimensional and three-dimensional analyses.
 * CisModule: de novo discovery of cis-regulatory modules by hierarchical mixture modeling.
 * A computational genomics approach to identify cis-regulatory modules from chromatin immunoprecipitation microarray data–a case study using E2F1.
 *  Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities.
 * The Hypersensitive Glucocorticoid Response Specifically Regulates Period 1 and Expression of Circadian Genes.
 * BioOptimizer: a Bayesian scoring function approach to motif discovery.
 * Mapping and quantifying mammalian transcriptomes by RNA-Seq.
 * Circuitry and dynamics of human transcription factor regulatory networks.


 * Gene and expression-level prediction
 * Modeling gene expression using chromatin features in various cellular contexts.
 * Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome.
 * GENCODE: The reference human genome annotation for the ENCODE project.
 * Long noncoding RNAs are rarely translated in two human cell lines.
 *  http://sammeth.net/confluence/display/FLUX/1+-+Introduction
 * Widespread plasticity in CTCF occupancy linked to DNA methylation.
 *  Differential expression analysis for sequence count data.
 * An Atlas of the Epstein-Barr Virus Transcriptome and Epigenome Reveals Host-Virus Regulatory Interactions.
 * Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages.
 *  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.


 * Proteogenomic mapping
 * Long noncoding RNAs are rarely translated in two human cell lines.
 * <Peppy> http://geneffects.com/Peppy
 * <Genome Fingerprint Scanning> http://en.wikipedia.org/wiki/Genome-based_peptide_fingerprint_scanning
 * Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function.
 * Qscore: an algorithm for evaluating SEQUEST database search results. J Am Soc Mass Spectrom.
 * TANDEM: matching proteins with tandem mass spectra. Bioinformatics.


 * Random Forest
 * Modeling gene expression using chromatin features in various cellular contexts.
 * Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription related factors.
 * Long noncoding RNAs are rarely translated in two human cell lines.
 * <RuleFit3 package> http://statweb.stanford.edu/~jhf/r-rulefit/rulefit3/R_RuleFit3.html


 * Support vector classifiers
 * Sequence and chromatin determinants of cell-type–specific transcription factor binding.


 * Fast Fourier transform
 * Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors.


 * Sparse logistic regression classifier
 * Predicting cell-type–specific gene expression from regions of open chromatin.


 * Poisson regression
 * The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression.
 * Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages.
 * Mixture Poisson Regression Model


 * Probabilistic modeling
 * Annotation of functional variation in personal genomes using RegulomeDB.
 * CENTIPEDE: Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data.
 * Uncovering transcription factor modules using one-dimensional and three-dimensional analyses.
 * Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture.


 * Multidimensional Scaling Approach (MDS) and Principal Components Analysis (PCA)
 * The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. <MDS>
 * Widespread plasticity in CTCF occupancy linked to DNA methylation. <PCA>


 * Clustering
 * Widespread plasticity in CTCF occupancy linked to DNA methylation.
 * Pvclust: An R package for assessing the uncertainty in hierarchical clustering.
 * Circuitry and dynamics of human transcription factor regulatory networks.
 * <Ward clustering> Hierarchical Grouping to Optimize an Objective Function.


 * Gene set analysis
 * Personal and population genomics of human regulatory variation.
 * <WebGestalt> http://bioinfo.vanderbilt.edu/webgestalt/
 * RNA editing in the human ENCODE RNA-seq data.
 * ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data.
 * <GREAT> GREAT improves functional interpretation of cis-regulatory regions.


 * Alignment tools
 * Predicting cell-type–specific gene expression from regions of open chromatin.
 * <F-seq>
 * <BWA> Fast and accurate short read alignment with Burrows-Wheeler transform.
 * The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression.
 * <T-Coffee> http://tcoffee.vital-it.ch/apps/tcoffee/index.html
 * Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements.
 * <DNAnexus probabilistic mapper> https://dnanexus.com/
 * Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages.
 * <BWA> Fast and accurate short read alignment with Burrows-Wheeler transform.
 * The Hypersensitive Glucocorticoid Response Specifically Regulates Period 1 and Expression of Circadian Genes.
 * <Bowtie>


 * Peak calling
 * Sequence and chromatin determinants of cell-type–specific transcription factor binding.
 * <SPP> Design and analysis of ChIP-seq experiments for DNA-binding proteins.
 * Widespread plasticity in CTCF occupancy linked to DNA methylation.
 * <SPP>
 * <Hotspot> Chromatin accessibility pre-determines glucocorticoid receptor binding patterns.
 * ChIP-seq guidelines and practices used by the ENCODE and modENCODE consortia.
 * <PeakSeq> PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
 * <MACS> Model-based analysis of ChIP-Seq (MACS).
 * A highly integrated and complex PPARGC1A transcription factor binding network in HepG2 cells.
 * <PeakSeq> PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
 * Uncovering transcription factor modules using one-dimensional and three-dimensional analyses.
 * <MACS>
 * <PeakSeq>
 * <QuEST>
 * <SISSRs> Site identification from short sequence reads.
 * <Sole-Search>
 * W-ChIPeaks: a comprehensive web application tool for processing ChIP-chip and ChIP-seq data.
 * Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages.
 * <W-ChIPeaks>
 * <BALM> High Resolution Detection and Analysis of CpG Dinucleotides Methylation Using MBD-Seq Technology.
 * The Hypersensitive Glucocorticoid Response Specifically Regulates Period 1 and Expression of Circadian Genes.
 * <QuEST> Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data.
 * Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions.
 * <SPP>


 * Microarray related
 * A highly integrated and complex PPARGC1A transcription factor binding network in HepG2 cells.
 * Predicting transcription factor binding sites using local over-representation and comparative genomics.


 * GWAS and Epigenetics
 * Genome-wide studies of CTCF and cohesin provide insight into chromatin structure and regulation.
 * Genome-wide Epigenetic Data Facilitate Understanding of Disease Susceptibility Association Studies.
 * Systematic Localization of Common Disease-Associated Variation in Regulatory DNA.


 * Data Exploration and Visualization Tools
 * Spark: A navigational paradigm for genomic data exploration.
 * Integrative annotation of chromatin elements from ENCODE data.
 * Cistrome: an integrative platform for transcriptional regulation studies.
 * Uncovering transcription factor modules using one-dimensional and three-dimensional analyses.
 * Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages.
 * Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions.
 * <Reactome> http://www.reactome.org/

Papers in Nature
An integrated encyclopedia of DNA elements in the human genome. The ENCODE Project Consortium. Nature (6 September 2012)

Landscape of transcription in human cells. Djebali, S., Davis, C.A. et al. Nature (6 September 2012)

The accessible chromatin landscape of the human genome. Thurman, R.E., Rynes, E., Humbert, R. et al. Nature (6 September 2012)

An expansive human regulatory lexicon encoded in transcription factor footprints. Neph, S., Vierstra, J., Stergachis, A.B., Reynolds, A.P. et al. Nature (6 September 2012)

Architecture of the human regulatory network derived from ENCODE data. Gerstein, M.B., Kundaje, A., Hariharan, M., Landt, S.G., Yan, K.K. et al. Nature (6 September 2012)

The long-range interaction landscape of gene promoters. Sanyal, A., Lajoie, B.R. et al. Nature (6 September 2012)

Companion Papers
Analysis of variation at transcription factor binding sites in Drosophila and humans. Spivakov, M. et al. Genome Biol. (6 September 2012)

Annotation of functional variation in personal genomes using RegulomeDB. Boyle, A.P. et al. Genome Res. (6 September 2012)

Cell type-specific binding patterns reveal that TCF7L2 can be tethered to the genome by association with GATA3. Frietze, S. et al. Genome Biol. (6 September 2012)

ChIP-seq guidelines and practices used by the ENCODE and modENCODE consortia. Landt, S.G. et al. Genome Res. (6 September 2012)

Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription related factors. Yip, K.Y. et al. Genome Biol. (6 September 2012)

Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome. Howald, C. et al. Genome Res. (6 September 2012)

Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Tilgner, H. et al. Genome Res. (6 September 2012)

Discovery of hundreds of mirtrons in mouse and human small RNA data. Ladewig, E. et al. Genome Res. (6 September 2012)

Functional analysis of transcription factor binding sites in human promoters. Whitfield, T.W. et al. Genome Biol. (6 September 2012)

GENCODE: The reference human genome annotation for the ENCODE project. Harrow, J. et al. Genome Res. (6 September 2012)

Linking disease associations with regulatory information in the human genome. Schaub, M.A. et al. Genome Res. (6 September 2012)

Long noncoding RNAs are rarely translated in two human cell lines. Bánfai, B. et al. Genome Res. (6 September 2012)

Modeling gene expression using chromatin features in various cellular contexts. Dong, X. et al. Genome Biol. (6 September 2012)

Personal and population genomics of human regulatory variation. Vernot, B. et al. Genome Res. (6 September 2012)

Predicting cell-type–specific gene expression from regions of open chromatin. Natarajan, A. et al. Genome Res. (6 September 2012)

RNA editing in the human ENCODE RNA-seq data. Park, E. et al. Genome Res. (6 September 2012)

Sequence and chromatin determinants of cell-type–specific transcription factor binding. Arvey, A. et al. Genome Res. (6 September 2012)

Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Wang, J. et al. Genome Res. (6 September 2012)

Simultaneous SNP identification and assessment of allele-specific bias from ChIP-seq data. Ni et al. BMC Gen. (6 September 2012)

The GENCODE pseudogene resource. Pei, B. et al. Genome Biol. (6 September 2012)

The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Derrien, T. et al. Genome Res. (6 September 2012)

Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements. Kundaje, A. et al. Genome Res. (6 September 2012)

Understanding transcriptional regulation by integrative analysis of transcription factor binding data. Cheng, C. et al. Genome Res. (6 September 2012)

Widespread plasticity in CTCF occupancy linked to DNA methylation. Wang, H. et al. Genome Res. (6 September 2012)

A highly integrated and complex PPARGC1A transcription factor binding network in HepG2 cells. Charos, A.E. et al. Genome Res. (6 September 2012)

Additional Research Papers
Spark: A navigational paradigm for genomic data exploration. Nielsen, C.B. et al. Genome Res. 22, 2262-2269 (2012)

Integrative annotation of chromatin elements from ENCODE data. Hoffman, M.M. et al. Nucleic Acids Res. 41 (2), 827-841 (2012)

Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function. Ezkurdia, I. et al. Mol. Biol. Evol. 29, 2265-2283 (2012)

An Atlas of the Epstein-Barr Virus Transcriptome and Epigenome Reveals Host-Virus Regulatory Interactions. Arvey, A. et al. Cell Host Microbe 12, 233-245 (2012)

The Chromatin Fingerprint of Gene Enhancer Elements. Zentner, G.E. & Scacheri, P.C. J. Biol. Chem. 287, 30888-30896 (2012)

Genome-wide studies of CTCF and cohesin provide insight into chromatin structure and regulation. Lee, B.K. & Iyer, V.R. J. Biol. Chem. 287, 30906-30913 (2012)

Transcription factor mediated epigenetic reprogramming. Sindhu, C. et al. J. Biol. Chem. 287, 30922-30931 (2012)

SWI/SNF Chromatin Remodeling Factors: Multiscale Analyses and Diverse Functions. Euskirchen, G. et al. J. Biol. Chem. 287, 30897-30905 (2012)

Uncovering transcription factor modules using one-dimensional and three-dimensional analyses. Lan, X. et al. J. Biol. Chem. 287, 30914-30921 (2012)

Thematic Minireview Series on Results from the ENCODE Project: Integrative Global Analyses of Regulatory Regions in the Human Genome. Farnham, P.J. J. Biol. Chem. 287, 30885-30887 (2012)

Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages. Lan, X. et al. Nucleic Acids Res. 40, 7690-7704 (2012)

Genome-wide Epigenetic Data Facilitate Understanding of Disease Susceptibility Association Studies. Hardison, R.C. J. Biol. Chem. 287, 30932-30940 (2012)

The Hypersensitive Glucocorticoid Response Specifically Regulates Period 1 and Expression of Circadian Genes. Reddy, T.E. et al. Mol. Cell. Biol. 32, 3756-3767 (2012)

Circuitry and dynamics of human transcription factor regulatory networks. Neph, S. et al. Cell 150, 1274-1286 (2012)

Evidence of abundant purifying selection in humans for recently acquired regulatory functions. Ward, L.D. & Kellis, M. Science 337, 1675-1678 (2012)

Systematic localization of common disease-associated variation in regulatory DNA. Maurano, M.T. et al. Science 337, 1190-1195 (2012)