A Journal Devoted To All Areas Of Applied Statistics
 
 
  Annals of Applied Statistics
Replication data for: Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species
hdl:1902.1/10647
Version: 1 – Released: Wed Nov 28 00:00:00 EST 2007
 
If you use these data, please add the following citation to your scholarly references.
Original Publication
Results found in this publication can be replicated using these data.
Qing Zhou and Wing Hung Wong. 2007. "Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species." Ann. Appl. Statist. Volume 1, Number 1 (2007), 36-65.
Data Citation Details
Study Global ID: hdl:1902.1/10647
Authors: Qing Zhou (UCLA); Wing Hung Wong (Stanford University)
Production Date: 2007
Distributor: Institute for Mathematical Statistics
Distribution Date: 2007
Deposit Date: October 01, 2007
Provenance
Abstract and Scope
Abstract

Cis-regulatory modules (CRMs) composed of multiple transcription factor binding sites (TFBSs) control gene expression in eukaryotic genomes. Comparative genomic studies have shown that these regulatory elements are more conserved across species due to evolutionary constraints. We propose a statistical method to combine module structure and cross-species orthology in de novo motif discovery. We use a hidden Markov model (HMM) to capture the module structure in each species and couple these HMMs through multiple-species alignment. Evolutionary models are incorporated to consider correlated structures among aligned sequence positions across different species. Based on our model, we develop a Markov chain Monte Carlo approach, MultiModule, to discover CRMs and their component motifs simultaneously in groups of orthologous sequences from multiple species. Our method is tested on both simulated and biological data sets in mammals and Drosophila, where significant improvement over other motif and module discovery methods is observed.

Keywords: Cis-regulatory module; motif discovery; comparative genomics; coupled hidden Markov model; Markov chain Monte Carlo; dynamic programming
Data Availability
Number of Files: 1
Terms of Use
Dataverse Network Terms of Use
IQSS Dataverse Network Terms and Conditions

By downloading these Materials, I agree to the following:

  1. I will not use the Materials to
    1. obtain information that could directly or indirectly identify subjects.
    2. produce links among the Distributor's datasets or among the Distributor's data and other datasets that could identify individuals or organizations.
    3. obtain information about, or further contact with, subjects known to me except where the use and/or release of such identifying information has no potential for constituting an unwarranted invasion of privacy and/or breach of confidentiality.
  2. I agree not to download any Materials where prohibited by applicable law.
  3. I agree not to use the Materials in any way prohibited by applicable law.
  4. I agree that any books, articles, conference papers, theses, dissertations, reports, or other publications that I create which employ the data will reference the bibliographic citation accompanying this data. These citations include the data authors, data identifier, and other information in accordance with the Recommended Standard (http://thedata.org/citation/standard) for social science data.
  5. THE DISTRIBUTOR MAKES NO WARRANTIES, EXPRESS OR IMPLIED, BY OPERATION OF LAW OR OTHERWISE, REGARDING OR RELATING TO THE DATASET
Other Information
Notes: DATAPASS:TERMS:STANDARD:1.0 (STANDARD DEPOSIT TERMS 1.0). This study was deposited under the Data-PASS standard deposit terms. A copy of the usage agreement is included in the file section of this study.

"Replication data for: Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species", hdl:1902.1/10647