Dynameomics

Dynameomics is a continuing project in the Daggett group to characterize, by molecular dynamics simulation, the native-state dynamics and the folding/unfolding pathways of representatives from all known protein folds.

This effort began with the creation of a consensus fold list. This was done by cross-referencing the fold definitions used in SCOP, CATH, and the Dali Domain Dictionary, as described on the Origin of the Fold List page. Next, targets were selected from the consensus fold list. A target is a specific protein structure from the PDB that has been chosen to represent a given fold (see the example on the left). The specifics of this choice are given on the Target Selection page. The complete list of consensus folds, their populations, and their targets is provided on the fold and target pages.

At this time, we are continuing to simulate targets from the fold list, generally in order of decreasing fold population. The simulation protocols, software, and analyses are described on the methods page.

As of March 2010, we have performed nearly 11,000 simulations of over 2000 proteins for a combined simulation time of more than 340 microseconds. This site currently contains the native simulations for our top 100 targets.

    • Target Statistics

      Target Statistics

      The Fold List page contains a description of how we generated our consensus domain dictionary (CDD) from the SCOP, CATH, and Dali domain dictionaries. We describe both the methods for generating the dictionary and the rationale behind its generation. We also include summary statistics of the CDD and its inputs.

    • SNPs

      Single Nucleotide Polymorphisms (SNPs)

      Single Nucleotide Polymorphisms (SNPs) are variations at a single nucleotide position in an organism's genetic sequence. In humans, they occur approximately once every 300 nucleotide bases, making them our most common source of genetic variation. When a SNP results in an amino acid change in the encoded protein, it can alter the protein's function. SNPs, and the associated changes in protein function, are known to cause a variety of human diseases ranging from Parkinson’s disease to epilepsy. For example, mutations in tumor protein 53 (p53) are correlated with approximately 50% of all cancers.

      The Dynameomics database contains simulations of 31 different proteins, including wild-type and mutant simulations. In all, we have over 200 single-point mutation simulations available for analysis.

    • Methods

      Methods

      The Methods pages include a description of the criteria we use to select protein targets for simulation from our Consensus Domain Dictionary. We provide an overview of our simulation software and protocols. We describe our quality control efforts to assess the stability of proteins in their native-state simulations and give the details of our standard analysis suite.

    • Database

      Database

      The Dynameomics data warehouse is based on SQL Server 2005 and 2008 and contains over 55 TB of MD simulation and analysis data. These data are distributed across 7 primary servers that range in size from 7.5 TB to 25 TB of storage. Native-state simulation data for the top 100 Dynameomics targets, the SNP targets, and the GGXGG peptides can be browsed using this site. You can also request a SQL login to run your own queries, and a web services interface is being developed to allow direct access using SOAP.

    • Data Mining

      Data Mining

      The Data Mining section describes in detail how a number of large-scale data analyses were conducted on the database. These analyses include flexibility calculations, wavelet calculations, transition state mining, and reaction coordinate calculations. Example code (SQL and Mathematica) showing how to conduct these analyses is also given in this section.
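
      To give a sense of what a flexibility calculation can involve, here is a minimal Python sketch of one widely used measure, the per-atom root-mean-square fluctuation (RMSF) about the mean position. This is an illustration only: the site's own examples are in SQL and Mathematica, and this page does not say which flexibility measure Dynameomics uses. The function name `rmsf`, the input layout, and the toy coordinates are all assumptions, and the frames are assumed to be already aligned to a common reference.

```python
import math


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the mean position.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom. Hypothetical input layout, assumed pre-aligned.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # Mean position of atom a over all frames.
        mean = [
            sum(frame[a][k] for frame in trajectory) / n_frames
            for k in range(3)
        ]
        # Mean squared deviation of atom a from its mean position.
        msd = sum(
            sum((frame[a][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy data (not real simulation output): atom 0 is fixed,
# atom 1 moves between x = +1 and x = -1.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(rmsf(toy))  # [0.0, 1.0]
```

      Larger values mark atoms that move more over the trajectory. On a real trajectory the frames would first need to be superimposed on a reference structure so that overall rotation and translation do not inflate the result.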