Input File

Output Result

Example Run

Input File
The PDB file must contain representative frames from the dynamics trajectory of your protein of interest. At least 11 such frames should be present in the file, organized in the PDB format used for NMR structures, where each frame is written as a separate MODEL. Only the Cα atom coordinates should be included for each frame in the input file.
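The requirements above can be checked before uploading. The following is a minimal Python sketch, not part of the server: the function names count_ca_per_model and check_input are hypothetical, and the parser assumes standard fixed-column PDB records (MODEL, ATOM, ENDMDL).

```python
def count_ca_per_model(pdb_text):
    """Return (list of CA counts per MODEL, number of non-CA ATOM records)."""
    counts = []
    current = None  # CA count of the MODEL being read; None outside a MODEL
    non_ca = 0
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = 0
        elif record == "ENDMDL" and current is not None:
            counts.append(current)
            current = None
        elif record == "ATOM" and current is not None:
            # Columns 13-16 of an ATOM record hold the atom name.
            if line[12:16].strip() == "CA":
                current += 1
            else:
                non_ca += 1
    return counts, non_ca


def check_input(pdb_text, min_models=11):
    """Return a list of problems found; an empty list means the file looks fine."""
    counts, non_ca = count_ca_per_model(pdb_text)
    problems = []
    if len(counts) < min_models:
        problems.append(f"only {len(counts)} MODEL blocks, need at least {min_models}")
    if non_ca:
        problems.append(f"{non_ca} ATOM records are not CA atoms")
    if len(set(counts)) > 1:
        problems.append("models do not all contain the same number of CA atoms")
    return problems


# Synthetic example (made-up coordinates): 11 models with two CA atoms each.
atom = "ATOM  {:5d}  CA  GLY A{:4d}    {:8.3f}{:8.3f}{:8.3f}"
models = []
for m in range(1, 12):
    body = "\n".join(atom.format(i, i, float(i), 0.0, float(m)) for i in (1, 2))
    models.append(f"MODEL     {m:4d}\n{body}\nENDMDL")
example = "\n".join(models)
print(check_input(example))  # prints []
```

To check a real file, read it with open("1t9f.pdb").read() and pass the text to check_input.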

The RMSF file must contain the RMSF value of each Cα atom. It should be calculated from a 1 μs long CGMM trajectory. If the user does not provide it, the server will calculate the RMSF using 11 frames from the uploaded input file.
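For orientation, the per-atom RMSF is the root mean square deviation of each Cα position from its own time-averaged position. The Python sketch below illustrates that definition only. It assumes the frames are already superposed on a common reference, and it does not reproduce input_rmsf_trj.pl or the server's RMSFnorm normalisation, neither of which is specified here.

```python
import math


def rmsf(frames):
    """Per-atom RMSF.

    frames: list of frames; each frame is a list of (x, y, z) tuples,
    one per Cα atom, in the same atom order in every frame.
    Assumes the frames have already been fitted to a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for a in range(n_atoms):
        # Time-averaged position of atom a.
        mean = [sum(f[a][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msd = sum(
            sum((f[a][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Two-frame toy example: atom 0 is fixed, atom 1 moves between x = -1 and x = +1.
toy = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
print(rmsf(toy))  # prints [0.0, 1.0]
```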
Creating the input file for your protein of interest
Step 1
Simulate a 1 μs long trajectory using the CGMM forcefield. Please follow the link to learn how to run simulations locally and generate the molecular dynamics trajectory using the CGMM forcefield. Required files obtained from this step: .gro and .trr.
Step 2
Use the pdb_CApdb.pl program to convert your initial/original PDB file to a PDB file containing only Cα atom coordinates.
  • $ perl pdb2_CApdb.pl 1T9F.pdb (Example: 1T9F.pdb )
  • Output of program: 1T9F_ca.pdb file
Step 3
Use the make_index.pl program to create an index file of the protein structure file for the GROMACS utilities.
  • $ perl make_index.pl 1T9F.gro
  • Output of program: 1T9F.ndx
Step 4
Use the 'trjconv' GROMACS utility to create 11 models (the starting frame, the ending frame and the frames at 100 ns intervals) from this trajectory.
  • $ trjconv -s 1T9F_ca.pdb -n 1T9F.ndx -f 1t9f_rn.trr -o 1t9f.pdb -skip 1000
  • You may alter the command to create your desired number of frames; the minimum is 11.
  • Example.pdb on the Home Page was created the same way.
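As a quick check on the frame count, the selection made by -skip 1000 can be worked out by hand. The Python sketch below assumes that the 1 μs trajectory was written every 100 ps (10001 frames including the starting frame) and that frame 0 is kept, followed by every 1000th frame. The save interval is an assumption, not something stated in the steps above, so adjust it to your own run.

```python
def frames_written(n_frames, skip):
    """Indices of the frames kept when only every `skip`-th frame is written,
    starting from frame 0."""
    return list(range(0, n_frames, skip))


# Assumed trajectory layout: one frame every 100 ps over 1 μs.
save_interval_ps = 100
n_frames = 1_000_000 // save_interval_ps + 1  # 10001 frames, including t = 0

kept = frames_written(n_frames, 1000)
times_ns = [i * save_interval_ps // 1000 for i in kept]

print(len(kept))   # prints 11
print(times_ns)    # prints [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
```

If your trajectory was saved at a different interval, choose the -skip value so that at least 11 frames remain.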
Step 5
Use the input_rmsf_trj.pl program to calculate the input RMSF file from the 1 μs long CGMM trajectory. First use 'trjconv' to write out all frames of the trajectory.
  • $ trjconv -s 1T9F_ca.pdb -n 1T9F.ndx -f 1t9f_rn.trr -o 1t9f_md_trj.pdb
  • $ perl input_rmsf_trj.pl 1t9f_md_trj.pdb
  • Output of program: 1t9f_md_trj.pdb_rmsf.txt
Output Result
Upon submission of a correct input file and successful execution of the program, the user is directed to the Results Page. The Results Page shows the residue number range of the dynamics-matched (functionally relevant) region of the input protein and the corresponding proteins (and their matched segments) from the database. The coarse-grained molecular dynamics trajectories against which the matches are made were calculated from 1 μs simulations of 5264 non-redundant monomeric proteins using CGMM. The matched proteins, denoted by PDB identifiers in the Results Page, are linked to the functional annotation page of the Protein Data Bank.
Work Flow
Step 1
The webserver calculates the RMSFnorm curve of the query protein using all frames if the user does not provide an RMSF file.
Step 2
The program uses the RMSFnorm information to identify matching flexible regions in our coarse-grained dynamics trajectory database.
  • Pairwise match the RMSFnorm segments of the query protein against all proteins in the database.
  • Prior to matching, all RMSFnorm numeric values are converted into symbols using the mapping criteria below.

  • Exact-match segments of the RMSFnorm curves between all pairs of proteins are found by the Smith-Waterman local alignment algorithm.

  • Filter the matched pairs to keep only mobile segments.
  • A mobile segment is defined by: fraction of 'L' symbols > 35%, or total fraction of 'G', 'H' and 'I' symbols > 15% (functional segments satisfy this condition) [blue line in above figure].
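The matching and filtering in Step 2 can be sketched in Python as follows. This is not a general Smith-Waterman implementation. Because only exact matches are sought, the sketch uses the longest-common-substring dynamic programme, which is what Smith-Waterman reduces to when mismatches and gaps are forbidden. The function names are hypothetical, and the symbols A, B, C and D in the example are placeholders, since the actual symbol mapping is given only in the figure.

```python
def longest_exact_match(a, b):
    """Return (start_a, start_b, length) of the longest contiguous segment
    shared by symbol strings a and b (gapless, exact-match local alignment)."""
    best_len, best_end_a, best_end_b = 0, 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at (i-1, j-1).
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end_a, best_end_b = curr[j], i, j
        prev = curr
    return best_end_a - best_len, best_end_b - best_len, best_len


def is_mobile(segment):
    """Mobile-segment filter: more than 35% 'L' symbols,
    or more than 15% 'G', 'H' and 'I' symbols combined."""
    n = len(segment)
    if n == 0:
        return False
    frac_l = segment.count("L") / n
    frac_ghi = sum(segment.count(s) for s in "GHI") / n
    return frac_l > 0.35 or frac_ghi > 0.15


# Placeholder symbol strings for a query and a database protein.
query = "AABLLGHLLC"
target = "CCBLLGHLLD"
start_q, start_t, length = longest_exact_match(query, target)
segment = query[start_q:start_q + length]
print(segment, is_mobile(segment))  # prints BLLGHLL True
```

A full implementation would report every matching segment above some minimum length rather than only the longest one.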
Step 3
The autocorrelation vector (ACV) profiles of the matched flexible regions in the simulations are then compared to check whether the Pearson correlation coefficients (CC) between the two ACV profiles satisfy our dynamics-function match criteria.
  • Autocorrelation vector (ACV):

  • The 3D ACV of a protein is defined using a distance step dx. The ACV has dimension n = dmax/dx, where dmax is the distance between the two farthest atoms of the protein in a given conformation (frame). Each component i of the 3D ACV is the sum, over all atom pairs (j, k) separated by a distance between (i)dx and (i+1)dx, of the product of Pj and Pk, where Pj and Pk are the properties or weights associated with atoms j and k. When P is uniformly taken as 1, the ACV is called unweighted. An ACV can also be weighted by a property defined on the atoms (e.g., atomic charge, van der Waals radius).
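Written out from the description above, one consistent form of the ACV component is shown below. It is a reconstruction, not a copy of the original equation: counting each atom pair once (j < k) and the half-open distance interval are assumptions.

```latex
\mathrm{ACV}_i \;=\; \sum_{j<k} P_j \, P_k \, \delta_i(d_{jk}),
\qquad
\delta_i(d) =
\begin{cases}
1, & i\,dx \le d < (i+1)\,dx \\
0, & \text{otherwise}
\end{cases}
\qquad i = 0, 1, \dots, n-1,
\quad n = \frac{d_{\max}}{dx}
```

Here d_{jk} is the distance between atoms j and k in the given frame; setting P_j = P_k = 1 gives the unweighted ACV, in which each component simply counts the atom pairs falling in that distance bin.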
  • ACV profile = 11 ACVs (start, end and 100 ns interval frames) for each protein.
  • Find the Pearson correlation coefficient (CC) between each pair of ACVs taken from the ACV profiles of the two protein segments.
  • 121 CC values are obtained from the 11 x 11 pairs of ACVs from the two proteins.
  • Criterion for two proteins having the same dynamics: at least 25% of the CC values are > 0.95 and at least 50% are > 0.90.
  • Example showing the CC comparison between two functionally similar proteins, PDB: 1HB8 and 1ST7, and two functionally dissimilar proteins, PDB: 1HB8 and 1K5K:
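The ACV comparison of Step 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the server's code. Function names are hypothetical, each atom pair is counted once, and all ACVs are computed with a fixed number of bins (n_bins) so that any two vectors have the same length, which the Pearson coefficient requires. How the server reconciles ACVs of different dimension is not described here.

```python
import math


def acv(coords, dx, n_bins, weights=None):
    """3D autocorrelation vector of one frame.

    coords: list of (x, y, z) tuples; dx: distance step;
    n_bins: fixed vector length (an assumption of this sketch);
    weights: per-atom property P, defaults to 1 (unweighted ACV).
    """
    if weights is None:
        weights = [1.0] * len(coords)
    vec = [0.0] * n_bins
    for j in range(len(coords)):
        for k in range(j + 1, len(coords)):
            i = int(math.dist(coords[j], coords[k]) // dx)
            if i < n_bins:
                vec[i] += weights[j] * weights[k]
    return vec


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0 or sv == 0:
        return 0.0  # CC is undefined for a constant vector; treated as no match
    return cov / (su * sv)


def same_dynamics(profile_a, profile_b):
    """Apply the match criterion to two ACV profiles (lists of ACVs):
    at least 25% of all pairwise CC values > 0.95 and at least 50% > 0.90."""
    ccs = [pearson(u, v) for u in profile_a for v in profile_b]
    frac_95 = sum(c > 0.95 for c in ccs) / len(ccs)
    frac_90 = sum(c > 0.90 for c in ccs) / len(ccs)
    return frac_95 >= 0.25 and frac_90 >= 0.50


# Toy frame: four atoms on a line, 3.8 apart (made-up coordinates).
chain = [(i * 3.8, 0.0, 0.0) for i in range(4)]
vec = acv(chain, dx=4.0, n_bins=3)
print(vec)  # prints [3.0, 2.0, 1.0]

# Two 11-frame profiles give 11 x 11 = 121 CC values.
print(same_dynamics([vec] * 11, [vec] * 11))  # prints True
```

With real data, each profile would hold the ACVs of the 11 frames of one matched segment, computed with the same dx and n_bins for both proteins.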

Step 4
The match results are stored in a temporary MySQL table, and the top 10 matches based on the correlation coefficients (CC) are output in the Results Page.
Reference: Bhadra P. and Pal D. De novo inference of protein function from coarse-grained dynamics. Proteins: Struct. Funct. Bioinf. 82:2443-2454 (2014)
Example Run
Prepare the input file:
Protein 1d10 (PDB ID: 1T9F) is a protein of unknown function from C. elegans. The Cα coordinates from the PDB file were used for a 1 μs simulation with the CGMM forcefield as outlined above. Subsequently, a PDB coordinate file with 11 models (the starting, ending and 100 ns interval frames) was created from the trajectory (1t9f.pdb, NMR format). The RMSF file (1t9f_rmsf.txt) was created from the 1 μs simulation trajectory.

To find the function of the above protein, we choose 1t9f.pdb as the query protein using the "Browse" button.

After clicking the "Submit" button, an alert window will open.
  • The webserver will generate a JOB ID and will present the URL of the Results Page.
  • In the Figure below the JOB ID is 1029.
  • The Results page URL is: http://pallab.serc.iisc.ernet.in/dynfunc/result/1029/Result.php.
  • The browser window can be closed after saving the URL information for later use.
  • The Results are currently saved for seven days.
  • After clicking "OK" in the alert window, the pop-up closes.
  • The URL information can be saved if not saved before. Other information can also be noted.
  • The page is periodically refreshed until the job is completed and Results displayed.
The Results Page
The Results Page shows the residue number range of the dynamics-matched (functionally relevant) region of the input protein and the corresponding proteins from the database. The PDB ID of each matched protein in the Results Page is linked to the functional annotation page of the Protein Data Bank.
Protein 1d10 (PDB ID: 1T9F) is a protein of unknown function. The DynFunc server shows the top match (Rank=1), with 41% and 79% of CC values above 0.95 and 0.9, respectively, between the ACV profiles calculated from residues 103-111 of the query protein (1T9F) and residues 39-47 of the matched protein (chain A, PDB: 1EJ2). This indicates that the dynamics of residues 103 to 111 of 1T9F closely match the dynamics of residues 39 to 47 of nicotinamide mononucleotide adenylyltransferase (PDB ID: 1EJ2). Nicotinamide mononucleotide adenylyltransferase catalyzes the synthesis of nicotinamide adenine dinucleotide (NAD+) or nicotinic acid adenine dinucleotide (NaAD+) from nicotinamide mononucleotide (NMN+) or nicotinic acid mononucleotide (NaMN+), respectively, by transferring the adenylyl part of ATP and concomitantly releasing pyrophosphate (PPi). The region comprising residues 39 to 47 of 1EJ2 is part of the active site of the protein for nicotinamide-nucleotide adenylyltransferase activity.

In protein 1T9F, residues 108 to 111 are surface exposed and could therefore form an active-site region. The relative juxtaposition of the catalytic residues (which are also conserved) in 1EJ2 closely matches that in 1T9F (see Figure below). On the basis of the dynamics match, protein 1d10 (PDB ID: 1T9F) may be inferred to potentially have transferase activity. Whether it acts on NMN+/NaMN+ as a substrate has to be investigated further, along with other potential substrates, using a docking search. The CombFunc protein functional annotation server (based on SVM) appears to concur with our prediction.