| author | version | line-number | content |
|---|---|---|---|
| 1 | (% class="jumbotron" %) | ||
| 2 | ((( | ||
| 3 | (% class="container" %) | ||
| 4 | ((( | ||
| 5 | = Distance Fluctuation (DF) Analysis = | ||
| 6 | |||
| 7 | Giorgio Colombo Group (UNIPV) | ||
| 8 | ))) | ||
| 9 | ))) | ||
| 10 | |||
| 11 | (% class="row" %) | ||
| 12 | ((( | ||
| 13 | (% class="col-xs-12 col-sm-8" %) | ||
| 14 | ((( | ||
| 15 | = What are DF matrices? = | ||
| 16 | |||
| 17 | The results of an MD simulation can be analyzed using Distance Fluctuation (DF) matrices, which are based on the Coordination Propensity (CP) hypothesis: | ||
| 18 | |||
| 19 | (% style="text-align:center" %) | ||
| 20 | [[image:CodeCogsEqn-58.png||height="34" width="181"]] | ||
| 21 | |||
| 22 | Low CP values, corresponding to low pair-distance fluctuations, highlight groups of residues that move in a mechanically coordinated way. | ||
| 23 | |||
| 24 | |||
| 25 | = How to use the script = | ||
| 26 | |||
| 27 | • __Requirements__ | ||
| 28 | |||
| 29 | - Python 3.0 or newer | ||
| 30 | |||
| 31 | - NumPy | ||
| 32 | |||
| 33 | - SciPy | ||
| 34 | |||
| 35 | |||
| 36 | • __Usage__ | ||
| 37 | |||
| 38 | The script analyzes an MD trajectory and identifies coordinated motions between residues. It can then filter the output matrix by inter-residue distance to identify long-range coordinated motions. | ||
| 39 | |||
| 40 | The script can work either with C-alpha atoms only (read from a PDB or an XYZ file) or with sidechains (read from a PDB file). | ||
| 41 | |||
| 42 | For more information run: | ||
| 43 | |||
| 44 | {{{ python3 distance_fluctuation.py -h}}} | ||
| 45 | |||
| 46 | |||
| 47 | ))) | ||
| 48 | |||
| 49 | |||
| 50 | (% class="col-xs-12 col-sm-4" %) | ||
| 51 | ((( | ||
| 52 | {{box title="**Contents**"}} | ||
| 53 | {{toc/}} | ||
| 54 | |||
| 55 | |||
| 56 | {{/box}} | ||
| 57 | |||
| 58 | |||
| 59 | ))) | ||
| 60 | ))) | ||
| 61 | |||
| 62 | • __How to read the output__ | ||
| 63 | |||
| 64 | The script generates several output files. | ||
| 65 | |||
| 66 | |||
| 67 | A) Average Distance (avgdist) file: | ||
| 68 | |||
| 69 | The name of the file is avgdist_out_x_y_type.dat with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains). | ||
| 70 | |||
| 71 | The file contains a matrix using the residue indexes as axes and the average value of the distance between the residues as the data (r1 r2 avgdist). | ||
| 72 | |||
| 73 | The distance is calculated as the Euclidean distance between the residues, averaged over the analyzed frames. | ||
| 74 | |||
| 75 | |||
| 76 | B) Distance Fluctuation (rmsdist) file: | ||
| 77 | |||
| 78 | The name of the file is rmsdist_out_x_y_type.dat with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains). | ||
| 79 | |||
| 80 | The file contains a matrix using the residue indexes as axes and the distance fluctuation between the residues as the data (r1 r2 rmsdist). | ||
| 81 | |||
| 82 | The distance fluctuation is calculated for the residues that are at least 3 residues away from each other (x-2 to x+2) as follows: | ||
| 83 | |||
| 84 | 1) Calculate the average Euclidean distance between the residues (using either the CA or the center of mass) | ||
| 85 | |||
| 86 | 2) Calculate the average distance vector | ||
| 87 | |||
| 88 | 3) Subtract the distance fluctuation from the average distance | ||
| 89 | |||
| 90 | 4) Calculate the power of the difference between distance fluctuation and local fluctuation | ||
| 91 | |||
| 92 | 5) Filter out the values of the close residues (x-1, x, x+1) | ||
| 93 | |||
| 94 | 6) Divide the obtained value by the number of frames | ||
| 95 | |||
| 96 | |||
| 97 | C) Profile Sequence (profile_sequence) file: | ||
| 98 | |||
| 99 | The name of the file is profile_sequence_x_y_type.dat with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains). | ||
| 100 | |||
| 101 | The file is a vector containing the residue number and the local fluctuation value. | ||
| 102 | |||
| 103 | The local fluctuation is calculated as the average fluctuation of the residues close to each other (that is, the residues ranging from x-2 to x+2, with x = residue number). | ||
| 104 | |||
| 105 | Since each value is the average distance fluctuation over a range of residues, the output starts at residue 4 and ends at residue n-3. | ||
| 106 | |||
| 107 | |||
| 108 | D) Distance Analysis (distance_analysis_out) file: | ||
| 109 | |||
| 110 | The name of the file is distance_analysis_out_c_x_y_type.dat with c = cutoff value, x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains). | ||
| 111 | |||
| 112 | This file contains the distance fluctuations filtered by the cutoff value (a value is kept if the distance fluctuation is smaller than the cutoff value). | ||
| 113 | |||
| 114 | The cutoff value is calculated as the sum of the nearcutoff and the tolerance value specified by the user. | ||
| 115 | |||
| 116 | The nearcutoff can be calculated using either sequence proximity or 3D proximity; the method is selected on the command line with the option -l or ~-~-local (see the command line help for further details). | ||
| 117 | |||
| 118 | In the case of sequence proximity, it is calculated as the average of the distance fluctuation values for residues in the range x-2 to x+2 (x = residue index). | ||
| 119 | |||
| 120 | In the case of 3D proximity, it is calculated as the average distance fluctuation of the residues within a certain radius of the current one (default value = 6.5 Å). | ||
| 121 | |||
| 122 | |||
| 123 | E) Block Averaging (rmsdist_out_b): | ||
| 124 | |||
| 125 | It is possible to obtain a distance fluctuation matrix averaged over protein domains (or blocks). | ||
| 126 | |||
| 127 | The script requires an input file with the borders of the protein blocks defined by the user, in the form of two columns: the first with the lower limit and the second with the upper limit. |
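
The block averaging described above can be illustrated with a minimal sketch. It is not the code of distance_fluctuation.py: the helper names `parse_blocks` and `block_average` are hypothetical, and the sketch assumes that the block limits are 1-based residue numbers with both limits included, which may differ from what the script actually does.

```python
import numpy as np

def parse_blocks(text):
    """Parse two-column text (lower limit, upper limit) into integer pairs."""
    blocks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        blocks.append((int(fields[0]), int(fields[1])))
    return blocks

def block_average(df, blocks):
    """Average a residue-by-residue matrix over every pair of blocks.

    Assumption of this sketch: limits are 1-based and inclusive.
    """
    df = np.asarray(df, dtype=float)
    out = np.empty((len(blocks), len(blocks)))
    for a, (lo_a, hi_a) in enumerate(blocks):
        for b, (lo_b, hi_b) in enumerate(blocks):
            # Convert 1-based inclusive limits to 0-based half-open slices.
            out[a, b] = df[lo_a - 1:hi_a, lo_b - 1:hi_b].mean()
    return out

# Toy 4-residue matrix split into two blocks: residues 1-2 and 3-4.
blocks = parse_blocks("1 2\n3 4\n")
df = np.array([[0.0, 1.0, 4.0, 4.0],
               [1.0, 0.0, 4.0, 4.0],
               [4.0, 4.0, 0.0, 2.0],
               [4.0, 4.0, 2.0, 0.0]])
averaged = block_average(df, blocks)
```

In this toy case the result is a 2 x 2 matrix with one entry per pair of blocks. Whether the script includes the diagonal (zero) entries in each block average is not stated in this page, so the sketch simply averages every element of each sub-matrix.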