Changes for page Distance-Fluctuations

Last modified by emacasali on 2022/09/15 13:34

From version 3.1
edited by emacasali
on 2022/09/14 17:25
Change comment: There is no comment for this version
To version 9.1
edited by emacasali
on 2022/09/15 09:27
Change comment: There is no comment for this version

Summary

Details

Page properties
Content
... ... @@ -2,9 +2,9 @@
2 2  (((
3 3  (% class="container" %)
4 4  (((
5 -= My Collab's Extended Title =
5 += Distance Fluctuation (DF) Analysis =
6 6  
7 -My collab's subtitle
7 +Giorgio Colombo Group (UNIPV)
8 8  )))
9 9  )))
10 10  
... ... @@ -12,15 +12,80 @@
12 12  (((
13 13  (% class="col-xs-12 col-sm-8" %)
14 14  (((
15 -= What can I find here? =
15 += What are DF matrices? =
16 16  
17 -* Notice how the table of contents on the right
18 -* is automatically updated
19 -* to hold this page's headers
17 +The results of an MD simulation can be analyzed using Distance Fluctuation (DF) matrices, which are based on the Coordination Propensity (CP) hypothesis:
20 20  
21 -= Who has access? =
19 +(% style="text-align:center" %)
20 +[[image:CodeCogsEqn-58.png||height="34" width="181"]]
22 22  
23 -Describe the audience of this collab.
22 +Low CP values, corresponding to low pair-distance fluctuations, highlight groups of residues that move in a mechanically coordinated way.
23 +
24 +
25 += How to use the script =
26 +
27 +• __Requirements__
28 +
29 + - Python 3.0 (or newer)
30 +
31 + - NumPy
32 +
33 + - SciPy
34 +
35 +
36 +• __Usage__
37 +
38 +The script can analyze an MD trajectory and identify the coordinated motions between residues. It can then filter the output matrix based on the distance to identify long-range coordinated motions.
39 +
40 +The script can work either with the C-alphas only (from a pdb or an xyz file) or with the sidechains (from a pdb file).
41 +
42 +For more information run:
43 +
44 +{{{ python3 distance_fluctuation.py -h}}}
45 +
46 +
47 +{{{optional arguments:
48 + -h, --help show this help message and exit
49 +
50 +Required arguments:
51 + -ext {pdb,xyz}, --file_ext {pdb,xyz}
52 + Input trajectory file iformat (options: .pdb or .xyz)
53 + -n N_AA, --n_aa N_AA Number of aminoacids in each frame of the .pdb
54 + trajectory
55 +
56 +Optional arguments:
57 + -i IN_NAME, --in_name IN_NAME
58 + Name of the trajectory file (default: trj)
59 + -s S_FRAME, --s_frame S_FRAME
60 + Index of the initial frame (default = 0)
61 + -e E_FRAME, --e_frame E_FRAME
62 + Index of the last frame (default = end)
63 + -c CUTOFF, --cutoff CUTOFF
64 + Cutoff for the distance fluctuation analysis (default = 5)
65 + -t TOLERANCE, --tolerance TOLERANCE
66 + Tolerance for the distance fluctuation analysis (default = 0)
67 + -p {all,c-a}, --pdb_type {all,c-a}
68 + Type of pdb file submitted (only C-alpha, all protein atoms)
69 + -l {s,seq,v,volume}, --local {s,seq,v,volume}
70 + Type of local cutoff used, sequence or volume (default = seq)
71 + -r RADIUS, --radius RADIUS
72 + Radius of the area cutoff (default = 7A)
73 + -rs RES_START, --res_start RES_START
74 + Starting residue if the PDB does not start with res.num. 1
75 + -ro RMS_OUT, --rms_out RMS_OUT
76 + RMS distance output filename (without extension)
77 + -ao AVG_OUT, --avg_out AVG_OUT
78 + Average distance output filename (without extension)
79 + -so SEQ_OUT, --seq_out SEQ_OUT
80 + Profile sequence output filename (without extension)
81 + -da DIST_AN, --dist_an DIST_AN
82 + Distance analysis output filename (without extension)
83 + --blocks BLOCKS File with domains borders
84 +
85 + -rb RMS_B_OUT, --rms_b_out RMS_B_OUT
86 + RMS distance blocks output filename}}}
87 +
88 +
24 24  )))
25 25  
26 26  
... ... @@ -28,8 +28,77 @@
28 28  (((
29 29  {{box title="**Contents**"}}
30 30  {{toc/}}
96 +
97 +
31 31  {{/box}}
32 32  
33 33  
34 34  )))
35 35  )))
103 +
104 +• __How to read the output__
105 +
106 +The script generates several output files.
107 +
108 +
109 +A) Average Distance (avgdist) file:
110 +
111 + The name of the file is avgdist_out_x_y_type.dat with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains)
112 +
113 + The file contains a matrix using the residue indexes as axes and the average value of the distance between the residues as the data (r1 r2  avgdist).
114 +
115 + The distance is calculated as the average of the Euclidean distance between the residues.
116 +
117 +
118 +B) Distance Fluctuation (rmsdist) file:
119 +
120 + The name of the file is rmsdist_out_x_y_type.dat with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains).
121 +
122 + The file contains a matrix using the residue indexes as axes and the distance fluctuation between the residues as the data (r1 r2 rmsdist).
123 +
124 + The distance fluctuation is calculated for the residues that are at least 3 residues away from each other (x-2 to x+2) as follows:
125 +
126 + 1) Calculate the average Euclidean distance between the residues (either CA or center of mass)
127 +
128 + 2) Calculate the average distance vector
129 +
130 + 3) Subtract the distance fluctuation from the average distance
131 +
132 + 4) Calculate the power of the difference between distance fluctuation and local fluctuation
133 +
134 + 5) Filter the values of the close residues (x-1, x, x+1)
135 +
136 + 6) Divide the obtained value by the number of frames
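The steps above can be sketched in NumPy as follows. This is only an illustration, not the code of distance_fluctuation.py: it assumes the fluctuation of a pair is the mean squared deviation of its distance from the average distance over the frames, it takes the coordinates as an in-memory array instead of reading a pdb or xyz file, and it omits the filtering of sequence neighbours. The function name and the toy trajectory are hypothetical.

```python
import numpy as np

def distance_fluctuation(coords):
    """Return (avgdist, rmsdist) for coords of shape (n_frames, n_res, 3).

    Assumption: rmsdist[i, j] is the mean over frames of
    (d_ij - <d_ij>)**2, with d_ij the Euclidean distance in one frame.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances in every frame: shape (n_frames, n_res, n_res)
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Average distance of each pair over the frames
    avgdist = dist.mean(axis=0)
    # Squared deviation from the average, summed and divided by n_frames
    rmsdist = ((dist - avgdist) ** 2).mean(axis=0)
    return avgdist, rmsdist

# Toy trajectory: 2 frames, 3 residues; only the third residue moves.
traj = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]],
])
avg, df = distance_fluctuation(traj)
```

In this toy case the first two residues keep a fixed distance, so their fluctuation is zero, while every pair involving the third residue has a non-zero value.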
137 +
138 +
139 +C) Profile Sequence (profile_sequence) file:
140 +
141 + The name of the file is profile_sequence_x_y_type.dat with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains).
142 +
143 + The file is a vector containing the residue number and the local fluctuation value.
144 +
145 + The local fluctuation is calculated as the average fluctuation of the residues close to each other (that is the residues ranging from x-2 to x+2 with x = residue number).
146 +
147 + Since each value contains the average distance fluctuation for a range of residues, the output starts from residue 4 and ends at residue n-3.
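A minimal sketch of a local fluctuation profile is given below. It assumes that the value for residue x is the mean of the distance fluctuations between x and its sequence neighbours from x-2 to x+2, excluding x itself, and it only returns residues with a complete window. The function name and the 0-based indexing are choices made for this example, and the exact window handling in the script may differ.

```python
import numpy as np

def local_fluctuation(df, half_window=2):
    """Map residue index (0-based) -> mean fluctuation with its sequence neighbours.

    Assumption: the neighbours of x are x-half_window .. x+half_window, x excluded.
    """
    df = np.asarray(df, dtype=float)
    n_res = df.shape[0]
    profile = {}
    # Only residues with a full window on both sides get a value
    for x in range(half_window, n_res - half_window):
        neighbours = [j for j in range(x - half_window, x + half_window + 1) if j != x]
        profile[x] = float(df[x, neighbours].mean())
    return profile

# Toy matrix whose entries grow with the sequence separation |i - j|
idx = np.arange(6)
toy_df = np.abs(idx[:, None] - idx[None, :]).astype(float)
profile = local_fluctuation(toy_df)
```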
148 +
149 +
150 +D) Distance Analysis (distance_analysis_out) file:
151 +
152 + The name of the file is distance_analysis_out_c_x_y_type.dat with c = cutoff value, x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains).
153 +
154 + This file contains the distance fluctuations filtered by the cutoff value (a value is kept if the distance fluctuation is smaller than the cutoff value).
155 +
156 + The cutoff value is calculated as the sum of the nearcutoff and the tolerance value specified by the user.
157 +
158 + The nearcutoff can be calculated using either sequence proximity or 3D proximity, and the choice can be specified on the command line with the option -l or ~-~-local (see the command line help for further details).
159 +
160 + In the case of sequence proximity, it is calculated as the average of the distance fluctuation values for residues in the range x-2 to x+2 (x = residue index).
161 +
162 + In the case of 3D proximity, it is calculated as the average distance fluctuation of the residues within a certain radius from the current one (default value = 6.5 A).
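The filtering rule can be illustrated with the sketch below. It assumes one nearcutoff value per residue (computed beforehand by either proximity criterion), applies the cutoff of residue i to row i of the matrix, and skips the diagonal. The function name, the argument layout and the (r1, r2, value) output are hypothetical and are not taken from the script.

```python
import numpy as np

def filter_by_cutoff(df, nearcutoff, tolerance=0.0):
    """Return (i, j, value) for pairs whose fluctuation is below the cutoff.

    Assumption: cutoff[i] = nearcutoff[i] + tolerance, applied to row i.
    """
    df = np.asarray(df, dtype=float)
    cutoff = np.asarray(nearcutoff, dtype=float) + tolerance
    # Keep values strictly smaller than the row cutoff, ignoring i == j
    keep = (df < cutoff[:, None]) & ~np.eye(df.shape[0], dtype=bool)
    rows, cols = np.nonzero(keep)
    return [(int(i), int(j), float(df[i, j])) for i, j in zip(rows, cols)]

toy_df = np.array([[0.0, 2.0],
                   [2.0, 0.0]])
strict = filter_by_cutoff(toy_df, nearcutoff=[1.0, 3.0])
relaxed = filter_by_cutoff(toy_df, nearcutoff=[1.0, 3.0], tolerance=1.5)
```

Raising the tolerance raises every cutoff by the same amount, so more pairs pass the filter.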
163 +
164 +
165 +E) Blocks Averaging (rmsdist_out_b):
166 +
167 +It is possible to obtain a distance fluctuation matrix averaged over protein domains (or blocks).
168 +
169 +The script requires an input file with the borders of the protein blocks defined by the user, in the form of two columns: the first with the lower limit and the second with the upper limit.
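As an illustration of block averaging, the sketch below parses a two-column borders file and averages a distance fluctuation matrix over every pair of blocks. It assumes whitespace-separated columns and 1-based, inclusive residue borders. Both function names are hypothetical, and the script may treat the borders differently.

```python
import numpy as np

def parse_blocks(text):
    """Parse 'lower upper' pairs, one block per line (assumed 1-based, inclusive)."""
    blocks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            blocks.append((int(fields[0]), int(fields[1])))
    return blocks

def block_average(df, blocks):
    """Average df over each pair of blocks; result is (n_blocks, n_blocks)."""
    df = np.asarray(df, dtype=float)
    out = np.zeros((len(blocks), len(blocks)))
    for a, (lo_a, hi_a) in enumerate(blocks):
        for b, (lo_b, hi_b) in enumerate(blocks):
            # Convert 1-based inclusive borders to 0-based slices
            out[a, b] = df[lo_a - 1:hi_a, lo_b - 1:hi_b].mean()
    return out

blocks = parse_blocks("1 2\n3 4\n")
toy_df = np.arange(16, dtype=float).reshape(4, 4)
averaged = block_average(toy_df, blocks)
```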
CodeCogsEqn-58.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.emacasali
Size
... ... @@ -1,0 +1,1 @@
1 +7.0 KB