Changes for page Distance-Fluctuations
Last modified by emacasali on 2022/09/15 13:34
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -42,6 +42,50 @@
 For more information run:
 
 {{{ python3 distance_fluctuation.py -h}}}
+
+
+{{{optional arguments:
+  -h, --help            show this help message and exit
+
+Required arguments:
+  -ext {pdb,xyz}, --file_ext {pdb,xyz}
+                        Input trajectory file iformat (options: .pdb or .xyz)
+  -n N_AA, --n_aa N_AA  Number of aminoacids in each frame of the .pdb
+                        trajectory
+
+Optional arguments:
+  -i IN_NAME, --in_name IN_NAME
+                        Name of the trajectory file (default: trj)
+  -s S_FRAME, --s_frame S_FRAME
+                        Index of the initial frame (default = 0)
+  -e E_FRAME, --e_frame E_FRAME
+                        Index of the last frame (default = end)
+  -c CUTOFF, --cutoff CUTOFF
+                        Cutoff for the distance fluctuation analysis (default = 5)
+  -t TOLERANCE, --tolerance TOLERANCE
+                        Tolerance for the distance fluctuation analysis (default = 0)
+  -p {all,c-a}, --pdb_type {all,c-a}
+                        Type of pdb file submitted (only C-alpha, all protein atoms)
+  -l {s,seq,v,volume}, --local {s,seq,v,volume}
+                        Type of local cutoff used, sequence or volume (default = seq)
+  -r RADIUS, --radius RADIUS
+                        Radius of the area cutoff (default = 7A)
+  -rs RES_START, --res_start RES_START
+                        Starting residue if the PDB does not start with res.num. 1
+  -ro RMS_OUT, --rms_out RMS_OUT
+                        RMS distance output filename (without extension)
+  -ao AVG_OUT, --avg_out AVG_OUT
+                        Average distance output filename (without extension)
+  -so SEQ_OUT, --seq_out SEQ_OUT
+                        Profile sequence output filename (without extension)
+  -da DIST_AN, --dist_an DIST_AN
+                        Distance analysis output filename (without extension)
+  --blocks BLOCKS       File with domains borders
+
+  -rb RMS_B_OUT, --rms_b_out RMS_B_OUT
+                        RMS distance blocks output filename}}}
+
+
 )))
 
 
@@ -57,4 +57,69 @@
 )))
 )))
 
-
+• __How to read the output__
+
+The script generates several output files.
+
+
+A) Average Distance (avgdist) file:
+
+ The name of the file is avgdist_out_x_y_type.dat, with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains).
+
+ The file contains a matrix that uses the residue indexes as axes and the average distance between the residues as the data (r1 r2 avgdist).
+
+ The distance is calculated as the average of the Euclidean distance between the residues.
+
+
+B) Distance Fluctuation (rmsdist) file:
+
+ The name of the file is rmsdist_out_x_y_type.dat, with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains).
+
+ The file contains a matrix that uses the residue indexes as axes and the distance fluctuation between the residues as the data (r1 r2 rmsdist).
+
+ The distance fluctuation is calculated for the residues that are at least 3 residues away from each other (x-2 to x+2) as follows:
+
+ 1) Calculate the average Euclidean distance between the residues (either CA or center of mass)
+
+ 2) Calculate the average distance vector
+
+ 3) Subtract the distance fluctuation from the average distance
+
+ 4) Calculate the power of the difference between distance fluctuation and local fluctuation
+
+ 5) Filter out the values of the close residues (x-1, x, x+1)
+
+ 6) Divide the obtained value by the number of frames
+
+
+C) Profile Sequence (profile_sequence) file:
+
+ The name of the file is profile_sequence_x_y_type.dat, with x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains).
+
+ The file is a vector containing the residue number and the local fluctuation value.
+
+ The local fluctuation is calculated as the average fluctuation of the residues close to each other (that is, the residues ranging from x-2 to x+2, with x = residue number).
+
+ Since the value contains the average distance fluctuation for a range of residues, the output starts from residue 4 and ends at residue n-3.
+
+
+D) Distance Analysis (distance_analysis_out) file:
+
+ The name of the file is distance_analysis_out_c_x_y_type.dat, with c = cutoff value, x = start frame of the analysis, y = end frame of the analysis, type = type of analysis (CA or sidechains).
+
+ This file contains the distance fluctuation filtered by the cutoff value (values are kept if the distance fluctuation is smaller than the cutoff value).
+
+ The cutoff value is calculated as the sum of the nearcutoff and the tolerance value specified by the user.
+
+ The nearcutoff can be calculated using either sequence proximity or 3D proximity, and can be selected on the command line with the option -l or ~-~-local (see the command line help for further details).
+
+ In the case of sequence proximity, it is calculated as the average of the distance fluctuation values for residues in the range x-2 to x+2 (x = residue index).
+
+ In the case of 3D proximity, it is calculated as the average distance fluctuation of the residues within a certain radius from the current one (default value = 6.5 A).
+
+
+E) Blocks Averaging: (rmsdist_out_b)
+
+It is possible to obtain a distance fluctuation matrix averaged over protein domains (or blocks).
+
+The script requires an input file with the borders of the protein blocks defined by the user, in the form of two columns: the first with the lower limit and the second with the upper limit.
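
To make the two-column blocks file concrete, here is a minimal Python sketch of how such a file could be parsed and used to average a residue-by-residue matrix over blocks. It is not taken from distance_fluctuation.py: the function names parse_blocks and block_average are hypothetical, and the sketch assumes whitespace-separated columns, inclusive borders, residue numbering that starts at res_start (1 by default), and a plain arithmetic mean over every residue pair in a pair of blocks. The script itself may treat filtered or neighbouring pairs differently.

```python
def parse_blocks(lines):
    """Parse block borders from lines of the form 'lower upper'.

    Assumption: whitespace-separated integers, one block per line.
    """
    blocks = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        blocks.append((int(fields[0]), int(fields[1])))
    return blocks


def block_average(matrix, blocks, res_start=1):
    """Average a residue-by-residue matrix over every pair of blocks.

    Assumption: borders are inclusive residue numbers, and residue
    res_start corresponds to row/column 0 of the matrix.
    """
    averaged = []
    for lo_a, hi_a in blocks:
        row = []
        for lo_b, hi_b in blocks:
            # Collect all matrix entries for residue pairs (i in block a, j in block b).
            values = [
                matrix[i][j]
                for i in range(lo_a - res_start, hi_a - res_start + 1)
                for j in range(lo_b - res_start, hi_b - res_start + 1)
            ]
            row.append(sum(values) / len(values))
        averaged.append(row)
    return averaged


# Hypothetical 4-residue example with two blocks: residues 1-2 and 3-4.
example_blocks = parse_blocks(["1 2", "3 4"])
example_matrix = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
example_result = block_average(example_matrix, example_blocks)
```

With a blocks file on disk, the same parser can be fed the open file object directly, since it only iterates over lines.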