Wiki source code of Benchmarking
author | version | line-number | content |
---|---|---|---|
1.1 | 1 | == WIP == |
2 | |||
3 | == TVB-INVERSION 1.0.0 == | ||
4 | |||
5 | === Sampling priors - Remote execution === | ||
6 | |||
9 | ==== 1. Execution times ==== | ||
10 | |||
11 | This section reports execution times for the sampling priors step of the tvb-inversion workflow, which requires running a large number of simulations. | ||
12 | |||
13 | These were measured on a single node of the DAINT-CSCS HPC system, for different numbers of simulations and workers. | ||
14 | |||
2.1 | 15 | |
1.1 | 16 | |=(% scope="row" %)((( |
17 | Model | ||
18 | )))|=((( | ||
19 | Sim length (s) | ||
3.1 | 20 | )))|=Regions|=Nr simulations|=Nr workers|=Execution time (hh:mm) |
1.1 | 21 | |=MontbrioPazoRoxin|30|100|((( |
22 | 30 | ||
23 | )))|30|00:17 | ||
24 | |=MontbrioPazoRoxin|30|100|200|20|01:08 | ||
25 | |=MontbrioPazoRoxin|30|100|300|30|01:10 | ||
26 | |=MontbrioPazoRoxin|30|100|400|40|01:18 | ||
27 | |=MontbrioPazoRoxin|30|100|500|50|01:34 | ||
28 | |=MontbrioPazoRoxin|30|100|500|55|01:30 | ||
29 | |=MontbrioPazoRoxin|30|100|600|55|01:45 | ||
30 | |=MontbrioPazoRoxin|30|100|600|60|OOM | ||
5.1 | 31 | |=MontbrioPazoRoxin|60|100|500|40|((( |
32 | 03:07 | ||
1.1 | 33 | ))) |
6.1 | 34 | |=MontbrioPazoRoxin|60|100|500|55|OOM |
1.1 | 49 | |
1.1 | 85 | |
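The wall-clock times in the table can be converted into a throughput figure (simulations completed per hour), which makes runs with different numbers of simulations easier to compare. The following is a minimal sketch, assuming that simulations per hour is a useful metric here; the `Run` class and the helper names are illustrative only and are not part of tvb-inversion. The values are copied from the completed (non-OOM) rows of the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One completed benchmark row (illustrative container, not a tvb-inversion type)."""
    sim_length_s: int
    n_simulations: int
    n_workers: int
    wall_time: str  # "hh:mm", as reported in the table


def minutes(hhmm: str) -> int:
    """Convert an "hh:mm" string into a number of minutes."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def sims_per_hour(run: Run) -> float:
    """Throughput of a run: simulations completed per hour of wall-clock time."""
    return run.n_simulations * 60 / minutes(run.wall_time)


# Completed rows from the table above; OOM rows are left out.
RUNS = [
    Run(30, 30, 30, "00:17"),
    Run(30, 200, 20, "01:08"),
    Run(30, 300, 30, "01:10"),
    Run(30, 400, 40, "01:18"),
    Run(30, 500, 50, "01:34"),
    Run(30, 500, 55, "01:30"),
    Run(30, 600, 55, "01:45"),
    Run(60, 500, 40, "03:07"),
]

for run in RUNS:
    print(
        f"{run.sim_length_s:>3} s  sims={run.n_simulations:>3}  "
        f"workers={run.n_workers:>2}  {sims_per_hour(run):6.1f} sims/hour"
    )
```

Throughput is only one way to read the table; it does not account for queue time or for staging results in and out.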
86 | ==== 2. Limitations ==== | ||
87 | |||
88 | * Reaching the memory limit on the CSCS node. | ||
3.1 | 89 | ** For a simulation of 30 seconds, we can fit 55 parallel workers in the available memory |
90 | ** For a simulation of 60 seconds, we can fit 40 parallel workers in the available memory | ||
1.1 | 91 | * The maximum number of connections to CSCS could be reached during a run, because each run connects several times:
92 | ** connect once to launch the job | ||
93 | ** connect multiple times during the monitoring step to check the status of the job | ||
94 | ** connect once to stage out results |
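Both limitations can be checked before submitting a job. The sketch below is a hedged illustration, not tvb-inversion code: `workers_for` caps a requested worker count at the largest value that completed in the benchmarks above, and `connections_per_run` counts one connection for launch, one per status check, and one for stage-out. The polling interval is a hypothetical parameter; the actual monitoring schedule and the connection limit enforced by CSCS are not stated on this page.

```python
import math

# Largest worker counts that completed without OOM in the benchmarks above,
# keyed by simulation length in seconds.
MAX_WORKERS_BY_SIM_LENGTH = {30: 55, 60: 40}


def workers_for(sim_length_s: int, requested: int) -> int:
    """Cap the requested worker count at the largest benchmarked value."""
    if sim_length_s not in MAX_WORKERS_BY_SIM_LENGTH:
        raise ValueError(f"no benchmark for a {sim_length_s} s simulation")
    return min(requested, MAX_WORKERS_BY_SIM_LENGTH[sim_length_s])


def connections_per_run(runtime_min: float, poll_interval_min: float) -> int:
    """Estimate connections: 1 launch + 1 per status poll + 1 stage-out.

    Assumes (hypothetically) that the job status is polled at a fixed interval.
    """
    if poll_interval_min <= 0:
        raise ValueError("poll_interval_min must be positive")
    polls = math.ceil(runtime_min / poll_interval_min)
    return 1 + polls + 1


# Example: a 30 s simulation with 60 requested workers, and a run of
# 1 h 45 min (105 minutes) polled every 5 minutes.
print(workers_for(30, 60))
print(connections_per_run(105, 5))
```

Under these assumptions, a longer polling interval is the simplest way to lower the number of connections made during the monitoring step.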