Wiki source code of Provenance of simulation and data analysis workflows
| author | version | line-number | content |
|---|---|---|---|
| 1 | == Introduction == | ||
| 2 | |||
| 3 | Computational provenance is a record of all the steps in a computational scientific workflow, including the code that was run, the input data, the computational environment (hardware, operating system, compiler and library versions, etc.), the person who performed each step, and the output data. | ||
| 4 | |||
| 5 | Capturing computational provenance facilitates: | ||
| 6 | |||
| 7 | * reproducibility of results | ||
| 8 | * management and tracking of workflows/projects by the scientists/engineers involved | ||
| 9 | * evaluation/review by other scientists and engineers | ||
| 10 | |||
| 11 | |||
| 12 | |||
| 13 | == Standards == | ||
| 14 | |||
| 15 | The [[W3C PROV standard>>https://www.w3.org/TR/2013/NOTE-prov-overview-20130430/||rel=" noopener noreferrer" target="_blank"]] provides a data model and related tools for provenance interchange on the web. The following diagram shows the three base classes of the PROV data model: Entity, Activity, and Agent. These three classes form the basis for the representation of provenance in the EBRAINS Knowledge Graph (KG): every node in the KG has a type that is a subclass of one of these base classes. | ||
| 16 | |||
| 17 | [[image:starting-points.svg||alt="The three Starting Point classes of the W3C PROV ontology and the properties that relate them."]] | ||
| 18 | |||
| 19 | == Storage of provenance in the Knowledge Graph == | ||
| 20 | |||
| 21 | |||
| 22 | == Tools for automated capture of provenance == | ||
| 23 | |||
| 24 | * on different systems: | ||
| 25 | ** HPC systems | ||
| 26 | ** neuromorphic systems | ||
| 27 | ** Jupyter notebooks | ||
| 28 | ** users' own computers | ||
| 29 | * prospective/pre-emptive vs run-time provenance capture | ||
| 30 | * capture of metadata vs capture of artefacts | ||
| 31 | |||
| 32 | == Communication between computer systems and the KG == | ||
| 33 | |||
| 34 | * local cache and synchronization? | ||
| 35 | |||
| 36 | |||
| 37 | == User interfaces for browsing, visualizing, and searching provenance information == | ||
| 38 | |||
| 39 |
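
As an illustration of the three PROV base classes described in the Standards section, the following minimal sketch records one workflow step using only the Python standard library. It is not the EBRAINS Knowledge Graph schema or API: the class fields, the helper functions `describe_environment`, `make_entity`, and `run_step`, and the example names (`alice`, `raw.txt`) are all hypothetical. Only the relation names in the comments (`used`, `wasGeneratedBy`, `wasAssociatedWith`) come from the PROV ontology.

```python
import hashlib
import platform
import sys
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Agent:
    """PROV Agent: the person (or software) responsible for an activity."""
    name: str


@dataclass
class Entity:
    """PROV Entity: a piece of data or code, identified here by a content hash."""
    label: str
    sha256: str
    was_generated_by: "Activity | None" = None  # PROV: wasGeneratedBy


@dataclass
class Activity:
    """PROV Activity: one step of a workflow."""
    label: str
    was_associated_with: Agent                        # PROV: wasAssociatedWith
    used: list = field(default_factory=list)          # PROV: used (input entities)
    environment: dict = field(default_factory=dict)   # hypothetical extra field
    started_at: str = ""
    ended_at: str = ""


def describe_environment() -> dict:
    """Collect a small, illustrative subset of the computational environment."""
    return {
        "os": platform.platform(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
    }


def make_entity(label: str, content: bytes, generated_by=None) -> Entity:
    """Wrap raw bytes as an Entity, using the SHA-256 digest as its identity."""
    return Entity(label, hashlib.sha256(content).hexdigest(), generated_by)


def run_step(label: str, func, inputs: dict, agent: Agent):
    """Run func on the input bytes and return (output entity, activity record)."""
    used = [make_entity(name, data) for name, data in inputs.items()]
    activity = Activity(label, agent, used, describe_environment())
    activity.started_at = datetime.now(timezone.utc).isoformat()
    result = func(*inputs.values())
    activity.ended_at = datetime.now(timezone.utc).isoformat()
    return make_entity(f"{label}-output", result, activity), activity


# Example: a trivial "analysis" that upper-cases its input.
output, step = run_step(
    "uppercase", bytes.upper, {"raw.txt": b"spike times"}, Agent("alice")
)
print(step.label, "used", [e.label for e in step.used])
print(output.label, "wasGeneratedBy", output.was_generated_by.label)
```

This sketch captures metadata at run time only (hashes, timestamps, environment) rather than the artefacts themselves. A real capture tool would also need to record the code version and store the records persistently.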