Introduction
Computational provenance is a record of all the steps in a computational scientific workflow, including the code that was run, the input data, the computational environment (hardware, operating system, compiler versions, library versions, etc.), and the output data.
Capturing computational provenance facilitates:
- reproducibility of results
- management and tracking of workflows/projects by the scientists/engineers involved
- evaluation/review by other scientists and engineers
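As an illustration of the kind of record described above, the following minimal sketch gathers the code, input data, environment, and output data for one run into a JSON-serializable dictionary. The function names (file_sha256, library_versions, build_record) and the field names are assumptions made for this example, not an established schema, and only Python-level environment details are collected.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone
from importlib import metadata


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def library_versions(names):
    """Map each distribution name to its installed version (None if absent)."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_record(script, inputs, outputs, libraries=()):
    """Assemble a provenance record for one run (hypothetical layout)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # The code that was run, identified by path and content hash.
        "code": {"path": str(script), "sha256": file_sha256(script)},
        "inputs": [{"path": str(p), "sha256": file_sha256(p)} for p in inputs],
        "outputs": [{"path": str(p), "sha256": file_sha256(p)} for p in outputs],
        # The computational environment, as far as Python can see it.
        "environment": {
            "machine": platform.machine(),
            "os": platform.platform(),
            "python": platform.python_version(),
            "libraries": library_versions(libraries),
        },
    }
```

A call such as json.dumps(build_record("simulate.py", ["params.json"], ["results.csv"], libraries=["numpy"]), indent=2) would produce a text record that can be stored next to the results; the file names here are placeholders.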
Standards
Information about the W3C PROV ontology and related tools
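To give a feel for the core PROV concepts, the sketch below describes one workflow run in terms of entities (files), an activity (the run), and an agent (the person responsible), linked by the PROV relations used, wasGeneratedBy, and wasAssociatedWith. The dictionary layout is loosely modelled on the PROV-JSON serialization and has not been validated against it; the "ex" namespace, all identifiers, and the helper sources_of are hypothetical.

```python
import json


def example_document():
    """Build a small PROV-style description of a single workflow run."""
    return {
        "prefix": {"ex": "https://example.org/"},
        # Entities: the things that were consumed or produced.
        "entity": {
            "ex:script": {"prov:label": "simulate.py"},
            "ex:input": {"prov:label": "params.json"},
            "ex:output": {"prov:label": "results.csv"},
        },
        # Activity: the run that consumed and produced entities.
        "activity": {"ex:run": {"prov:label": "simulation run"}},
        # Agent: who was responsible for the activity.
        "agent": {"ex:alice": {"prov:label": "Alice"}},
        "used": {
            "_:u1": {"prov:activity": "ex:run", "prov:entity": "ex:script"},
            "_:u2": {"prov:activity": "ex:run", "prov:entity": "ex:input"},
        },
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": "ex:output", "prov:activity": "ex:run"},
        },
        "wasAssociatedWith": {
            "_:a1": {"prov:activity": "ex:run", "prov:agent": "ex:alice"},
        },
    }


def sources_of(doc, entity_id):
    """List the entities used by the activities that generated entity_id."""
    activities = {
        g["prov:activity"]
        for g in doc["wasGeneratedBy"].values()
        if g["prov:entity"] == entity_id
    }
    return sorted(
        u["prov:entity"]
        for u in doc["used"].values()
        if u["prov:activity"] in activities
    )
```

Calling sources_of(example_document(), "ex:output") walks from the output back to the script and input that the generating run used, which is the kind of query a provenance store is expected to answer.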
Storage of provenance in the Knowledge Graph (KG)
Tools for automated capture of provenance
- on different systems:
  - HPC systems
  - neuromorphic systems
  - Jupyter notebooks
  - users' own computers
- prospective/pre-emptive vs run-time provenance capture
- capture of metadata vs capture of artefacts
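The distinction between run-time capture of metadata and capture of artefacts can be sketched with a decorator that wraps a workflow step. By default it records only metadata about the input and output files (path, size, hash); if an artefact directory is given, it also stores a copy of the output file. The names capture, describe, RECORDS, and to_upper are hypothetical, the in-memory list stands in for a real provenance store, and the sketch assumes steps that take exactly one input path and one output path. Prospective capture is not shown.

```python
import functools
import hashlib
import shutil
import time
from pathlib import Path

RECORDS = []  # in-memory stand-in for a provenance store


def describe(path):
    """Return metadata about a file: path, size in bytes, SHA-256 hash."""
    path = Path(path)
    return {
        "path": str(path),
        "size": path.stat().st_size,
        "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
    }


def capture(artefact_dir=None):
    """Decorator factory: record provenance each time the step runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(input_path, output_path):
            started = time.time()
            result = func(input_path, output_path)
            record = {
                "function": func.__name__,
                "started": started,
                "duration_s": time.time() - started,
                "input": describe(input_path),
                "output": describe(output_path),
            }
            if artefact_dir is not None:
                # Artefact capture: keep a copy of the output, named by hash.
                dest = Path(artefact_dir)
                dest.mkdir(parents=True, exist_ok=True)
                name = record["output"]["sha256"] + Path(output_path).suffix
                shutil.copy2(output_path, dest / name)
                record["output"]["stored_copy"] = str(dest / name)
            RECORDS.append(record)
            return result
        return wrapper
    return decorator


@capture()  # metadata only; pass artefact_dir=... to also keep the output
def to_upper(input_path, output_path):
    """Toy workflow step: write an upper-cased copy of a text file."""
    Path(output_path).write_text(Path(input_path).read_text().upper())
```

Capturing only metadata keeps records small but relies on the files remaining available elsewhere; capturing artefacts makes the record self-contained at the cost of storage.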
Communication between computer systems and the KG
- local cache and synchronization?
User interfaces for browsing, visualizing, and searching provenance information