Provenance of simulation and data analysis workflows

Version 4.2 by adavison on 2020/08/05 08:40

Introduction

Computational provenance is a record of all the steps in a computational scientific workflow, including the code that was run, the input data, the computational environment (hardware, operating system, compiler versions, library versions, etc.), and the output data.
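As an illustration of what such a record might contain, here is a minimal Python sketch using only the standard library. The function names (`sha256_of`, `build_record`) and the dictionary layout are hypothetical, chosen for this example rather than taken from any existing tool, and the environment section covers only a small part of what a full record would need.

```python
import hashlib
import platform
import sys
from datetime import datetime, timezone


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_record(script, inputs, outputs):
    """Assemble a provenance record as a plain dict (hypothetical schema)."""
    def describe(paths):
        # Identify each file by its path and a hash of its content.
        return [{"path": str(p), "sha256": sha256_of(p)} for p in paths]

    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "code": describe([script]),
        "inputs": describe(inputs),
        "outputs": describe(outputs),
        "environment": {
            "machine": platform.machine(),
            "os": platform.platform(),
            "python": sys.version.split()[0],
        },
    }
```

Hashing file contents, rather than recording paths alone, lets a later reader check that a file with the same name is really the one that was used.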

Capturing computational provenance facilitates:

- reproducibility of results
- management and tracking of workflows and projects by the scientists and engineers involved
- evaluation and review by other scientists and engineers

Standards

Information about the W3C PROV ontology and related tools
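The W3C PROV data model describes provenance in terms of entities, activities, and agents, linked by relations such as used, wasGeneratedBy, and wasAssociatedWith. The sketch below shows how one simulation run could be expressed in those terms. It builds a plain dictionary loosely modelled on the PROV-JSON layout; the `prov_document` helper, the `ex:` prefix, and all identifiers are assumptions made for this example, and the output has not been validated against any PROV tool.

```python
import json


def prov_document(activity, used, generated, agent):
    """Build a small PROV-JSON-style dict for one workflow step (illustrative)."""
    doc = {
        "prefix": {"ex": "http://example.org/"},
        "entity": {e: {} for e in list(used) + list(generated)},
        "activity": {activity: {}},
        "agent": {agent: {}},
        "used": {},
        "wasGeneratedBy": {},
        "wasAssociatedWith": {
            "_:assoc1": {"prov:activity": activity, "prov:agent": agent}
        },
    }
    # One "used" relation per input entity.
    for i, entity in enumerate(used, start=1):
        doc["used"][f"_:u{i}"] = {"prov:activity": activity, "prov:entity": entity}
    # One "wasGeneratedBy" relation per output entity.
    for i, entity in enumerate(generated, start=1):
        doc["wasGeneratedBy"][f"_:g{i}"] = {
            "prov:entity": entity,
            "prov:activity": activity,
        }
    return doc


# Hypothetical identifiers for a single simulation run.
doc = prov_document(
    activity="ex:simulation-run-1",
    used=["ex:model.py", "ex:parameters.json"],
    generated=["ex:spikes.h5"],
    agent="ex:researcher",
)
print(json.dumps(doc, indent=2))
```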

Storage of provenance in the Knowledge Graph

Tools for automated capture of provenance

- on different systems:
  - HPC systems
  - neuromorphic systems
  - Jupyter notebooks
  - users' own computers
- prospective/pre-emptive vs run-time provenance capture
- capture of metadata vs capture of artefacts
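To make the last two distinctions concrete, the following Python sketch shows run-time capture with a decorator, with an option to store either metadata only or a copy of the output artefact as well. It does not cover prospective capture. The `capture` decorator, the `(input_path, output_path)` calling convention, and the log entry layout are all hypothetical, not the interface of an existing tool.

```python
import functools
import hashlib
import shutil
import time
from pathlib import Path


def _describe(path):
    """Return the path and content hash of a file."""
    data = Path(path).read_bytes()
    return {"path": str(path), "sha256": hashlib.sha256(data).hexdigest()}


def capture(log, artefact_dir=None):
    """Record run-time provenance for a function taking
    (input_path, output_path). Hypothetical convention."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(input_path, output_path):
            started = time.time()
            result = func(input_path, output_path)
            entry = {
                "function": func.__name__,
                "started": started,
                "duration_s": time.time() - started,
                # Metadata capture: only paths and content hashes are stored.
                "input": _describe(input_path),
                "output": _describe(output_path),
            }
            if artefact_dir is not None:
                # Artefact capture: also keep a copy of the output file itself,
                # named after its content hash.
                store = Path(artefact_dir)
                store.mkdir(parents=True, exist_ok=True)
                digest = entry["output"]["sha256"]
                copy = store / (digest + Path(output_path).suffix)
                shutil.copyfile(output_path, copy)
                entry["output"]["artefact"] = str(copy)
            log.append(entry)
            return result
        return wrapper
    return decorator
```

Metadata-only capture is cheap and may be enough to detect that a file has changed, whereas artefact capture makes it possible to retrieve the exact output later, at the cost of storage.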

Communication between computer systems and the KG

local cache and synchronization?
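One possible shape for a local cache with later synchronization is sketched below in Python, assuming records are stored in a local SQLite database and uploaded when the remote service is reachable. The `ProvenanceCache` class is hypothetical, and the `send` callable stands in for whatever call would actually write to the KG, which is not specified here.

```python
import json
import sqlite3


class ProvenanceCache:
    """Local store of provenance records awaiting upload (illustrative)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS records ("
            "id INTEGER PRIMARY KEY, "
            "body TEXT NOT NULL, "
            "synced INTEGER NOT NULL DEFAULT 0)"
        )

    def add(self, record):
        """Store a record locally; no network access is needed."""
        with self.db:
            self.db.execute(
                "INSERT INTO records (body) VALUES (?)", (json.dumps(record),)
            )

    def pending(self):
        """Return (id, record) pairs that have not been uploaded yet."""
        rows = self.db.execute(
            "SELECT id, body FROM records WHERE synced = 0 ORDER BY id"
        )
        return [(row_id, json.loads(body)) for row_id, body in rows]

    def sync(self, send):
        """Try to upload each pending record; return the number uploaded."""
        uploaded = 0
        for row_id, record in self.pending():
            try:
                send(record)
            except OSError:
                # Remote unreachable: keep remaining records for a later attempt.
                break
            with self.db:
                self.db.execute(
                    "UPDATE records SET synced = 1 WHERE id = ?", (row_id,)
                )
            uploaded += 1
        return uploaded
```

With this design, a workflow running on a system without network access can keep calling `add`, and `sync` can be run later from a machine that can reach the remote service.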

User interfaces for browsing, visualizing, and searching provenance information
