SGA2 SP3 UC002 KR3.2 - Slow Wave Analysis Pipeline

Last modified by robing on 2022/03/25 09:55

Slow Wave Analysis Pipeline (SWAP)

Use Case SGA2-SP3-002 KR3.2: Integrating multi-scale data and the output of simulations in a reproducible and adaptable pipeline

Robin Gutzen1,4, Giulia De Bonis2, Elena Pastorelli2,3, Cristiano Capone2,

Chiara De Luca2,3, Michael Denker1, Sonja Grün1,4,

Pier Stanislao Paolucci2, Andrew Davison5

Experiments: Anna Letizia Allegra Mascaro6,7, Francesco Resta6, Francesco Saverio Pavone6, Maria-Victoria Sanchez-Vives8,9

1) Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure-Function Relationships (INM-10), Jülich Research Centre, Jülich, Germany

2) Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Roma, Rome, Italy

3) Ph.D. Program in Behavioural Neuroscience, “Sapienza” University of Rome, Rome, Italy

4) Theoretical Systems Neurobiology, RWTH Aachen University, Aachen, Germany

5) Unité de Neurosciences, Information et Complexité, Neuroinformatics Group, CNRS FRE 3693, Gif-sur-Yvette, France

6) European Laboratory for Non-linear Spectroscopy (LENS), University of Florence, Florence, Italy

7) Istituto di Neuroscienze, CNR, Pisa, Italy

8) Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain

9) Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

Flexible workflows to generate multi-scale analysis scenarios

This collab illustrates the use of the Neo and Elephant tools for data analysis in the context of SGA2-SP3-UC002 KR3.2, which investigates sleep, anesthesia, and the transition to wakefulness. See SGA2 Deliverable D3.2.1: Chapter 1 and Figure 2 for an overview of the scientific motivations and a description of the use case workflow; Chapter 2 for an introduction to KR3.2; Chapter 3 for a description of the mouse ECoG datasets; Chapter 5 for the Slow Wave Analysis Pipeline; and Chapter 6 for the mouse wide-field GECI data. For details on the datasets used in this collab, please see the References below.

See the introduction video

How the pipeline works

The pipeline is designed to interface a variety of general and specific analysis and processing steps in a flexible, modular manner. This modularity lets the pipeline adapt to diverse types of data (e.g., electrical ECoG or optical calcium imaging recordings) and to different analysis questions. It also makes the analyses a) more reproducible and b) comparable with each other, since they rely on the same stack of algorithms and any differences between analyses are fully transparent.
The individual processing and analysis steps (blocks; see the arrow-connected elements below) are organized in sequential stages (see the columns below). Along the stages, the analysis becomes more specific, but it is possible to branch off after any stage, since each stage yields useful intermediate results and is autonomous, so that it can be reused and recombined. Within each stage, there is a collection of blocks from which the user can select and arrange the analysis via a config file. Thus, the pipeline can be thought of as a curated database of methods, on which an analysis is constructed by drawing a path along the blocks and stages.

pipeline_flowchart.png
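
The idea of drawing a path along blocks and stages can be illustrated with a minimal, self-contained sketch. It is not the pipeline's actual implementation (which relies on snakemake and per-stage config files); the stage names, block names, and the build_path helper are hypothetical and chosen only for illustration.

```python
# Hypothetical catalogue: each stage offers a set of blocks to choose from.
# Stage and block names are made up for this illustration.
AVAILABLE_BLOCKS = {
    "stage_a": ["background_subtraction", "detrending", "normalization"],
    "stage_b": ["threshold", "hilbert_phase"],
}


def build_path(config):
    """Return the ordered (stage, block) steps selected by a config.

    config maps a stage name to the list of block names to run, in order.
    Stages are visited in catalogue order; branching off early is done by
    leaving the later stages out of the config.
    """
    path = []
    for stage, blocks in AVAILABLE_BLOCKS.items():
        if stage not in config:
            break  # this stage and all later ones are not executed
        for block in config[stage]:
            if block not in blocks:
                raise ValueError(f"unknown block {block!r} in {stage!r}")
            path.append((stage, block))
    return path


example_config = {
    "stage_a": ["detrending", "normalization"],
    "stage_b": ["threshold"],
}
selected = build_path(example_config)
```

Two analyses built this way differ only in their configs, which is what keeps the differences between them transparent.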

Executing the pipeline

There are two ways to get started and test the pipeline: i) online, using the collab drive and JupyterHub, or ii) locally, by downloading the code and data from GitHub and the collab storage and running the pipeline on your own machine.

i) In the collab

  • Copy the collab drive to your personal drive space

    • Open the Drive from the left menu
    • Select the folders pipeline and datasets,
      and the notebook run_snakemake_in_collab.ipynb
    • Select 'Copy', and then 'My Library' from the dropdown 'Other Libraries'
       
  • Start a Jupyter Hub instance 
    In another browser tab, open https://lab.ebrains.eu
     
  • Edit the config files
    Each stage has config files (pipeline/<stage_name>/configs/config_<profile>.yaml) to specify which analysis/processing blocks to execute and which parameters to use. General and specific information about the blocks and parameters can be found in the README and config files of each stage. There are preset configuration profiles for the benchmark datasets IDIBAPS (ECoG, anesthetized mouse) and LENS (Calcium Imaging, anesthetized mouse).
     
  • Run the notebook
    In JupyterHub, navigate to drive/My Libraries/My Library/run_snakemake_in_collab.ipynb, or to wherever you copied the file.
    Follow the notebook to install the required packages into your Python kernel, set the output path, and execute the pipeline with snakemake.
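
To see which configuration profiles exist before editing them, the naming pattern given above (configs/config_<profile>.yaml inside each stage folder) can be scanned with a few lines of Python. This is a sketch under assumptions: the function name list_profiles is hypothetical, and it assumes that every stage is a direct subfolder of the pipeline folder and contains a configs folder.

```python
from pathlib import Path


def list_profiles(pipeline_dir):
    """Map each stage folder name to the sorted profile names found in it."""
    profiles = {}
    for config_file in Path(pipeline_dir).glob("*/configs/config_*.yaml"):
        stage = config_file.parent.parent.name
        # e.g. "config_IDIBAPS.yaml" -> "IDIBAPS"
        profile = config_file.stem[len("config_"):]
        profiles.setdefault(stage, []).append(profile)
    return {stage: sorted(names) for stage, names in profiles.items()}
```

Calling list_profiles("pipeline") from the folder that contains the copied pipeline folder returns a dictionary with one entry per stage. Any template config files that match the pattern are listed as well.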

ii) Local execution

Tested only with macOS and Linux!

  • Get the code
    The source code of the pipeline is available on GitHub (INM-6/wavescalephant) and can be cloned to your machine (how to get started with GitHub).
     
  • Build the Python environment
    In the wavescalephant git repository, there is an environment file (pipeline/environment.yaml) specifying the required packages and versions. To build the environment, we recommend using conda (how to get started with conda).
    conda env create --file environment.yaml
    conda activate wavescalephant_env

    Make sure that neo and elephant are installed in their GitHub development versions; if necessary, add them manually to the environment.
    pip install git+https://github.com/NeuralEnsemble/elephant.git
    pip install git+https://github.com/NeuralEnsemble/python-neo.git

     

  • Edit the settings
    The settings file specifies the path to the output folder, where results are saved to. Open the template file pipeline/settings_template.py, set the output_path to the desired path, and save it as pipeline/settings.py.
     
  • Edit the config files
    Each stage uses a config file to specify which analysis/processing blocks to execute and which parameters to use. Edit the config template files pipeline/stageXX_<stage_name>/configs/config_template.yaml according to your dataset and analysis goal, and save them as pipeline/stageXX_<stage_name>/configs/config_<profile>.yaml. A detailed description of the available parameter settings and their meaning is commented in the template files, and a more general description of the working mechanism of each stage can be found in the respective README file pipeline/stageXX_<stage_name>/README.md.
  • Enter a dataset
    There are two test datasets in the collab drive (IDIBAPS and LENS) for which there are also corresponding config files and scripts in the data_entry stage. So, these datasets are ready to be used and analyzed.
    For adding new datasets see pipeline/stage01_data_entry/README.md
     
  • Run the pipeline (or its stages)
    To run the pipeline with snakemake, activate the Python environment (conda activate wavescalephant_env), make sure you are in the working directory (pipeline/), and call snakemake to run the entire pipeline.
    For a more detailed execution guide, and for how to execute individual stages and blocks, see the pipeline README.
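
The final step can also be scripted, for example to check that the settings file exists before snakemake is started. The sketch below is not part of the pipeline: the helper names are hypothetical, and passing --cores is an assumption about the installed snakemake version (recent versions require it). It must be run from an activated wavescalephant_env environment.

```python
import subprocess
from pathlib import Path


def snakemake_command(cores=1):
    """Build the command line for a full pipeline run."""
    return ["snakemake", "--cores", str(cores)]


def run_pipeline(pipeline_dir="pipeline", cores=1):
    """Run snakemake inside pipeline_dir after checking for settings.py."""
    pipeline_dir = Path(pipeline_dir)
    if not (pipeline_dir / "settings.py").is_file():
        raise FileNotFoundError(
            "settings.py not found; copy settings_template.py and set output_path"
        )
    # cwd makes snakemake start in the pipeline's working directory
    return subprocess.run(snakemake_command(cores), cwd=pipeline_dir, check=True)
```

With check=True, a failing snakemake run raises subprocess.CalledProcessError instead of passing silently.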

Accessing and using the results

All results are stored in the path specified in the settings.py file. The folder structure reflects the structuring of the pipeline into stages and blocks. All intermediate results are stored as .nix files using the Neo data format and can be loaded with neo.NixIO('/path/to/file.nix').read_block(). Additionally, most blocks produce a figure, and each stage a report file, to give an overview of the execution log, parameters, intermediate results, and to help with debugging. The final stage (stage05_wave_characterization) stores the results as pandas.DataFrames in .csv files, separately for each measure as well as in a combined dataframe for all measures.
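
As an illustration of working with the stored results, the following sketch collects the .nix and .csv files below the output folder and loads them on demand. The helper names are hypothetical, the recursive search makes no assumption about the exact folder names, and neo (with NIX support) and pandas are assumed to be installed, as in the pipeline environment.

```python
from pathlib import Path


def find_results(output_path):
    """Return sorted lists of all .nix and .csv files below output_path."""
    root = Path(output_path)
    return {
        "nix": sorted(root.rglob("*.nix")),
        "csv": sorted(root.rglob("*.csv")),
    }


def load_block(nix_file):
    """Load an intermediate result as a Neo Block."""
    import neo  # imported lazily so that find_results works without neo

    io = neo.NixIO(str(nix_file), mode="ro")  # open read-only
    try:
        return io.read_block()
    finally:
        io.close()


def load_measure(csv_file):
    """Load a wave characterization table as a pandas DataFrame."""
    import pandas as pd  # imported lazily, see above

    return pd.read_csv(csv_file)
```

For example, find_results(output_path)["csv"] lists the tables written by the final stage, each of which can be passed to load_measure.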

Examples of the output figures (for IDIBAPS dataset)

Outlook

  • Using the KnowledgeGraph API to insert data directly from the Knowledge Graph into the pipeline and also register and store the corresponding results as Analysis Objects. Such Analysis Objects are to incorporate Provenance Tracking, using fairgraph, to record the details of the processing and analysis steps.
  • Adding support for the pipeline to make use of HPC resources when running on the collab.
  • Further extending the available methods to address a wider variety of analysis objectives and support the processing of other datatypes. Additional documentation and guides should also make it easier for non-developers to contribute new method blocks.
  • Extending the application of the pipeline to the analysis of other types of activity waves and oscillations.
  • Integrating and co-developing new features of the underlying software tools Elephant, Neo, Nix, Snakemake.

References

Code developed at:

GitHub: INM-6/wavescalephant

License

Text is licensed under the Creative Commons CC-BY 4.0 license. LENS data is licensed under the Creative Commons CC-BY-NC-ND 4.0 license. IDIBAPS data is licensed under the Creative Commons CC-BY-NC-SA 4.0 license. Software code is licensed under GNU General Public License v3.0.


Acknowledgments

This open source software code was developed in part or in whole in the Human Brain Project, funded from the European Union’s Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 785907 (Human Brain Project SGA2).
