From BeyondPlanck to Cosmoglobe: Open Science, Reproducibility, and Data Longevity
The BeyondPlanck and Cosmoglobe collaborations have implemented the first integrated Bayesian end-to-end analysis pipeline for CMB experiments. The primary long-term motivation for this work is to develop a common analysis platform that supports efficient global joint analysis of complementary radio, microwave, and sub-millimeter experiments. A strict prerequisite for this to succeed is broad participation from the CMB community, and two foundational aspects of the program are therefore reproducibility and Open Science. In this paper, we discuss our efforts toward this aim, including measures that facilitate easy code and data distribution, community-based code documentation, and user-friendly compilation procedures. This work represents the first publicly released end-to-end CMB analysis pipeline that includes raw data, source code, parameter files, and documentation. We argue that such a complete pipeline release should be a requirement for all major future and publicly funded CMB experiments. A full public release significantly increases data longevity by ensuring that the data quality can be improved whenever better processing techniques, complementary datasets, or more computing power become available, and it thereby also increases taxpayers’ value for money. Providing only raw data and final products is not sufficient to guarantee full reproducibility in the future.