The submission for reproducibility evaluation consists of one or several artifacts. An article can be accompanied by several computational artifacts that extend beyond the submitted article itself and contain all the elements required to reproduce the experiments. Artifacts therefore comprise software, datasets, environment configurations, mechanized proofs, benchmarks, test suites with scripts, etc. A complete artifact package must contain (1) the computational artifacts and (2) instructions/documentation describing their contents and how to use them. Regarding the artifacts, in particular the code and the datasets, scripts should be provided to support compilation, deployment, and execution, and thereby the reproducibility of the experiments.
To start the Reproducibility Evaluation Process in TPDS, authors must create the required artifacts for their article. An artifact must be accessible via a persistent, publicly shareable DOI and made available under standard open licenses that maximize artifact availability. Authors must provide links to their artifact on a hosting platform that supports persistent DOIs and versioning (for example, CodeOcean, DataPort, Dryad, FigShare, Harvard Dataverse, or Zenodo). Authors should not provide links or zipped files hosted on personal webpages or shared collaboration platforms, such as Nextcloud, Google Drive, or Dropbox.
Zenodo and FigShare provide an integration with GitHub to automatically generate DOIs from Git tags. Therefore, it is possible to host code using the version control provided by GitHub and describe the artifact using Zenodo or FigShare. Please observe that Git itself (or any other version control software) does not generate a DOI; it needs to be paired with Zenodo or FigShare. CodeOcean, on the other hand, in addition to providing a DOI, generates a “compute capsule” (with version control) that includes the artifact and its description.
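The GitHub integration is driven by tagged releases; a minimal sketch of the tagging step is shown below (the repository, author identity, and version number are hypothetical examples, and a throwaway local repository is used purely for illustration):

```shell
# Minimal sketch: tag the commit that corresponds to the evaluated
# artifact so the GitHub integration of Zenodo/FigShare can archive it.
# (Repository, identity, and version below are hypothetical examples.)
set -e
cd "$(mktemp -d)" && git init -q .      # throwaway repo, illustration only
git config user.email author@example.org && git config user.name "Author"
git commit -q --allow-empty -m "artifact snapshot"
git tag -a v1.0.0 -m "Artifact for TPDS submission"
git tag --list
# In the real repository, pushing the tag and creating a GitHub release
# from it lets Zenodo archive a snapshot and mint a versioned DOI:
#   git push origin v1.0.0
```

Tagging the exact commit that was evaluated matters: the DOI then points to an immutable snapshot rather than a moving branch head.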
The artifact description must be included in a README file along with the artifact, and it must include the following aspects:
Artifact Identification: Including (i) the article’s title, (ii) the authors’ names and affiliations, and (iii) an abstract describing the main contributions of the article and the role of the artifact in these contributions. The abstract may include a software architecture or data models and their description to help the reader understand the artifact, as well as a clear description of the extent to which the artifact contributes to the reproducibility of the experiments in the article.
Artifact Dependencies and Requirements: Including (i) a description of the hardware resources required, (ii) a description of the operating systems required, (iii) the software libraries needed, (iv) the input dataset needed to execute the code, or a description of how the input data is generated, and (v) optionally, any other dependencies or requirements. As a best practice, unnecessary dependencies and requirements should be omitted from the artifact to facilitate the understanding of the descriptions.
Artifact Installation and Deployment Process: Including (i) a description of the process to install and compile the libraries and the code, and (ii) a description of the process to deploy the code on the resources. The description of these processes should include an estimation of the installation, compilation, and deployment times. When any of these times exceeds what is reasonable, authors should provide some way to alleviate the effort required of the potential recipients of the artifacts. For instance, capsules with pre-compiled code can be provided, or a simplified input dataset that reduces the overall experimental execution time. Conversely, best practices indicate that, whenever possible, the actual code of software dependencies (libraries) should not be included in the artifact; instead, scripts should be provided to download them from a repository and perform the installation.
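An install script along these lines might look as follows; this is only a sketch, and the repository URL, library name, and version are hypothetical examples (the actual fetch-and-build commands are left as comments because they depend on the dependency in question):

```shell
# Sketch of an install script that downloads dependencies instead of
# bundling their code in the artifact (repository URL, library name,
# and version are hypothetical examples).
set -e
DEPS_DIR="${DEPS_DIR:-$PWD/deps}"
mkdir -p "$DEPS_DIR"
# Fetch and build a hypothetical dependency from its public repository:
#   git clone --depth 1 --branch v2.3 https://github.com/example/libfoo "$DEPS_DIR/libfoo"
#   (cd "$DEPS_DIR/libfoo" && make && make install PREFIX="$DEPS_DIR")
echo "dependencies installed under $DEPS_DIR"
```

Pinning exact versions (tags or commit hashes) in such a script keeps the downloaded dependencies identical to the ones the authors used.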
Reproducibility of Experiments: Including (i) a complete description of the experiment workflow that the code can execute, (ii) an estimation of the time needed to execute the experiment workflow, (iii) a complete description of the expected results and an evaluation of them, and, most importantly, (iv) how the expected results from the experiment workflow relate to the results reported in the article. Best practices indicate that, to facilitate the understanding of the scope of the reproducibility, the expected results from the artifact should be in the same format as those in the article. For instance, when the results in the article are depicted in a figure, ideally the execution of the code should produce a (similar) figure (there are open-source tools, such as gnuplot, that can be used for this purpose). It is critical that authors devote their efforts to these aspects of the reproducibility of experiments to minimize the time needed for their understanding and verification.
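One lightweight way to make the expected results checkable is to ship them with the artifact and compare them against the produced output. The sketch below illustrates the idea; the file names and values are hypothetical, and the "produced" file is written in place here only so the example is self-contained:

```shell
# Sketch: compare the output produced by the experiment workflow with
# the expected results shipped in the artifact (names/values hypothetical).
set -e
mkdir -p expected results
printf '64 1.92\n128 3.71\n' > expected/speedup.dat  # shipped with the artifact
printf '64 1.92\n128 3.71\n' > results/speedup.dat   # produced by the experiment run
if diff -q expected/speedup.dat results/speedup.dat >/dev/null; then
  echo "results match the expected output"
else
  echo "results differ from the expected output" >&2
fi
```

For noisy performance measurements, an exact diff is usually too strict; a tolerance-based comparison on the relevant metrics is more appropriate, but the principle of an automated, scripted check remains the same.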
Other notes: Including other related aspects that can be important and were not addressed in the previous points.
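Laid out as a README skeleton, the five points above might look as follows (the section names come from this list; everything else is a placeholder):

```
Artifact for "<article title>"

1. Artifact Identification
   Authors' names and affiliations; abstract describing the article's
   contributions and the role of the artifact in them.

2. Artifact Dependencies and Requirements
   Hardware, operating systems, software libraries, input datasets.

3. Artifact Installation and Deployment Process
   Installation/compilation/deployment steps and their estimated times.

4. Reproducibility of Experiments
   Experiment workflow, estimated execution time, expected results,
   and how they map to the figures/tables in the article.

5. Other Notes
```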
In addition to quality descriptions of the artifacts and data and code repositories, adequate computational resources are necessary to reproduce the experiments. This can be particularly challenging, given the inherent complexities of parallel and distributed infrastructures. Not only do the provisioned resources need to meet the experiment requirements–which, in some cases, might be very specific and rare–but they also need to be configured and interconnected properly.
Instructions and scripts to define and configure the computational infrastructure should be provided to ease reproducibility. Ideally, authors should prepare their artifacts and scripts with a target infrastructure in mind. Although one can reproduce the experiments on institutional, owned premises, there are already some reproducibility initiatives that offer their computational resources for reproducing experiments. Given the current heterogeneity and complexity of hardware, promoting a set of well-known community infrastructures for executing experiments can simplify the reproducibility process.
CodeOcean is a reproducibility platform. It provides a containerized approach to run artifacts on demand within CodeOcean resources. Nevertheless, CodeOcean currently only supports the execution of a process within a single container, which means that, in general terms, parallel and distributed systems cannot be executed on it. This represents a major limitation for most of the experiments in TPDS articles.
An alternative platform more suitable for parallel and distributed experiments is Chameleon. Chameleon allows users to configure a distributed infrastructure and execute experiments on multiple bare metal or KVM virtualized machines that are interconnected through a communication network. It gives users full control of the software stack, including root privileges, kernel customization, and console access.
Furthermore, Chameleon provides usage metrics, such as the number of times a particular artifact was run. These reproducibility metrics can play a role similar to that of current impact metrics associated with articles, such as the number of citations.
HOW TO USE CHAMELEON FOR TPDS
Option 1) Chameleon’s GUI and ssh
Chameleon offers a GUI to search, book, and configure the computational resources–both machines and network. Once the machines are allocated and ready, they can be accessed via their IP addresses and the ssh protocol. Appropriate scripts can then be provided to deploy the required datasets and software and execute the experiments.
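Such a deployment step can be sketched as a small script. The node address, user, and file names below are hypothetical, and with DRY_RUN=1 (the default here) the commands are only printed rather than executed, so the sketch can be inspected without a real allocation:

```shell
# Sketch of deploying and running an artifact on an allocated Chameleon
# node over ssh. Address, user, and script names are hypothetical; with
# DRY_RUN=1 (the default here) commands are printed, not executed.
set -e
NODE="${NODE:-cc@192.0.2.10}"   # hypothetical allocated node
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }
run scp artifact.tar.gz "$NODE:"
run ssh "$NODE" "tar xzf artifact.tar.gz && ./install.sh && ./run_experiments.sh"
```

Setting DRY_RUN=0 and NODE to the IP address of the allocated machine would perform the actual copy and remote execution.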
Option 2) Deploy the artifacts at Chameleon’s Trovi
Alternatively, authors can use Chameleon’s Trovi. Trovi is a platform for sharing and reproducing research artifacts. It provides a REST API for use by various clients and stores the artifacts that authors upload. Then, artifact recipients can execute the artifacts directly in Chameleon.
REQUESTING ACCESS TO CHAMELEON
Access to Chameleon can easily be obtained through federated institutional login, Gmail accounts, ORCiD, or TAS, as well as by creating an account. Access to Chameleon resources must then be formally requested within the platform by applying for the PI eligibility role.