Preparing your samples
======================

..
   The dataset status table below relies on raw HTML/JavaScript to fetch and
   render live status from the GitLab API. That section will be empty or
   broken in non-HTML output formats (PDF, LaTeX, man pages) and requires
   internet access to display.

RAFT supports both user-provided samples and off-the-shelf immuno-oncology
datasets.

Making your files available to RAFT
-----------------------------------

RAFT starts each workflow using sample-associated data. These data could be
FASTQs, BAMs, VCFs, or RNA count data. Regardless of the type of input data,
they all need to be made discoverable by RAFT.

Users can make input files discoverable by RAFT by copying or symlinking them
into the RAFT ``inputs/`` directories, for example
``/path/to/raft/inputs/fastqs``. Grouping input files by dataset, such as
``/path/to/raft/inputs/fastqs/my_favorite_dataset/samp1_1.fq.gz``, is
recommended but not required.

.. note::

   Input data for demonstration workflows and off-the-shelf datasets are
   downloaded automatically.

The RAFT manifest
-----------------

A manifest describes the samples in a dataset and tells RAFT:

- how the samples should be named
- how the samples are related
- what input files are associated with each sample

For user-provided datasets, you supply this manifest directly. For supported
off-the-shelf datasets, RAFT can generate the run manifest automatically as
part of dataset preparation.

Using an off-the-shelf dataset
------------------------------

The table below lists the currently supported off-the-shelf immuno-oncology
datasets. Click a dataset name to view its publication, abstract, and README.

.. raw:: html

   Loading datasets from GitLab...

.. toctree::
   :hidden:

   dataset-detail

Specifying an off-the-shelf dataset with RAFT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To create a new project from an off-the-shelf dataset, pass ``--dataset`` and
the dataset's identifier to RAFT:

.. code-block:: console

   $ raft run \
       --project-id my-project \
       --workflow lens \
       --dataset \
       --version v1.9-dev

RAFT will clone the dataset-prep module, download the required FASTQs from
ENA/EBI, and stage them before execution. No user-provided manifest is
required; the run manifest is generated automatically.

Using your own samples
----------------------

Running your own samples through RAFT requires creating a manifest for your
dataset. RAFT provides a `web-based interface `_ to help generate that
manifest. This interface becomes available when you execute ``raft run`` on a
new project for the first time.

.. figure:: _static/manifest-generator.png
   :alt: RAFT manifest generator interface

   The RAFT manifest generator interface.

If you prefer to create the manifest separately and then provide it with
``raft run ... --manifest ``, you can use the hosted manifest generator
`here `_.

More information about the manifest format is available in :doc:`manifest`.

Specifying your own samples with RAFT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you already have a manifest, create a new project by passing
``--manifest`` and the manifest filename.

.. note::

   Manifest TSVs must be in your RAFT installation's ``inputs/metadata``
   directory for RAFT to discover them.

.. code-block:: console

   $ raft run \
       --project-id my-project \
       --workflow lens \
       --version v1.9-dev \
       --manifest

Alternatively, to create the manifest as part of running RAFT, omit
``--manifest``:

.. code-block:: console

   $ raft run \
       --project-id my-project \
       --workflow lens \
       --version v1.9-dev
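
The staging conventions described on this page (symlinking input files under
``inputs/`` and placing manifest TSVs in ``inputs/metadata``) can be scripted
before calling ``raft run``. The following is a minimal sketch and is not part
of RAFT itself. The ``RAFT_DIR`` and ``DATA_DIR`` variables, the dataset name,
and the ``my_manifest.tsv`` filename are hypothetical placeholders, and the
temporary directories stand in for a real RAFT installation and data
directory.

.. code-block:: bash

   #!/usr/bin/env bash
   set -euo pipefail

   # Hypothetical locations: temporary directories stand in for a real RAFT
   # installation and a real data directory so the sketch runs anywhere.
   RAFT_DIR="${RAFT_DIR:-$(mktemp -d)}"
   DATA_DIR="${DATA_DIR:-$(mktemp -d)}"
   DATASET="my_favorite_dataset"

   # Create placeholder inputs (replace with your real FASTQs and manifest).
   touch "$DATA_DIR/samp1_1.fq.gz" "$DATA_DIR/samp1_2.fq.gz"
   printf 'placeholder manifest\n' > "$DATA_DIR/my_manifest.tsv"

   # Group FASTQs by dataset under inputs/fastqs; symlink rather than copy.
   mkdir -p "$RAFT_DIR/inputs/fastqs/$DATASET" "$RAFT_DIR/inputs/metadata"
   for fq in "$DATA_DIR"/*.fq.gz; do
       ln -sf "$fq" "$RAFT_DIR/inputs/fastqs/$DATASET/"
   done

   # Manifest TSVs go in inputs/metadata so RAFT can discover them.
   cp "$DATA_DIR/my_manifest.tsv" "$RAFT_DIR/inputs/metadata/"

   # Show the staged layout.
   ls -l "$RAFT_DIR/inputs/fastqs/$DATASET" "$RAFT_DIR/inputs/metadata"

Symlinking avoids duplicating large FASTQ files, while the manifest is copied
so that later edits to the original do not silently change a staged run.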