Manifest Specifications
=======================

Purpose
-------

A RAFT manifest describes the samples a workflow should process and the
relationships among datasets, patients, runs, and input file prefixes.

For most users, a manifest is a tab-delimited table with one row per run. A
typical paired tumor-normal example looks like:

.. code-block:: text

   Patient_Name  Run_Name  Dataset  File_Prefix      Sequencing_Method  Normal
   Pt01          ad-Pt01   MyStudy  Pt01_tumor_dna   WES                FALSE
   Pt01          nd-Pt01   MyStudy  Pt01_normal_dna  WES                TRUE

The ``Run_Name`` prefix (``ad-``, ``nd-``, ``ar-``, ``nr-``) tells RAFT the
sample type. See `Run-name prefixes`_ below for the full list.

RAFT validates manifests before execution with:

.. code-block:: console

   $ raft check-manifest --manifest my_manifest.tsv

Making a manifest
-----------------

When you launch ``raft run`` on a new project without ``--manifest``, RAFT can
open the manifest generator UI and let you build a manifest interactively.

If you want to prepare a manifest ahead of time, you can use the hosted
browser-based manifest generator. That interface can export either:

- a plain-text TSV file
- an encoded string that can be pasted back into RAFT when prompted

For command-line work, ``raft manifest-template`` prints a starter TSV:

.. code-block:: console

   $ raft manifest-template

Write it directly to a file with:

.. code-block:: console

   $ raft manifest-template --output my_manifest.tsv

Use ``--no-examples`` when you only want the header row:

.. code-block:: console

   $ raft manifest-template --no-examples --output my_manifest.tsv

Workflow owners can also provide a ``manifest-ui.json`` file that defines the
columns, allowed values, and sample-type prefixes for their workflow. Use that
file when you want the CLI template to match a workflow-specific manifest UI:

.. code-block:: console

   $ raft manifest-template \
       --manifest-ui-template path/to/workflow.manifest-ui.json \
       --output my_manifest.tsv

If you create a manifest outside RAFT, pass it to ``raft run`` with
``--manifest``.

Technical details
-----------------

This section documents the current RAFT manifest format.

Required columns
~~~~~~~~~~~~~~~~

By default, RAFT requires these tab-delimited column names:

.. list-table:: Required RAFT columns
   :widths: 25 50 25
   :header-rows: 1

   * - Column
     - Description
     - Allowed values
   * - ``Patient_Name``
     - Patient identifier
     - Free text
   * - ``Dataset``
     - Dataset or cohort identifier
     - Free text
   * - ``Run_Name``
     - Sample or run identifier
     - Free text with a RAFT sample prefix
   * - ``Sequencing_Method``
     - Sequencing protocol
     - ``RNA-Seq``, ``RNA``, ``WES``, ``WXS``, ``WGS``
   * - ``Normal``
     - Whether the sample is normal
     - ``TRUE`` or ``FALSE``
   * - ``File_Prefix``
     - Base name or path prefix for input data
     - Free text

The header names are underscore-delimited. Use the exact names shown above.

For workflow-specific manifests, ``raft check-manifest`` can use a
``manifest-ui.json`` file as the source for required columns, allowed values,
and sample-type prefix rules. When the manifest lives inside a RAFT project,
``check-manifest`` auto-detects a single workflow ``*.manifest-ui.json`` file
under that project. You can also provide one explicitly:

.. code-block:: console

   $ raft check-manifest \
       --manifest my_manifest.tsv \
       --manifest-ui-template path/to/workflow.manifest-ui.json

The manifest UI rules cover table structure and per-row values.
``raft check-manifest`` still performs CLI-only cross-row checks, including
duplicate sample types per patient/group and HLA consistency.
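
To make the default per-row rules concrete, the following is a minimal Python
sketch of a required-column and allowed-value check. It is not RAFT's
implementation: the ``check_rows`` helper and the message wording are
hypothetical, and only the column names and allowed values are taken from the
table above. It does not cover prefix rules or cross-row checks.

.. code-block:: python

   import csv
   import io

   # Column names and allowed values copied from the required-columns table.
   REQUIRED_COLUMNS = [
       "Patient_Name", "Dataset", "Run_Name",
       "Sequencing_Method", "Normal", "File_Prefix",
   ]
   ALLOWED_VALUES = {
       "Sequencing_Method": {"RNA-Seq", "RNA", "WES", "WXS", "WGS"},
       "Normal": {"TRUE", "FALSE"},
   }


   def check_rows(tsv_text):
       """Return a list of problem descriptions for a tab-delimited manifest.

       Hypothetical helper for illustration; not part of RAFT.
       """
       reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
       header = reader.fieldnames or []
       missing = [c for c in REQUIRED_COLUMNS if c not in header]
       if missing:
           return ["missing required columns: " + ", ".join(missing)]

       problems = []
       # Data rows start on line 2 of the file, after the header.
       for line_no, row in enumerate(reader, start=2):
           for column, allowed in ALLOWED_VALUES.items():
               if row[column] not in allowed:
                   problems.append(
                       f"line {line_no}: {column}={row[column]!r} is not allowed"
                   )
       return problems


   example = (
       "Patient_Name\tRun_Name\tDataset\tFile_Prefix\tSequencing_Method\tNormal\n"
       "Pt01\tad-Pt01\tMyStudy\tPt01_tumor_dna\tWES\tFALSE\n"
       "Pt01\tnd-Pt01\tMyStudy\tPt01_normal_dna\tExome\tTRUE\n"
   )
   for problem in check_rows(example):
       print(problem)

Run on the example text, the sketch reports a single problem: the second data
row uses ``Exome``, which is not among the allowed ``Sequencing_Method``
values. For real manifests, ``raft check-manifest`` remains the authoritative
check.
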
Optional columns
~~~~~~~~~~~~~~~~

RAFT also recognizes several optional columns used by some workflows and UI
paths:

- ``Group``: groups related samples or timepoints for the same patient
- ``Alleles``: stores HLA alleles for workflows and viewers that use class-I
  HLA information

``Group`` may contain one group name or multiple group names separated with
hyphens. When ``Group`` is absent, RAFT treats all rows as belonging to one
default group for validation.

Run-name prefixes
~~~~~~~~~~~~~~~~~

RAFT uses the prefix of ``Run_Name`` to infer sample type:

- ``ad-``: abnormal DNA
- ``ar-``: abnormal RNA
- ``nd-``: normal DNA
- ``nr-``: normal RNA

RAFT checks that these prefixes agree with the ``Normal`` column. The current
prefix rules are:

- ``ad-`` and ``ar-`` rows must have ``Normal`` set to ``FALSE``
- ``nd-`` and ``nr-`` rows must have ``Normal`` set to ``TRUE``
- any other prefix is rejected by ``raft check-manifest``

HLA alleles
~~~~~~~~~~~

When present, the ``Alleles`` column must contain six comma-separated class-I
HLA alleles:

.. code-block:: text

   HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01

RAFT normalizes allele values by trimming whitespace, uppercasing, and removing
``*`` characters. Each allele must match ``HLA-A``, ``HLA-B``, or ``HLA-C``
with two field resolution, such as ``HLA-A01:01``. Each row must include
exactly two ``HLA-A``, two ``HLA-B``, and two ``HLA-C`` alleles.

Per-patient consistency rules:

- If alleles are provided for ordinary multi-sample rows, all rows for that
  patient must provide the same allele set.
- If alleles are provided only on ``ar-`` rows, that is allowed.
- If a patient has multiple ``ar-`` rows and uses the ``ar-``-only exception,
  every ``ar-`` row must include alleles.
- Multiple ``ar-`` rows with alleles must use the same allele set.

Example manifest
~~~~~~~~~~~~~~~~

.. code-block:: text

   Patient_Name  Run_Name      Dataset  File_Prefix    Sequencing_Method  Normal  Group  Alleles
   Pt01          ad-Pt01-pre   AML      Pt01_pre_dna   WES                FALSE   pre    HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
   Pt01          nd-Pt01-pre   AML      Pt01_norm_dna  WES                TRUE    pre    HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
   Pt01          ar-Pt01-pre   AML      Pt01_pre_rna   RNA-Seq            FALSE   pre    HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
   Pt01          ar-Pt01-post  AML      Pt01_post_rna  RNA-Seq            FALSE   post   HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01

Validation behavior
~~~~~~~~~~~~~~~~~~~

``raft check-manifest`` currently checks for:

- missing required columns
- tab-delimited formatting
- allowed values from ``manifest-ui.json`` for columns such as
  ``Sequencing_Method`` and ``Normal``
- valid run-name prefixes from ``manifest-ui.json`` sample types
- consistency between sample-type prefixes and ``Normal`` when sample types
  define a ``normal`` value
- duplicate sample types within a patient, or within a patient and group
- optional HLA allele formatting and consistency checks when ``Alleles`` is
  present

Rows with a blank ``File_Prefix`` are excluded from patient and group
sample-count checks and are reported separately.

Without a workflow-specific ``manifest-ui.json``, ``raft manifest-template``
includes required columns plus the optional ``Group`` and ``Alleles`` columns
because those optional columns are useful for LENS and other immunology
workflows. Delete optional columns only if the target workflow does not use
them.

Validation accepts an absolute path, a path relative to the current RAFT
workspace, or a file/path under the configured RAFT metadata directory. For
example, all of these are valid when the files exist:

.. code-block:: console

   $ raft check-manifest --manifest /path/to/raft/projects/my-project/inputs/metadata/my_manifest.tsv
   $ raft check-manifest --manifest projects/my-project/inputs/metadata/my_manifest.tsv
   $ raft check-manifest --manifest my_manifest.tsv

The final form searches the configured metadata directory recursively. If
multiple files with the requested name are found, use a more specific path.

Using manifests with workflows
------------------------------

A manifest can be supplied directly to ``raft run``:

.. code-block:: console

   $ raft run \
       --project-id my-project \
       --workflow lens \
       --version v1.9-dev \
       --manifest my_manifest.tsv

If you omit ``--manifest`` when running ``raft run`` on a new project, RAFT can
launch the manifest generator UI instead.

Next steps
----------

- :doc:`ui` for the manifest generator and other browser-based tools
- :doc:`workflows` for workflow selection
- :doc:`outputs` for the structure of workflow outputs
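
As a supplement to the HLA alleles section, the normalization and per-row
counting rules described there can be sketched in Python. This is an
illustration only, not RAFT's code: ``normalize_alleles``, ``allele_problems``,
and the regular expression are hypothetical, and the pattern assumes each of
the two fields is a run of digits, which the text does not spell out.

.. code-block:: python

   import re
   from collections import Counter

   # Two-field class-I allele after normalization, for example HLA-A01:01.
   # Assumption: each field is one or more digits.
   ALLELE_PATTERN = re.compile(r"^HLA-([ABC])\d+:\d+$")


   def normalize_alleles(cell):
       """Split an Alleles cell, then trim, uppercase, and drop '*' characters."""
       return [a.strip().upper().replace("*", "") for a in cell.split(",")]


   def allele_problems(cell):
       """Return a list of problems for one Alleles cell (empty if none)."""
       alleles = normalize_alleles(cell)
       problems = []
       genes = Counter()
       for allele in alleles:
           match = ALLELE_PATTERN.match(allele)
           if match is None:
               problems.append(f"unrecognized allele: {allele!r}")
           else:
               genes[match.group(1)] += 1
       if len(alleles) != 6:
           problems.append(f"expected 6 alleles, found {len(alleles)}")
       for gene in "ABC":
           if genes[gene] != 2:
               problems.append(
                   f"expected 2 HLA-{gene} alleles, found {genes[gene]}"
               )
       return problems


   good = "hla-a*01:01, HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01"
   print(normalize_alleles(good)[0])
   print(allele_problems(good))
   print(allele_problems("HLA-A01:01,HLA-A02:01"))

The sketch covers a single cell. The per-patient consistency rules would need
an additional pass that compares normalized allele sets across rows.
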