Manifest Specifications
Purpose
A RAFT manifest describes the samples a workflow should process and the relationships among datasets, patients, runs, and input file prefixes.
For most users, a manifest is a tab-delimited table with one row per run. A typical paired tumor-normal example looks like:
Patient_Name Run_Name Dataset File_Prefix Sequencing_Method Normal
Pt01 ad-Pt01 MyStudy Pt01_tumor_dna WES FALSE
Pt01 nd-Pt01 MyStudy Pt01_normal_dna WES TRUE
The Run_Name prefix (ad-, nd-, ar-, nr-) tells RAFT the
sample type. See Run-name prefixes below for the full list.
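If you generate manifests from a script rather than by hand, the paired tumor-normal example above can be produced with Python's standard csv module. This is a minimal sketch, not part of RAFT: the helper name manifest_to_tsv and the ROWS list are illustrative assumptions, and only the column names and values come from the example.

```python
import csv
import io

# Column order taken from the example manifest above.
COLUMNS = ["Patient_Name", "Run_Name", "Dataset", "File_Prefix",
           "Sequencing_Method", "Normal"]

# One dict per run; values copied from the paired tumor-normal example.
ROWS = [
    {"Patient_Name": "Pt01", "Run_Name": "ad-Pt01", "Dataset": "MyStudy",
     "File_Prefix": "Pt01_tumor_dna", "Sequencing_Method": "WES",
     "Normal": "FALSE"},
    {"Patient_Name": "Pt01", "Run_Name": "nd-Pt01", "Dataset": "MyStudy",
     "File_Prefix": "Pt01_normal_dna", "Sequencing_Method": "WES",
     "Normal": "TRUE"},
]


def manifest_to_tsv(rows, columns=COLUMNS):
    """Render rows as tab-delimited text with a single header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the manifest; redirect to a file such as my_manifest.tsv if needed.
    print(manifest_to_tsv(ROWS), end="")
```

Writing through csv.DictWriter with an explicit tab delimiter avoids the most common hand-editing mistake, which is separating fields with spaces instead of tabs.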
RAFT validates manifests before execution with:
$ raft check-manifest --manifest my_manifest.tsv
Making a manifest
When you launch raft run on a new project without --manifest, RAFT can
open the manifest generator UI and let you build a manifest interactively.
If you want to prepare a manifest ahead of time, you can use the hosted browser-based manifest generator. That interface can export either:
a plain-text TSV file
an encoded string that can be pasted back into RAFT when prompted
For command-line work, raft manifest-template prints a starter TSV:
$ raft manifest-template
Write it directly to a file with:
$ raft manifest-template --output my_manifest.tsv
Use --no-examples when you only want the header row:
$ raft manifest-template --no-examples --output my_manifest.tsv
Workflow owners can also provide a manifest-ui.json file that defines the
columns, allowed values, and sample-type prefixes for their workflow. Use that
file when you want the CLI template to match a workflow-specific manifest UI:
$ raft manifest-template \
--manifest-ui-template path/to/workflow.manifest-ui.json \
--output my_manifest.tsv
If you create a manifest outside RAFT, pass it to raft run with
--manifest.
Technical details
This section documents the current RAFT manifest format.
Required columns
By default, RAFT requires these tab-delimited column names:
| Column | Description | Allowed values |
|---|---|---|
| Patient_Name | Patient identifier | Free text |
| Dataset | Dataset or cohort identifier | Free text |
| Run_Name | Sample or run identifier | Free text with a RAFT sample prefix |
| Sequencing_Method | Sequencing protocol | Values such as WES or RNA-Seq |
| Normal | Whether the sample is normal | TRUE or FALSE |
| File_Prefix | Base name or path prefix for input data | Free text |
The header names are underscore-delimited. Use the exact names shown above.
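A quick pre-flight check for the default column names can be sketched in a few lines of Python. This is an illustration only and does not reproduce how raft check-manifest is implemented; the function name missing_required_columns is hypothetical, and the column list is taken from the example manifests in this document.

```python
import csv
import io

# Default required columns, as used in the example manifests.
REQUIRED_COLUMNS = ("Patient_Name", "Dataset", "Run_Name",
                    "Sequencing_Method", "Normal", "File_Prefix")


def missing_required_columns(tsv_text, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the header line."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])  # empty input yields an empty header
    present = set(header)
    return [name for name in required if name not in present]
```

Because the header is split on tabs only, a header typed with spaces is read as one long column name and every required column is reported as missing, which is usually the first sign that the file is not tab-delimited.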
For workflow-specific manifests, raft check-manifest can use a
manifest-ui.json file as the source for required columns, allowed values,
and sample-type prefix rules. When the manifest lives inside a RAFT project,
check-manifest auto-detects a single workflow *.manifest-ui.json file
under that project. You can also provide one explicitly:
$ raft check-manifest \
--manifest my_manifest.tsv \
--manifest-ui-template path/to/workflow.manifest-ui.json
The manifest UI rules cover table structure and per-row values. raft
check-manifest still performs CLI-only cross-row checks, including duplicate
sample types per patient/group and HLA consistency.
Optional columns
RAFT also recognizes several optional columns used by some workflows and UI paths:
Group: groups related samples or timepoints for the same patient
Alleles: stores HLA alleles for workflows and viewers that use class-I HLA information
Group may contain one group name or multiple group names separated with
hyphens. When Group is absent, RAFT treats all rows as belonging to one
default group for validation.
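The Group rules above can be sketched as a small parser. This is an assumption-laden illustration rather than RAFT's own code: the function name row_groups and the placeholder name "default" are hypothetical, and a blank Group value is treated here the same way as a missing Group column, which the text above does not specify.

```python
def row_groups(row, default_group="default"):
    """Return the list of group names a manifest row belongs to.

    `row` is a dict keyed by column name. A missing or blank Group
    falls back to a single placeholder group (an assumption).
    """
    raw = (row.get("Group") or "").strip()
    if not raw:
        return [default_group]
    # Multiple group names are separated with hyphens.
    return [name for name in raw.split("-") if name]
```

For example, a row with Group set to pre-post would be treated as belonging to both the pre and post groups. One consequence of the hyphen separator is that an individual group name cannot itself contain a hyphen.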
Run-name prefixes
RAFT uses the prefix of Run_Name to infer sample type:
ad-: abnormal DNA
ar-: abnormal RNA
nd-: normal DNA
nr-: normal RNA
RAFT checks that these prefixes agree with the Normal column.
The current prefix rules are:
ad- and ar- rows must have Normal set to FALSE
nd- and nr- rows must have Normal set to TRUE
any other prefix is rejected by raft check-manifest
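The prefix rules can be expressed as a lookup table. The sketch below is illustrative and is not RAFT's implementation; the names PREFIX_TO_NORMAL and check_run_name are hypothetical, and only the four prefixes and their required Normal values come from the rules above.

```python
# Required Normal value for each recognized Run_Name prefix.
PREFIX_TO_NORMAL = {
    "ad-": "FALSE",  # abnormal DNA
    "ar-": "FALSE",  # abnormal RNA
    "nd-": "TRUE",   # normal DNA
    "nr-": "TRUE",   # normal RNA
}


def check_run_name(run_name, normal):
    """Return an error message, or None when prefix and Normal agree."""
    prefix = run_name[:3]
    expected = PREFIX_TO_NORMAL.get(prefix)
    if expected is None:
        return f"unrecognized Run_Name prefix in {run_name!r}"
    if normal != expected:
        return f"{run_name!r} requires Normal={expected}, found {normal!r}"
    return None
```

Calling check_run_name("nd-Pt01", "FALSE") returns a message because nd- rows must have Normal set to TRUE, while check_run_name("ad-Pt01", "FALSE") returns None.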
HLA alleles
When present, the Alleles column must contain six comma-separated class-I
HLA alleles:
HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
RAFT normalizes allele values by trimming whitespace, uppercasing, and removing
* characters. Each allele must match HLA-A, HLA-B, or HLA-C
with two-field resolution, such as HLA-A01:01. Each row must include
exactly two HLA-A, two HLA-B, and two HLA-C alleles.
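The normalization and per-row checks described above can be sketched as follows. This is a hedged illustration, not RAFT's code: the names normalize_alleles, allele_errors, and ALLELE_PATTERN are hypothetical, and the regular expression assumes that each of the two fields is one or more digits, which the text does not state explicitly.

```python
import re
from collections import Counter

# Assumed shape of a normalized allele: gene A, B, or C plus two
# colon-separated numeric fields, for example HLA-A01:01.
ALLELE_PATTERN = re.compile(r"HLA-([ABC])\d+:\d+")


def normalize_alleles(cell):
    """Trim whitespace, uppercase, and drop '*' from each allele."""
    return [part.strip().upper().replace("*", "") for part in cell.split(",")]


def allele_errors(cell):
    """Return a list of problems found in one Alleles cell."""
    alleles = normalize_alleles(cell)
    errors = []
    if len(alleles) != 6:
        errors.append(f"expected 6 alleles, found {len(alleles)}")
    genes = Counter()
    for allele in alleles:
        match = ALLELE_PATTERN.fullmatch(allele)
        if match is None:
            errors.append(f"malformed allele: {allele!r}")
        else:
            genes[match.group(1)] += 1
    # Each row needs exactly two alleles per class-I gene.
    for gene in "ABC":
        if genes[gene] != 2:
            errors.append(f"expected 2 HLA-{gene} alleles, found {genes[gene]}")
    return errors
```

With this sketch, an input such as " hla-a*01:01" normalizes to HLA-A01:01 before matching, so differences in case, spacing, or the presence of * do not cause a failure on their own.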
Per-patient consistency rules:
If alleles are provided for ordinary multi-sample rows, all rows for that patient must provide the same allele set.
If alleles are provided only on ar- rows, that is allowed.
If a patient has multiple ar- rows and uses the ar-only exception, every ar- row must include alleles.
Multiple ar- rows with alleles must use the same allele set.
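One possible reading of the per-patient rules is sketched below. It is an interpretation under stated assumptions, not RAFT's implementation: the function name patient_allele_errors is hypothetical, allele sets are compared after the normalization described earlier, and the ar-only exception is taken to apply whenever every row that carries alleles for a patient is an ar- row.

```python
from collections import defaultdict


def _alleles(row):
    """Normalized, order-independent allele tuple; empty when blank."""
    cell = (row.get("Alleles") or "").strip()
    parts = [p.strip().upper().replace("*", "") for p in cell.split(",")]
    return tuple(sorted(p for p in parts if p))


def patient_allele_errors(rows):
    """Check allele consistency across all rows of each patient."""
    by_patient = defaultdict(list)
    for row in rows:
        by_patient[row["Patient_Name"]].append(row)

    errors = []
    for patient, patient_rows in by_patient.items():
        with_alleles = [r for r in patient_rows if _alleles(r)]
        if not with_alleles:
            continue  # Alleles are optional
        # Assumed trigger for the ar-only exception.
        ar_only = all(r["Run_Name"].startswith("ar-") for r in with_alleles)
        if ar_only:
            scope = [r for r in patient_rows if r["Run_Name"].startswith("ar-")]
        else:
            scope = patient_rows
        missing = [r["Run_Name"] for r in scope if not _alleles(r)]
        if missing:
            errors.append(f"{patient}: rows missing Alleles: {', '.join(missing)}")
        distinct = {_alleles(r) for r in scope if _alleles(r)}
        if len(distinct) > 1:
            errors.append(f"{patient}: rows disagree on the allele set")
    return errors
```

Under this reading, a patient whose only allele-bearing row is a single ar- row passes, while a patient with alleles on an ad- row but not on the matching nd- row is reported.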
Example manifest
Patient_Name Run_Name Dataset File_Prefix Sequencing_Method Normal Group Alleles
Pt01 ad-Pt01-pre AML Pt01_pre_dna WES FALSE pre HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
Pt01 nd-Pt01-pre AML Pt01_norm_dna WES TRUE pre HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
Pt01 ar-Pt01-pre AML Pt01_pre_rna RNA-Seq FALSE pre HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
Pt01 ar-Pt01-post AML Pt01_post_rna RNA-Seq FALSE post HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
Validation behavior
raft check-manifest currently checks for:
missing required columns
tab-delimited formatting
allowed values from manifest-ui.json for columns such as Sequencing_Method and Normal
valid run-name prefixes from manifest-ui.json sample types
consistency between sample-type prefixes and Normal when sample types define a normal value
duplicate sample types within a patient, or within a patient and group
optional HLA allele formatting and consistency checks when Alleles is present
Rows with a blank File_Prefix are excluded from patient and group
sample-count checks and are reported separately.
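The duplicate sample-type check and the handling of blank File_Prefix rows can be sketched together. This is illustrative only: the function name duplicate_sample_types and the placeholder group name "default" are hypothetical, the sample type is approximated by the three-character Run_Name prefix, and a row listing several hyphen-separated groups is counted once in each group, which is an assumption.

```python
from collections import Counter


def duplicate_sample_types(rows, default_group="default"):
    """Return (duplicates, skipped).

    duplicates: sorted (patient, group, prefix) keys seen more than once.
    skipped: Run_Name values of rows with a blank File_Prefix.
    """
    counts = Counter()
    skipped = []
    for row in rows:
        # Rows without input data are reported separately, not counted.
        if not (row.get("File_Prefix") or "").strip():
            skipped.append(row["Run_Name"])
            continue
        sample_type = row["Run_Name"][:3]
        raw_group = (row.get("Group") or "").strip()
        groups = [g for g in raw_group.split("-") if g] or [default_group]
        for group in groups:
            counts[(row["Patient_Name"], group, sample_type)] += 1
    duplicates = sorted(key for key, seen in counts.items() if seen > 1)
    return duplicates, skipped
```

Applied to a manifest with one ad-, one nd-, and one ar- row in group pre plus one ar- row in group post, this sketch reports no duplicates, because the two ar- rows fall in different groups.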
Without a workflow-specific manifest-ui.json, raft manifest-template
includes required columns plus the optional Group and Alleles columns
because those optional columns are useful for LENS and other immunology
workflows. Delete optional columns only if the target workflow does not use
them.
Validation accepts an absolute path, a path relative to the current RAFT workspace, or a file name or partial path under the configured RAFT metadata directory. For example, all of these are valid when the files exist:
$ raft check-manifest --manifest /path/to/raft/projects/my-project/inputs/metadata/my_manifest.tsv
$ raft check-manifest --manifest projects/my-project/inputs/metadata/my_manifest.tsv
$ raft check-manifest --manifest my_manifest.tsv
The final form searches the configured metadata directory recursively. If multiple files with the requested name are found, use a more specific path.
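The three accepted path forms can be illustrated with a small resolver built on pathlib. This is a sketch under assumptions, not RAFT's lookup code: the function name resolve_manifest, its workspace and metadata_dir parameters, and the order in which the forms are tried are all hypothetical.

```python
from pathlib import Path


def resolve_manifest(requested, workspace, metadata_dir):
    """Resolve a manifest argument to exactly one existing file."""
    path = Path(requested)

    # Form 1: absolute path, used as given.
    if path.is_absolute():
        if path.is_file():
            return path
        raise FileNotFoundError(requested)

    # Form 2: path relative to the workspace root.
    in_workspace = Path(workspace) / path
    if in_workspace.is_file():
        return in_workspace

    # Form 3: recursive search of the metadata directory, keeping files
    # whose trailing path components match the requested ones.
    wanted = path.parts
    matches = sorted(
        found for found in Path(metadata_dir).rglob(path.name)
        if found.is_file() and found.parts[-len(wanted):] == wanted
    )
    if not matches:
        raise FileNotFoundError(requested)
    if len(matches) > 1:
        raise ValueError(f"{requested!r} is ambiguous; use a more specific path")
    return matches[0]
```

Matching on trailing path components is what lets a more specific request, such as a parent directory plus the file name, disambiguate between two manifests that share a name.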
Using manifests with workflows
A manifest can be supplied directly to raft run:
$ raft run \
--project-id my-project \
--workflow lens \
--version v1.9-dev \
--manifest my_manifest.tsv
If you omit --manifest when running raft run on a new project, RAFT can
launch the manifest generator UI instead.
Next steps
Browser-based UI tools: the manifest generator and other browser-based tools
Choosing a workflow: workflow selection
Outputs: the structure of workflow outputs