Manifest Specifications

Purpose

A RAFT manifest describes the samples a workflow should process and the relationships among datasets, patients, runs, and input file prefixes.

For most users, a manifest is a tab-delimited table with one row per run. A typical paired tumor-normal example looks like:

Patient_Name  Run_Name       Dataset   File_Prefix       Sequencing_Method  Normal
Pt01          ad-Pt01        MyStudy   Pt01_tumor_dna    WES                FALSE
Pt01          nd-Pt01        MyStudy   Pt01_normal_dna   WES                TRUE

The Run_Name prefix (ad-, nd-, ar-, nr-) tells RAFT the sample type. See Run-name prefixes below for the full list.
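For scripts that need to inspect a manifest outside RAFT, the table can be read with any tab-aware parser. The following is a minimal sketch using Python's standard csv module. The helper name read_manifest is hypothetical and not part of RAFT; the sketch assumes only a header row and tab delimiters, as described above.

```python
import csv
import io


def read_manifest(text):
    """Parse tab-delimited manifest text into a list of row dicts.

    Illustrative helper only; this is not a RAFT API.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# The paired tumor-normal example from above, written with real tabs.
example = (
    "Patient_Name\tRun_Name\tDataset\tFile_Prefix\tSequencing_Method\tNormal\n"
    "Pt01\tad-Pt01\tMyStudy\tPt01_tumor_dna\tWES\tFALSE\n"
    "Pt01\tnd-Pt01\tMyStudy\tPt01_normal_dna\tWES\tTRUE\n"
)

rows = read_manifest(example)
for row in rows:
    print(row["Run_Name"], row["Normal"])
```

Each row becomes a dictionary keyed by the header names, so later checks can refer to columns by name rather than by position.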

RAFT validates manifests before execution with:

$ raft check-manifest --manifest my_manifest.tsv

Making a manifest

When you launch raft run on a new project without --manifest, RAFT can open the manifest generator UI and let you build a manifest interactively.

If you want to prepare a manifest ahead of time, you can use the hosted browser-based manifest generator. That interface can export either:

  • a plain-text TSV file

  • an encoded string that can be pasted back into RAFT when prompted

For command-line work, raft manifest-template prints a starter TSV:

$ raft manifest-template

Write it directly to a file with:

$ raft manifest-template --output my_manifest.tsv

Use --no-examples when you only want the header row:

$ raft manifest-template --no-examples --output my_manifest.tsv

Workflow owners can also provide a manifest-ui.json file that defines the columns, allowed values, and sample-type prefixes for their workflow. Use that file when you want the CLI template to match a workflow-specific manifest UI:

$ raft manifest-template \
    --manifest-ui-template path/to/workflow.manifest-ui.json \
    --output my_manifest.tsv

If you create a manifest outside RAFT, pass it to raft run with --manifest.

Technical details

This section documents the current RAFT manifest format.

Required columns

By default, RAFT requires the following columns in the tab-delimited header row:

Required RAFT columns

Column             Description                              Allowed values
Patient_Name       Patient identifier                       Free text
Dataset            Dataset or cohort identifier             Free text
Run_Name           Sample or run identifier                 Free text with a RAFT sample prefix
Sequencing_Method  Sequencing protocol                      RNA-Seq, RNA, WES, WXS, WGS
Normal             Whether the sample is normal             TRUE or FALSE
File_Prefix        Base name or path prefix for input data  Free text

The header names are underscore-delimited. Use the exact names shown above.
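The default column rules can be expressed as a small pre-check. The sketch below is illustrative and is not RAFT's implementation: the function names are hypothetical, and comparing values case-sensitively is an assumption, since the documentation only states that header names must match exactly.

```python
# Default required columns and allowed values, copied from the table above.
REQUIRED_COLUMNS = [
    "Patient_Name",
    "Dataset",
    "Run_Name",
    "Sequencing_Method",
    "Normal",
    "File_Prefix",
]
ALLOWED_VALUES = {
    "Sequencing_Method": {"RNA-Seq", "RNA", "WES", "WXS", "WGS"},
    "Normal": {"TRUE", "FALSE"},
}


def missing_columns(header):
    """Return the required column names that are absent from the header."""
    return [name for name in REQUIRED_COLUMNS if name not in header]


def invalid_values(row):
    """Return {column: value} for restricted columns holding an unlisted value.

    Assumption: values are compared case-sensitively.
    """
    return {
        column: row.get(column)
        for column, allowed in ALLOWED_VALUES.items()
        if row.get(column) not in allowed
    }
```

For example, missing_columns(["Patient_Name", "Run_Name", "Dataset"]) reports the three columns that still need to be added.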

For workflow-specific manifests, raft check-manifest can use a manifest-ui.json file as the source for required columns, allowed values, and sample-type prefix rules. When the manifest lives inside a RAFT project, check-manifest auto-detects a single workflow *.manifest-ui.json file under that project. You can also provide one explicitly:

$ raft check-manifest \
    --manifest my_manifest.tsv \
    --manifest-ui-template path/to/workflow.manifest-ui.json

The manifest UI rules cover table structure and per-row values. raft check-manifest still performs CLI-only cross-row checks, including duplicate sample types per patient/group and HLA consistency.

Optional columns

RAFT also recognizes several optional columns used by some workflows and UI paths:

  • Group: groups related samples or timepoints for the same patient

  • Alleles: stores HLA alleles for workflows and viewers that use class-I HLA information

Group may contain one group name or multiple group names separated with hyphens. When Group is absent, RAFT treats all rows as belonging to one default group for validation.
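As a rough illustration of the Group convention, the sketch below splits a Group cell into its group names. The function name and the DEFAULT_GROUP placeholder are hypothetical, and treating a blank cell the same way as a missing column is an assumption rather than documented RAFT behavior.

```python
DEFAULT_GROUP = "__default__"  # hypothetical placeholder, not a RAFT constant


def groups_for_row(row):
    """Return the list of group names a manifest row belongs to.

    A missing Group column falls back to one default group. Treating a
    blank cell the same way is an assumption made for this sketch.
    """
    value = (row.get("Group") or "").strip()
    if not value:
        return [DEFAULT_GROUP]
    # Multiple group names are separated with hyphens.
    return [name for name in value.split("-") if name]
```

One consequence of the hyphen separator is that an individual group name cannot itself contain a hyphen.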

Run-name prefixes

RAFT uses the prefix of Run_Name to infer sample type:

  • ad-: abnormal DNA

  • ar-: abnormal RNA

  • nd-: normal DNA

  • nr-: normal RNA

RAFT checks that these prefixes agree with the Normal column.

The current prefix rules are:

  • ad- and ar- rows must have Normal set to FALSE

  • nd- and nr- rows must have Normal set to TRUE

  • any other prefix is rejected by raft check-manifest
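The prefix rules above can be sketched as a lookup from prefix to the expected Normal value. This is an illustration only, not RAFT's code; the function name check_run_name is hypothetical, and the sketch covers the default prefixes rather than any workflow-specific sample types defined in a manifest-ui.json file.

```python
# Default prefixes and the Normal value each one requires.
PREFIX_TO_NORMAL = {
    "ad-": "FALSE",  # abnormal DNA
    "ar-": "FALSE",  # abnormal RNA
    "nd-": "TRUE",   # normal DNA
    "nr-": "TRUE",   # normal RNA
}


def check_run_name(run_name, normal):
    """Return None when the row is consistent, otherwise an error message."""
    prefix = run_name[:3]
    expected = PREFIX_TO_NORMAL.get(prefix)
    if expected is None:
        return f"unrecognized Run_Name prefix in {run_name!r}"
    if normal != expected:
        return f"{run_name!r} requires Normal={expected}, found {normal!r}"
    return None
```

For instance, check_run_name("nd-Pt01", "FALSE") returns a message because nd- rows must have Normal set to TRUE.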

HLA alleles

When present, the Alleles column must contain six comma-separated class-I HLA alleles:

HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01

RAFT normalizes allele values by trimming whitespace, uppercasing, and removing * characters. Each allele must match HLA-A, HLA-B, or HLA-C with two field resolution, such as HLA-A01:01. Each row must include exactly two HLA-A, two HLA-B, and two HLA-C alleles.
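The normalization and per-row rules can be sketched as follows. This is not RAFT's implementation: the function names are hypothetical, and the regular expression is one reading of "two field resolution" (digits, a colon, digits) that RAFT may apply more strictly.

```python
import re
from collections import Counter

# Assumption: two-field resolution means <digits>:<digits> after the gene letter.
ALLELE_PATTERN = re.compile(r"HLA-([ABC])\d+:\d+")


def normalize_allele(value):
    """Trim whitespace, uppercase, and remove '*' characters."""
    return value.strip().upper().replace("*", "")


def parse_alleles(cell):
    """Return the six normalized alleles from an Alleles cell or raise ValueError."""
    alleles = [normalize_allele(part) for part in cell.split(",")]
    if len(alleles) != 6:
        raise ValueError(f"expected 6 alleles, found {len(alleles)}")
    genes = Counter()
    for allele in alleles:
        match = ALLELE_PATTERN.fullmatch(allele)
        if match is None:
            raise ValueError(f"malformed allele: {allele!r}")
        genes[match.group(1)] += 1
    # Exactly two alleles are required for each of HLA-A, HLA-B, and HLA-C.
    for gene in "ABC":
        if genes[gene] != 2:
            raise ValueError(f"expected 2 HLA-{gene} alleles, found {genes[gene]}")
    return alleles
```

Because * is removed during normalization, a value written as HLA-A*01:01 is treated the same as HLA-A01:01.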

Per-patient consistency rules:

  • If alleles are provided for ordinary multi-sample rows, all rows for that patient must provide the same allele set.

  • If alleles are provided only on ar- rows, that is allowed.

  • If a patient has multiple ar- rows and relies on the exception for alleles on ar- rows only, every ar- row must include alleles.

  • Multiple ar- rows with alleles must use the same allele set.
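One way to read the per-patient rules is sketched below for the rows of a single patient. This is an interpretation, not RAFT's code: the function name is hypothetical, and the sketch assumes that "ordinary" rows means rows whose Run_Name does not start with ar-.

```python
def allele_consistency_errors(rows):
    """Check the Alleles cells of ONE patient's rows; return a list of messages."""

    def allele_key(row):
        cell = (row.get("Alleles") or "").strip()
        if not cell:
            return None
        # Sort so that allele order within the cell does not matter.
        return tuple(
            sorted(a.strip().upper().replace("*", "") for a in cell.split(","))
        )

    ar_keys = [allele_key(r) for r in rows if r["Run_Name"].startswith("ar-")]
    other_keys = [allele_key(r) for r in rows if not r["Run_Name"].startswith("ar-")]
    errors = []
    if any(key is not None for key in other_keys):
        # Alleles appear on non-ar rows: every row must carry the same set.
        all_keys = ar_keys + other_keys
        if any(key is None for key in all_keys):
            errors.append("some rows are missing alleles")
        if len({key for key in all_keys if key is not None}) > 1:
            errors.append("rows disagree on the allele set")
    elif any(key is not None for key in ar_keys):
        # Exception: alleles provided only on ar- rows.
        if any(key is None for key in ar_keys):
            errors.append("every ar- row must include alleles")
        if len({key for key in ar_keys if key is not None}) > 1:
            errors.append("ar- rows disagree on the allele set")
    return errors
```

A patient whose rows carry no alleles at all produces no errors here, since the Alleles column is optional.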

Example manifest

Patient_Name  Run_Name      Dataset  File_Prefix    Sequencing_Method  Normal  Group  Alleles
Pt01          ad-Pt01-pre   AML      Pt01_pre_dna   WES                FALSE   pre    HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
Pt01          nd-Pt01-pre   AML      Pt01_norm_dna  WES                TRUE    pre    HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
Pt01          ar-Pt01-pre   AML      Pt01_pre_rna   RNA-Seq            FALSE   pre    HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01
Pt01          ar-Pt01-post  AML      Pt01_post_rna  RNA-Seq            FALSE   post   HLA-A01:01,HLA-A02:01,HLA-B08:01,HLA-B27:05,HLA-C01:02,HLA-C07:01

Validation behavior

raft check-manifest currently checks for:

  • missing required columns

  • tab-delimited formatting

  • allowed values from manifest-ui.json for columns such as Sequencing_Method and Normal

  • valid run-name prefixes from manifest-ui.json sample types

  • consistency between sample-type prefixes and Normal when sample types define a normal value

  • duplicate sample types within a patient, or within a patient and group

  • optional HLA allele formatting and consistency checks when Alleles is present

Rows with a blank File_Prefix are excluded from patient and group sample-count checks and are reported separately.
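The duplicate sample-type check and the blank File_Prefix exclusion can be sketched together. This is an illustrative approximation, not RAFT's implementation: the function name is hypothetical, sample type is taken to be the first three characters of Run_Name, and a row listing several hyphen-separated groups is counted once in each group, which is an assumption.

```python
from collections import Counter


def duplicate_sample_types(rows):
    """Return (duplicates, skipped) for a list of manifest row dicts.

    duplicates: sorted (patient, group, prefix) keys that occur more than once.
    skipped: Run_Name values of rows excluded because File_Prefix is blank.
    """
    counts = Counter()
    skipped = []
    for row in rows:
        if not (row.get("File_Prefix") or "").strip():
            skipped.append(row["Run_Name"])
            continue
        prefix = row["Run_Name"][:3]
        # An absent or blank Group yields a single empty-string default group.
        groups = (row.get("Group") or "").strip().split("-")
        for group in groups:
            counts[(row["Patient_Name"], group, prefix)] += 1
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    return duplicates, skipped
```

Applied to the example manifest above, this reports no duplicates: the two ar- rows for Pt01 fall into different groups (pre and post).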

Without a workflow-specific manifest-ui.json, raft manifest-template includes required columns plus the optional Group and Alleles columns because those optional columns are useful for LENS and other immunology workflows. Delete optional columns only if the target workflow does not use them.

Validation accepts an absolute path, a path relative to the current RAFT workspace, or a file/path under the configured RAFT metadata directory. For example, all of these are valid when the files exist:

$ raft check-manifest --manifest /path/to/raft/projects/my-project/inputs/metadata/my_manifest.tsv
$ raft check-manifest --manifest projects/my-project/inputs/metadata/my_manifest.tsv
$ raft check-manifest --manifest my_manifest.tsv

The final form searches the configured metadata directory recursively. If multiple files with the requested name are found, use a more specific path.
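The lookup described above might be approximated as follows. This is a sketch under stated assumptions, not RAFT's resolver: the function name and its workspace and metadata_dir parameters are hypothetical, and the order in which the three forms are tried is assumed.

```python
from pathlib import Path


def resolve_manifest(name, workspace, metadata_dir):
    """Resolve a --manifest argument to one existing file or raise an error.

    Assumed order: absolute path, then path relative to the workspace,
    then a recursive search of the metadata directory.
    """
    candidate = Path(name)
    if candidate.is_absolute():
        if candidate.is_file():
            return candidate
        raise FileNotFoundError(name)
    relative = Path(workspace) / candidate
    if relative.is_file():
        return relative
    # Recursive search: keep files whose trailing path parts match the request.
    matches = sorted(
        path
        for path in Path(metadata_dir).rglob(candidate.name)
        if path.is_file() and path.parts[-len(candidate.parts):] == candidate.parts
    )
    if not matches:
        raise FileNotFoundError(name)
    if len(matches) > 1:
        raise ValueError(f"{name} is ambiguous: {len(matches)} files match")
    return matches[0]
```

Raising on more than one match mirrors the advice above: when a bare file name is ambiguous, a more specific path removes the ambiguity.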

Using manifests with workflows

A manifest can be supplied directly to raft run:

$ raft run \
    --project-id my-project \
    --workflow lens \
    --version v1.9-dev \
    --manifest my_manifest.tsv

If you omit --manifest when running raft run on a new project, RAFT can launch the manifest generator UI instead.

Next steps