Pipelines Overview

This application provides several specialized probe design pipelines, each with its own configuration interface under /pipelines/*.
All pipelines share a consistent workflow: prepare inputs → configure parameters → submit job → track results.

Available Pipelines

SCRINSHOT — src/pages/scrinshot.tsx
Designs padlock probes with gene-specific 5’ and 3’ arms that circularize upon hybridization to detect and quantify RNA transcripts at single-cell resolution. These probes enable highly multiplexed and spatially resolved gene expression analysis in tissue samples.
MERFISH — src/pages/merfish.tsx
Designs encoding probes with unique barcodes that enable simultaneous imaging and identification of hundreds of different transcripts within a single sample. This highly multiplexed approach provides detailed, spatially resolved gene expression information at the single-cell level.
SeqFISH+ — src/pages/seqfish.tsx
Designs probes for sequential fluorescence in situ hybridization, enabling multiple rounds of hybridization and imaging to visualize and quantify hundreds of RNA targets in a single sample. This technique preserves spatial context while providing high-throughput and single-cell resolution.
Oligo-Seq — src/pages/oligoseq.tsx
Designs oligo hybridization probes optimized for probe-based targeted sequencing to measure RNA expression. These probes are specifically tailored for next-generation sequencing detection methods.

Common Features

Each pipeline page provides:

FASTA input
- Generate directly from genomic databases (NCBI / Ensembl)
- Or upload existing FASTA files from your computer
Multiple data sources per pipeline (e.g., target probes, reference databases, primers)
Advanced parameter controls for probe length, GC content, melting temperature, secondary structure, homopolymers, and more
Developer settings for fine-grained control over BLASTN/Bowtie parameters and thermodynamic calculations
Job submission
- Generates a unique Run ID via the helper API
- Sends all inputs and settings to the backend for processing
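
To make the sequence-level parameters above more concrete, here is a minimal sketch of how probe length, GC content, and homopolymer runs are commonly checked for a candidate probe. The names `ProbeFilter`, `gcContent`, `longestHomopolymer`, and `passesFilter` are hypothetical and the thresholds are illustrative. This overview does not describe how the pipelines compute these values internally, so the real implementation may differ.

```typescript
// Illustrative thresholds; field names are hypothetical, not the app's settings schema.
interface ProbeFilter {
  minLength: number;
  maxLength: number;
  minGcPercent: number;
  maxGcPercent: number;
  maxHomopolymer: number;
}

// Percentage of G and C bases in the sequence (0 for an empty sequence).
function gcContent(seq: string): number {
  if (seq.length === 0) return 0;
  const upper = seq.toUpperCase();
  let gc = 0;
  for (let i = 0; i < upper.length; i++) {
    const base = upper.charAt(i);
    if (base === "G" || base === "C") gc++;
  }
  return (gc / upper.length) * 100;
}

// Length of the longest run of a single repeated base.
function longestHomopolymer(seq: string): number {
  const upper = seq.toUpperCase();
  let longest = 0;
  let run = 0;
  for (let i = 0; i < upper.length; i++) {
    run = i > 0 && upper.charAt(i) === upper.charAt(i - 1) ? run + 1 : 1;
    if (run > longest) longest = run;
  }
  return longest;
}

// True when the candidate satisfies every threshold in the filter.
function passesFilter(seq: string, f: ProbeFilter): boolean {
  const gc = gcContent(seq);
  return (
    seq.length >= f.minLength &&
    seq.length <= f.maxLength &&
    gc >= f.minGcPercent &&
    gc <= f.maxGcPercent &&
    longestHomopolymer(seq) <= f.maxHomopolymer
  );
}
```

Melting temperature and secondary structure are left out of this sketch because they depend on a thermodynamic model that this page does not specify.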

FASTA File Input Requirements

All pipelines require FASTA files as input. Custom FASTA files that you upload must adhere to the following structure:

Header Format

Each sequence must have a header starting with the > character. The header should contain:

region_id: A unique identifier for the genomic region (e.g., gene name or ID). This is mandatory.
additional_information: Optional metadata fields such as transcript ID or exon number, separated by commas.
coordinates: Optional genomic location in the format chrom:start-end(strand).

The three parts are joined with double colons (::), giving region_id::additional_information::coordinates.

Sequence Content

The sequence follows the header in standard FASTA format (single-letter nucleotide codes: A, T, G, C, N).

Examples

With all optional fields:

>ASR1::transcript_id=XM456,exon_number=5::16:54552-54786(+)
AGTTGACAGACCCCAGATTAAAGTGTGTCGCGCAACAC

With only the mandatory region_id:

>ASR1
AGTTGACAGACCCCAGATTAAAGTGTGTCGCGCAACAC

Note: When using the Genomic Region Generator to create FASTA files from NCBI or Ensembl, the files are automatically formatted correctly. Only manually uploaded FASTA files need to follow this format.
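
For manually prepared files, it can help to check headers before uploading. The following is a minimal sketch of a parser for the header format described above. The `FastaHeader` type and `parseFastaHeader` function are hypothetical names, not part of the application, and the sketch assumes that additional information fields are written as key=value pairs, as in the example header.

```typescript
// Parsed form of a header: region_id::additional_information::coordinates
interface FastaHeader {
  regionId: string;
  additionalInformation: Record<string, string>;
  coordinates?: { chrom: string; start: number; end: number; strand: "+" | "-" };
}

function parseFastaHeader(line: string): FastaHeader {
  if (line.charAt(0) !== ">") {
    throw new Error("FASTA header must start with '>'");
  }
  const parts = line.slice(1).trim().split("::");
  const regionId = parts[0] ?? "";
  if (regionId === "" || parts.length > 3) {
    throw new Error("Malformed FASTA header: " + line);
  }

  // Assumption: comma-separated key=value fields; a bare token gets an empty value.
  const additionalInformation: Record<string, string> = {};
  const info = parts[1] ?? "";
  for (const field of info.split(",")) {
    if (field === "") continue;
    const eq = field.indexOf("=");
    if (eq === -1) {
      additionalInformation[field] = "";
    } else {
      additionalInformation[field.slice(0, eq)] = field.slice(eq + 1);
    }
  }

  const header: FastaHeader = { regionId, additionalInformation };
  const coords = parts[2] ?? "";
  if (coords !== "") {
    // Expected shape: chrom:start-end(strand), e.g. 16:54552-54786(+)
    const match = /^([^:]+):(\d+)-(\d+)\(([+-])\)$/.exec(coords);
    if (match === null) {
      throw new Error("Malformed coordinates: " + coords);
    }
    const [, chrom = "", start = "", end = "", strand = "+"] = match;
    header.coordinates = {
      chrom,
      start: Number(start),
      end: Number(end),
      strand: strand === "-" ? "-" : "+",
    };
  }
  return header;
}
```

Calling `parseFastaHeader(">ASR1")` yields only a `regionId`, with empty additional information and no coordinates, which matches the mandatory-field-only case. A header that supplies coordinates but omits the additional information would need an empty middle part (`ASR1::::16:54552-54786(+)`) for this sketch to read it correctly. Whether the backend accepts that form is not stated here.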

Submission Workflow

Select and prepare inputs
Choose between generating FASTA files from NCBI/Ensembl or uploading them manually.
Some pipelines require multiple FASTA groups (e.g., MERFISH and SeqFISH+ require target, reference, and readout probe databases).
Configure parameters
Adjust basic, advanced, and developer-level settings to fit your experimental requirements.
Submit
On submission, the system:
- Creates a new Run ID
- Bundles all configuration values and file paths
- Sends them via a POST /api/<pipeline> request to the backend
Track progress
- Runs appear in the Runs page, linked by your session or account
- You can view logs in real time and download results from the Run Detail page
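
The submit step above can be sketched as a small request builder. This is an illustration under stated assumptions, not the application's actual client code: the payload fields (`runId`, `settings`, `filePaths`), the `buildSubmitRequest` name, and the pipeline slugs (guessed from the page filenames) are all hypothetical. Only the POST /api/<pipeline> route comes from this page.

```typescript
// Assumption: route slugs mirror the page filenames under src/pages/.
type PipelineName = "scrinshot" | "merfish" | "seqfish" | "oligoseq";

// Hypothetical payload shape; the real backend schema is not specified in this overview.
interface PipelineSubmission {
  runId: string;                     // Run ID obtained beforehand from the helper API
  settings: Record<string, unknown>; // basic, advanced, and developer-level values
  filePaths: string[];               // paths of generated or uploaded FASTA files
}

interface SubmitRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Bundles the configuration into a request description for POST /api/<pipeline>.
function buildSubmitRequest(
  pipeline: PipelineName,
  submission: PipelineSubmission
): SubmitRequest {
  if (submission.runId === "") {
    throw new Error("A Run ID is required before submitting");
  }
  return {
    url: "/api/" + pipeline,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(submission),
    },
  };
}
```

In a browser, the result could be passed to the standard Fetch API as `fetch(request.url, request.init)`. Keeping the builder free of network calls makes it easy to inspect the bundled values before they are sent.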

For detailed parameter explanations and backend processing steps, see the dedicated pages for: