
Quick Reference Guide

Fast reference for common commands and configurations.

One-Line Commands

Basic Run

# Simplest possible run (auto-detects mode)
nextflow run seqeralabs/nf-proteindesign -profile docker --input samplesheet.csv --outdir results

With Analysis

# Include affinity prediction and scoring
nextflow run seqeralabs/nf-proteindesign -profile docker --input samplesheet.csv --outdir results --run_prodigy --run_ipsae

Resume Failed Run

# Resume from where it stopped
nextflow run seqeralabs/nf-proteindesign -profile docker --input samplesheet.csv --outdir results -resume

Samplesheet Template

sample_id,design_yaml,structure_files,protocol,num_designs,budget
design1,designs/my_design.yaml,data/target.pdb,protein-anything,100,10
design2,designs/another_design.yaml,data/target.cif,peptide-anything,100,10

Required columns:
- sample_id: Unique identifier for the design
- design_yaml: Path to Boltzgen design YAML specification

Optional columns:
- structure_files: Additional structure files (comma-separated if multiple)
- protocol: Boltzgen protocol (protein-anything, peptide-anything, nanobody-anything, protein-small_molecule)
- num_designs: Number of intermediate designs (default: 100)
- budget: Number of final diversity-optimized designs (default: 10)
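A quick pre-flight check can catch a malformed samplesheet before any GPU time is spent. The function below is a minimal sketch, not part of the pipeline: `check_samplesheet` is a hypothetical helper name, and it assumes a plain CSV without quoted fields that contain commas.

```shell
# Hypothetical helper (not shipped with nf-proteindesign).
# Verifies that the required columns exist, that sample_id values are unique,
# and that every row has a design_yaml value. Problems are printed to stderr.
check_samplesheet() {
    awk -F',' '
        { sub(/\r$/, "") }                      # tolerate Windows line endings
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            s = col["sample_id"]; d = col["design_yaml"]
            if (!s) { print "Missing required column: sample_id";   bad = 1 }
            if (!d) { print "Missing required column: design_yaml"; bad = 1 }
            if (bad) exit
            next
        }
        NF == 0 { next }                        # skip blank lines
        seen[$s]++ { print "Duplicate sample_id: " $s; bad = 1 }
        $d == ""   { print "Line " NR ": empty design_yaml"; bad = 1 }
        END { exit bad + 0 }
    ' "$1" >&2
}

# Usage:
#   check_samplesheet samplesheet.csv && echo "samplesheet looks OK"
```

The function returns a non-zero exit status on any problem, so it can gate the `nextflow run` command in a wrapper script.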

Common Parameters

Essential Parameters

| Parameter | Description | Default | Example |
|---|---|---|---|
| --input | Samplesheet path | Required | samplesheet.csv |
| --outdir | Output directory | ./results | results/ |
| --protocol | Boltzgen protocol | protein-anything | peptide-anything |

Design Parameters

| Parameter | Description | Default | Example |
|---|---|---|---|
| --num_designs | Intermediate designs | 100 | 50 |
| --budget | Final optimized designs | 10 | 20 |
| --cache_dir | Model cache directory | null | /cache |

Analysis Parameters

| Parameter | Description | Default | Example |
|---|---|---|---|
| --run_proteinmpnn | Enable ProteinMPNN | false | true |
| --run_ipsae | Enable IPSAE scoring | false | true |
| --run_prodigy | Enable PRODIGY | false | true |
| --run_consolidation | Consolidated report | false | true |

Resource Parameters

| Parameter | Description | Default | Example |
|---|---|---|---|
| --max_cpus | Maximum CPUs | 16 | 32 |
| --max_memory | Maximum memory | 128.GB | 256.GB |
| --max_time | Maximum time | 240.h | 72.h |
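Instead of repeating these flags on every command line, the parameters can be collected in a file and passed with Nextflow's `-params-file` option. The sketch below assumes the pipeline reads them through the standard Nextflow `params` mechanism. The file name `params.yaml` is arbitrary, and the values are taken from the Example columns above.

```shell
# Write a parameters file; each key is a pipeline parameter without the leading "--".
cat > params.yaml <<'EOF'
input: samplesheet.csv
outdir: results
protocol: protein-anything
num_designs: 50
budget: 20
run_ipsae: true
run_prodigy: true
EOF

# Launch with the file in place of the individual --flags:
#   nextflow run seqeralabs/nf-proteindesign -profile docker -params-file params.yaml
```

Keeping the file under version control alongside the samplesheet makes a run easier to reproduce later.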

Command Recipes

Quick Test

nextflow run seqeralabs/nf-proteindesign \
    -profile test_design_protein,docker \
    --outdir test_results

Standard Run

nextflow run seqeralabs/nf-proteindesign \
    -profile docker \
    --input samplesheet.csv \
    --outdir results

With Analysis Tools

nextflow run seqeralabs/nf-proteindesign \
    -profile docker \
    --input samplesheet.csv \
    --outdir results \
    --run_proteinmpnn \
    --run_ipsae \
    --run_prodigy \
    --run_consolidation

Peptide Design

nextflow run seqeralabs/nf-proteindesign \
    -profile docker \
    --input peptide_samplesheet.csv \
    --protocol peptide-anything \
    --outdir peptide_designs

Nanobody Design

nextflow run seqeralabs/nf-proteindesign \
    -profile docker \
    --input nanobody_samplesheet.csv \
    --protocol nanobody-anything \
    --outdir nanobody_designs

Output Structure

results/
├── {sample}/
│   ├── boltzgen/
│   │   ├── final_ranked_designs/    ← Your final designs
│   │   │   ├── design_1.cif
│   │   │   ├── design_2.cif
│   │   │   └── ...
│   │   ├── intermediate_designs/
│   │   └── boltzgen.log
│   ├── prodigy/
│   │   ├── design_1_prodigy_summary.csv
│   │   └── ...
│   └── ipsae/
│       └── design_1_ipsae_scores.csv
└── pipeline_info/
    ├── execution_report.html        ← Check this first
    ├── execution_timeline.html
    └── execution_trace.txt
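
Given the layout above, a short loop can report how many final designs each sample produced. This is a sketch under the assumption that the directory names match the tree exactly; `count_final_designs` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<sample><TAB><number of .cif files>" for every
# sample directory that contains boltzgen/final_ranked_designs.
count_final_designs() {
    local outdir="${1:-results}" dir sample n
    for dir in "$outdir"/*/boltzgen/final_ranked_designs; do
        [ -d "$dir" ] || continue               # glob matched nothing, or not a directory
        sample="${dir#"$outdir"/}"              # strip the leading output directory
        sample="${sample%%/*}"                  # keep only the sample name
        n=$(find "$dir" -maxdepth 1 -type f -name '*.cif' | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$sample" "$n"
    done
}

# Usage:
#   count_final_designs results
```

A sample that reports fewer designs than its budget is a good candidate for a closer look at its boltzgen.log.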

Troubleshooting Quick Fixes

GPU Not Found

# Test GPU access
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

# If this fails, install nvidia-container-toolkit
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

Out of Memory

# Reduce parallel samples
nextflow run ... --n_samples 10  # Lower value

# Raise the memory limit above the 128.GB default
nextflow run ... --max_memory 256.GB

Pipeline Fails Mid-Run

# Resume from last checkpoint
nextflow run seqeralabs/nf-proteindesign \
    -profile docker \
    --input samplesheet.csv \
    --outdir results \
    -resume  # ← Add this flag

Container Pull Issues

# Pre-pull containers
docker pull ghcr.io/flouwuenne/boltzgen:latest
docker pull ghcr.io/flouwuenne/prodigy:latest

Design YAML Template

design_template.yaml
# Boltzgen design specification
entities:
  # Designed protein entity
  - protein:
      id: C
      sequence: 80..120  # Length range for designed protein

  # Target structure entity
  - file:
      path: target_protein.cif
      include:
        - chain:
            id: A  # Target chain to bind

See the Boltzgen documentation for complete YAML specification details.
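
When several targets share the same design setup, the template can be filled in from the shell. The function below is only a sketch: `write_design_yaml` is a hypothetical helper, and it reproduces the fields shown in the template above and nothing more, so check the result against the Boltzgen specification before use.

```shell
# Hypothetical helper: emit a design YAML with the same fields as the template.
# Arguments: target structure path, target chain ID, minimum and maximum
# length of the designed protein.
write_design_yaml() {
    local target="$1" chain="$2" min_len="$3" max_len="$4"
    cat <<EOF
entities:
  # Designed protein entity
  - protein:
      id: C
      sequence: ${min_len}..${max_len}

  # Target structure entity
  - file:
      path: ${target}
      include:
        - chain:
            id: ${chain}
EOF
}

# Usage:
#   write_design_yaml target_protein.cif A 80 120 > designs/my_design.yaml
```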

Performance Estimates

| Configuration | num_designs | budget | Time (1 GPU) | GPU Memory |
|---|---|---|---|---|
| Quick test | 5-10 | 2-5 | 5-10 min | 8 GB |
| Standard | 50-100 | 10 | 30-60 min | 16 GB |
| Production | 100-200 | 20 | 1-3 hours | 16-24 GB |
| Large campaign | 200+ | 50+ | 4-12 hours | 24 GB+ |

Useful Commands

Check Pipeline Status

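# List previous runs (run name, status, and the command used)
nextflow log

# Show the work directory, status, and exit code of each task in a run
nextflow log <run_name> -f workdir,status,exit

# Delete the last run's work files (that run can then no longer be resumed)
nextflow clean -f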
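
For a per-status summary of a finished run, the trace file can be tallied directly. This sketch assumes the trace is the tab-separated file Nextflow writes by default, with a `status` column in its header; `trace_status_counts` is a hypothetical helper name.

```shell
# Hypothetical helper: count tasks per status in a Nextflow trace file.
# The status column is located by name rather than by position.
trace_status_counts() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") s = i; next }
        s       { count[$s]++ }
        END     { for (k in count) print k "\t" count[k] }
    ' "$1" | LC_ALL=C sort
}

# Usage:
#   trace_status_counts results/pipeline_info/execution_trace.txt
```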

Monitor Resources

# Watch GPU usage
watch -n 1 nvidia-smi

# Check disk usage
du -sh results/ work/

# Monitor memory
free -h

Analyze Results

# Count final designs
find results/ -name "*.cif" -path "*/final_ranked_designs/*" | wc -l

# Find best PRODIGY scores
cat results/*/prodigy/*_summary.csv | \
    grep -v "sample_id" | \
    sort -t',' -k3,3n | \
    head -5

# Count completed tasks (the trace file records each task's status as COMPLETED, CACHED, FAILED, or ABORTED)
grep -c "COMPLETED" results/pipeline_info/execution_trace.txt
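
The PRODIGY one-liner above sorts on a fixed column position, which gives wrong results if the summary layout differs. A header-aware variant looks the column up by name instead. This is a sketch: `rank_by_column` is a hypothetical helper, and the column name is an argument because the actual header of the PRODIGY summary files is not documented here. It also assumes that lower values are better, as is usual for binding free energies.

```shell
# Hypothetical helper: print the N lowest values of a named column across
# several CSV files, as "<value>,<file>". Assumes no quoted fields with commas.
rank_by_column() {
    local column="$1" top="$2"
    shift 2
    awk -F',' -v want="$column" '
        { sub(/\r$/, "") }                      # tolerate Windows line endings
        FNR == 1 {                              # header row of each file
            idx = 0
            for (i = 1; i <= NF; i++) if ($i == want) idx = i
            next
        }
        idx > 0 && $idx != "" { print $idx "," FILENAME }
    ' "$@" | LC_ALL=C sort -t',' -k1,1n | head -n "$top"
}

# Usage (replace <column_name> with a column from the summary header):
#   rank_by_column <column_name> 5 results/*/prodigy/*_summary.csv
```

Files that lack the requested column are skipped silently, so an empty result usually means the column name does not match the header.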

Bookmark This Page

This quick reference covers the most common use cases. Keep it handy!