Pipeline Parameters¶
Auto-Generated Documentation
This page is automatically generated from nextflow_schema.json.
Parameter defaults and descriptions reflect the current pipeline version.
Overview¶
Pipeline: nf-proteindesign pipeline parameters
Nextflow pipeline for Boltzgen protein design using pre-made design YAML specifications
Input/output options¶
Define where the pipeline should find input data and save output data.
--input¶
Required. Path to comma-separated samplesheet file.
- Type:
string - Default:
"null" - Pattern:
^\S+\.csv$
--outdir¶
Required. The output directory where the results will be saved.
- Type:
string - Default:
"./results"
Boltzgen design parameters¶
Core parameters for Boltzgen protein design execution.
--cache_dir¶
Cache directory for model weights (~6GB).
- Type:
string - Default:
"null"
--boltzgen_config¶
Optional path to custom Boltzgen config YAML to override defaults.
- Type:
string - Default:
"null"
--steps¶
Optional comma-separated list of steps to run (e.g., 'filtering' to rerun only filtering).
- Type:
string - Default:
"null"
ProteinMPNN sequence optimization¶
Options for ProteinMPNN sequence optimization of designed structures.
--run_proteinmpnn¶
Enable ProteinMPNN sequence optimization of Boltzgen designs.
- Type:
boolean - Default:
false
--mpnn_sampling_temp¶
Sampling temperature (lower = more conservative).
- Type:
number - Default:
0.1
--mpnn_num_seq_per_target¶
Number of sequence variants to generate per structure.
- Type:
integer - Default:
8
--mpnn_batch_size¶
Batch size for ProteinMPNN inference.
- Type:
integer - Default:
1
--mpnn_seed¶
Random seed for reproducibility.
- Type:
integer - Default:
37
--mpnn_backbone_noise¶
Backbone noise level (lower = more faithful to input).
- Type:
number - Default:
0.02
--mpnn_save_score¶
Save per-residue scores.
- Type:
boolean - Default:
true
--mpnn_save_probs¶
Save per-residue probabilities (large files, use for detailed analysis).
- Type:
boolean - Default:
false
--mpnn_fixed_chains¶
Chains to keep fixed (e.g., 'A,B' - typically the target chains).
- Type:
string - Default:
"null"
--mpnn_designed_chains¶
Chains to design (e.g., 'C' - typically the binder chain).
- Type:
string - Default:
"null"
Boltz-2 structure prediction¶
Options for Boltz-2 multimer structure prediction of ProteinMPNN sequences.
--run_boltz2_refold¶
Enable Boltz-2 structure prediction for ProteinMPNN sequences.
- Type:
boolean - Default:
false
--boltz2_num_diffusion¶
Number of diffusion samples per sequence (higher = more diversity).
- Type:
integer - Default:
200
--boltz2_num_recycling¶
Number of recycling iterations for structure refinement.
- Type:
integer - Default:
3
--boltz2_use_msa¶
Use multiple sequence alignments (MSAs) for prediction.
- Type:
boolean - Default:
false
--boltz2_predict_affinity¶
Predict binding affinity for protein complexes.
- Type:
boolean - Default:
true
Analysis and scoring options¶
Options for scoring and evaluating designed structures.
--run_ipsae¶
Enable IPSAE scoring of Boltzgen predictions.
- Type:
boolean - Default:
false
--ipsae_pae_cutoff¶
PAE cutoff for IPSAE calculation (Angstroms).
- Type:
number - Default:
10
--ipsae_dist_cutoff¶
Distance cutoff for CA-CA contacts (Angstroms).
- Type:
number - Default:
10
--run_prodigy¶
Enable PRODIGY binding affinity prediction on final designs.
- Type:
boolean - Default:
false
--prodigy_selection¶
Chain selection for PRODIGY (e.g., 'A,B'). If null, auto-detects from structure.
- Type:
string - Default:
"null"
--run_foldseek¶
Enable Foldseek structural similarity search for budget designs and Boltz-2 structures.
- Type:
boolean - Default:
false
--foldseek_database¶
Path to Foldseek database directory (required if run_foldseek is true).
- Type:
string - Default:
"null"
--foldseek_database_name¶
Name of the Foldseek database within the directory.
- Type:
string - Default:
"afdb"
--foldseek_evalue¶
E-value threshold for reporting matches (lower = more stringent).
- Type:
number - Default:
0.001
--foldseek_max_seqs¶
Maximum number of target sequences to report per query.
- Type:
integer - Default:
100
--foldseek_sensitivity¶
Search sensitivity (1.0-9.5, higher = more sensitive but slower).
- Type:
number - Default:
9.5
--foldseek_coverage¶
Minimum fraction of aligned residues (0.0-1.0, higher = more global alignment).
- Type:
number - Default:
0.0
--foldseek_alignment_type¶
Alignment type: 0=3Di only, 1=TMalign (global), 2=3Di+AA (local, default).
- Type:
integer - Default:
2 - Allowed values:
0,1,2
--run_consolidation¶
Enable consolidated metrics report generation.
- Type:
boolean - Default:
false
--report_top_n¶
Number of top designs to highlight in consolidated report.
- Type:
integer - Default:
10
Resource allocation¶
Maximum resource limits for pipeline execution.
--max_cpus¶
Maximum number of CPUs per process.
- Type:
integer - Default:
16
--max_memory¶
Maximum memory per process.
- Type:
string - Default:
"128.GB" - Pattern:
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
--max_time¶
Maximum time per process.
- Type:
string - Default:
"240.h" - Pattern:
^\d+(\.\d+)?\.?\s*(m|h|d|s)?$
--max_gpus¶
Maximum number of GPUs per process.
- Type:
integer - Default:
1
Generic options¶
Less common options for the pipeline, typically set in a config file.
--publish_dir_mode¶
Method for publishing outputs.
- Type:
string - Default:
"copy" - Allowed values:
copy,symlink,move
--tracedir¶
Directory to store pipeline execution traces.
- Type:
string - Default:
"${params.outdir}/pipeline_info"
--validate_params¶
Validate parameters against the schema at runtime.
- Type:
boolean - Default:
true
--show_hidden_params¶
Show hidden parameters in help message.
- Type:
boolean - Default:
false
--help¶
Display help text.
- Type:
boolean - Default:
"null"
--version¶
Display version and exit.
- Type:
boolean - Default:
"null"
Quick Reference Table¶
| Parameter | Type | Default | Description |
|---|---|---|---|
--input |
string |
"null" |
**Required |
--outdir |
string |
"./results" |
**Required |
--cache_dir |
string |
"null" |
Cache directory for model weights (~6GB) |
--boltzgen_config |
string |
"null" |
Optional path to custom Boltzgen config YAML to... |
--steps |
string |
"null" |
Optional comma-separated list of steps to run (e |
--run_proteinmpnn |
boolean |
false |
Enable ProteinMPNN sequence optimization of Bol... |
--mpnn_sampling_temp |
number |
0.1 |
Sampling temperature (lower = more conservative) |
--mpnn_num_seq_per_target |
integer |
8 |
Number of sequence variants to generate per str... |
--mpnn_batch_size |
integer |
1 |
Batch size for ProteinMPNN inference |
--mpnn_seed |
integer |
37 |
Random seed for reproducibility |
--mpnn_backbone_noise |
number |
0.02 |
Backbone noise level (lower = more faithful to ... |
--mpnn_save_score |
boolean |
true |
Save per-residue scores |
--mpnn_save_probs |
boolean |
false |
Save per-residue probabilities (large files, us... |
--mpnn_fixed_chains |
string |
"null" |
Chains to keep fixed (e |
--mpnn_designed_chains |
string |
"null" |
Chains to design (e |
--run_boltz2_refold |
boolean |
false |
Enable Boltz-2 structure prediction for Protein... |
--boltz2_num_diffusion |
integer |
200 |
Number of diffusion samples per sequence (highe... |
--boltz2_num_recycling |
integer |
3 |
Number of recycling iterations for structure re... |
--boltz2_use_msa |
boolean |
false |
Use multiple sequence alignments (MSAs) for pre... |
--boltz2_predict_affinity |
boolean |
true |
Predict binding affinity for protein complexes |
--run_ipsae |
boolean |
false |
Enable IPSAE scoring of Boltzgen predictions |
--ipsae_pae_cutoff |
number |
10 |
PAE cutoff for IPSAE calculation (Angstroms) |
--ipsae_dist_cutoff |
number |
10 |
Distance cutoff for CA-CA contacts (Angstroms) |
--run_prodigy |
boolean |
false |
Enable PRODIGY binding affinity prediction on f... |
--prodigy_selection |
string |
"null" |
Chain selection for PRODIGY (e |
--run_foldseek |
boolean |
false |
Enable Foldseek structural similarity search fo... |
--foldseek_database |
string |
"null" |
Path to Foldseek database directory (required i... |
--foldseek_database_name |
string |
"afdb" |
Name of the Foldseek database within the directory |
--foldseek_evalue |
number |
0.001 |
E-value threshold for reporting matches (lower ... |
--foldseek_max_seqs |
integer |
100 |
Maximum number of target sequences to report pe... |
--foldseek_sensitivity |
number |
9.5 |
Search sensitivity (1 |
--foldseek_coverage |
number |
0.0 |
Minimum fraction of aligned residues (0 |
--foldseek_alignment_type |
integer |
2 |
Alignment type: 0=3Di only, 1=TMalign (global),... |
--run_consolidation |
boolean |
false |
Enable consolidated metrics report generation |
--report_top_n |
integer |
10 |
Number of top designs to highlight in consolida... |
--max_cpus |
integer |
16 |
Maximum number of CPUs per process |
--max_memory |
string |
"128.GB" |
Maximum memory per process |
--max_time |
string |
"240.h" |
Maximum time per process |
--max_gpus |
integer |
1 |
Maximum number of GPUs per process |
--publish_dir_mode |
string |
"copy" |
Method for publishing outputs |
--tracedir |
string |
"${params.outdir}/pipeline_info" |
Directory to store pipeline execution traces |
--validate_params |
boolean |
true |
Validate parameters against the schema at runtime |
--show_hidden_params |
boolean |
false |
Show hidden parameters in help message |
--help |
boolean |
"null" |
Display help text |
--version |
boolean |
"null" |
Display version and exit |