ZipStrain Command Line Interface
This page is organized by workflow area: Profile, Comparison, and Utilities.
General usage:
zipstrain --help
For command-specific help:
zipstrain <command-or-group> --help
zipstrain <group> <command> --help
Profile
Profile Commands At A Glance
| Command | Purpose |
|---|---|
| `zipstrain profile` | Batch profiling for multiple BAM files |
| `zipstrain utilities prepare_profiling` | Build profiling assets (BED, gene ranges, genome lengths) |
| `zipstrain utilities profile-single` | Profile one BAM file |
zipstrain profile
Run BAM profiling in batch mode.
zipstrain profile \
--input-table samples.csv \
--stb-file mapping.stb \
--gene-range-table gene_range_table.tsv \
--bed-file genomes_bed_file.bed \
--genome-length-file genome_lengths.parquet \
--run-dir profile_run
Options:
- `-i, --input-table` (required)
- `-s, --stb-file` (required)
- `-g, --gene-range-table` (required)
- `-b, --bed-file` (required)
- `-l, --genome-length-file` (required)
- `-r, --run-dir` (required)
- `-n, --num-procs` (default: 8)
- `-m, --max-concurrent-batches` (default: 5)
- `-p, --poll-interval` (default: 1)
- `-e, --execution-mode` (default: local)
- `-c, --slurm-config`
- `-o, --container-engine` (default: local)
- `-t, --task-per-batch` (default: 10)
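Since every path option above is required, it can help to confirm the inputs exist before starting a batch run. The following is a minimal sketch, not part of ZipStrain: `check_profile_inputs` is a hypothetical helper, and the file names in the usage comment are simply those from the example command.

```shell
# Hypothetical pre-flight helper, not part of ZipStrain.
# Returns 0 if every path given as an argument is an existing file;
# otherwise reports each missing path on stderr and returns 1.
check_profile_inputs() {
    missing=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            printf 'missing input: %s\n' "$path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use (file names taken from the command above):
# check_profile_inputs samples.csv mapping.stb gene_range_table.tsv \
#     genomes_bed_file.bed genome_lengths.parquet \
#   && zipstrain profile -i samples.csv -s mapping.stb \
#        -g gene_range_table.tsv -b genomes_bed_file.bed \
#        -l genome_lengths.parquet -r profile_run
```

The helper only checks that the files are present; it does not validate their contents or format.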
zipstrain utilities prepare_profiling
Prepare profiling database assets.
zipstrain utilities prepare_profiling \
--reference-fasta reference.fasta \
--gene-fasta genes.fasta \
--stb-file mapping.stb \
--output-dir profiling_assets
Options:
- `-r, --reference-fasta` (required)
- `-g, --gene-fasta` (required)
- `-s, --stb-file` (required)
- `-o, --output-dir` (required)
zipstrain utilities profile-single
Profile a single BAM.
zipstrain utilities profile-single \
--bed-file genomes_bed_file.bed \
--bam-file sample.bam \
--stb-file mapping.stb \
--null-model null_model.parquet \
--gene-range-table gene_range_table.tsv \
--output-dir sample_profile
Options:
- `-b, --bed-file` (required)
- `-a, --bam-file` (required)
- `-s, --stb-file` (required)
- `-m, --null-model` (required)
- `-g, --gene-range-table` (required)
- `-n, --num-workers` (default: 1)
- `-o, --output-dir` (required)
Outputs include:
- `<sample>_profile.parquet`
- `<sample>_genome_stats.parquet`
- `<sample>_gene_stats.parquet`
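The output naming pattern above makes it possible to check whether a single-sample run finished. The sketch below uses two hypothetical helpers that are not part of ZipStrain. It assumes the caller already knows the `<sample>` name, since how that name is derived from the BAM file is not stated here.

```shell
# Hypothetical helper: print the three output paths documented above
# for a given output directory ($1) and sample name ($2).
expected_profile_outputs() {
    for suffix in profile genome_stats gene_stats; do
        printf '%s/%s_%s.parquet\n' "$1" "$2" "$suffix"
    done
}

# Hypothetical helper: return 0 only if all three outputs exist
# in the output directory ($1) for the sample name ($2).
profile_outputs_complete() {
    for suffix in profile genome_stats gene_stats; do
        [ -f "$1/${2}_${suffix}.parquet" ] || return 1
    done
}
```

For example, `profile_outputs_complete sample_profile sample || echo "rerun needed"` reports when any of the three files is missing from the `sample_profile` directory.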
Comparison
Comparison Commands At A Glance
| Command | Purpose |
|---|---|
| `zipstrain compare genomes` | Batch genome-level comparisons |
| `zipstrain compare genes` | Batch gene-level comparisons |
| `zipstrain compare build-comp-database` | Build comparison DB object from profile DB + config |
| `zipstrain utilities single_compare_genome` | Compare one pair at genome level |
| `zipstrain utilities single_compare_gene` | Compare one pair at gene level |
| `zipstrain utilities build-genome-comparison-config` | Build genome comparison config |
| `zipstrain utilities build-gene-comparison-config` | Build gene comparison config |
| `zipstrain utilities to-complete-table` | Emit not-yet-completed pair table |
zipstrain compare genomes
zipstrain compare genomes \
--genome-comparison-object genome_comp.json \
--run-dir compare_run \
--engine duckdb \
--calculate all
Options:
- `-g, --genome-comparison-object` (required)
- `-r, --run-dir` (required)
- `-m, --max-concurrent-batches` (default: 5)
- `-p, --poll-interval` (default: 1)
- `-e, --execution-mode` (default: local)
- `-s, --slurm-config`
- `-c, --container-engine` (default: local)
- `-t, --task-per-batch` (default: 10)
- `--engine` (`polars` or `duckdb`; default: `polars`)
- `--calculate` (`ani`, `ibs`, `identical_genes`, or `all`; default: `all`)
- `-d, --duckdb-memory-limit`
- `--duckdb-threads`
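A wrapper script that takes the engine and metric from user input may want to reject unsupported values before launching a long batch. The validators below are a hypothetical sketch, not part of ZipStrain. They accept only the values listed above and treat `--calculate` as a single value, which is an assumption.

```shell
# Hypothetical validators, not part of ZipStrain: accept only the
# values listed above for --engine and --calculate.
valid_engine() {
    case $1 in
        polars|duckdb) return 0 ;;
        *) return 1 ;;
    esac
}

valid_calculate() {
    case $1 in
        ani|ibs|identical_genes|all) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
# valid_engine "$engine" && valid_calculate "$metric" \
#   && zipstrain compare genomes -g genome_comp.json -r compare_run \
#        --engine "$engine" --calculate "$metric"
```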
zipstrain compare genes
zipstrain compare genes \
--gene-comparison-object gene_comp.json \
--run-dir gene_compare_run
Options:
- `-g, --gene-comparison-object` (required)
- `-r, --run-dir` (required)
- `-m, --max-concurrent-batches` (default: 5)
- `-p, --poll-interval` (default: 1)
- `-e, --execution-mode` (default: local)
- `-s, --slurm-config`
- `-c, --container-engine` (default: local)
- `-t, --task-per-batch` (default: 10)
- `-n, --ani-method` (default: popani)
- `--engine` (`polars` or `duckdb`; default: `polars`)
- `-d, --duckdb-memory-limit`
- `--duckdb-threads`
zipstrain compare build-comp-database
zipstrain compare build-comp-database \
--profile-db-dir profiles.parquet \
--config-file comparison_config.json \
--output-dir comparison_db
Options:
- `-p, --profile-db-dir` (required)
- `-c, --config-file` (required)
- `-o, --output-dir` (required)
- `-f, --comp-db-file`
zipstrain utilities single_compare_genome
zipstrain utilities single_compare_genome \
--mpileup-contig-1 sample_a.parquet \
--mpileup-contig-2 sample_b.parquet \
--stb-file mapping.stb \
--output-file out.parquet
Options:
- `-m1, --mpileup-contig-1` (required)
- `-m2, --mpileup-contig-2` (required)
- `-s, --stb-file` (required)
- `-c, --min-cov` (default: 5)
- `-l, --min-gene-compare-len` (default: 100)
- `-o, --output-file` (required)
- `-g, --genome` (default: all)
- `-a, --ani-method` (default: popani)
- `--calculate` (default: all)
- `--engine` (`polars` or `duckdb`; default: `polars`)
- `--duckdb-memory-limit`
- `--duckdb-temp-directory`
- `--duckdb-threads`
zipstrain utilities single_compare_gene
zipstrain utilities single_compare_gene \
--mpileup-contig-1 sample_a.parquet \
--mpileup-contig-2 sample_b.parquet \
--stb-file mapping.stb \
--scope all:all \
--output-file out.parquet
Options:
- `-m1, --mpileup-contig-1` (required)
- `-m2, --mpileup-contig-2` (required)
- `-s, --stb-file` (required)
- `-c, --min-cov` (default: 5)
- `-l, --min-gene-compare-len` (default: 100)
- `-o, --output-file` (required)
- `-g, --scope` (default: all:all)
- `-a, --ani-method` (default: popani)
- `--engine` (`polars` or `duckdb`; default: `polars`)
- `--duckdb-memory-limit`
- `--duckdb-temp-directory`
- `--duckdb-threads`
Comparison Config Helpers
build-genome-comparison-config and build-gene-comparison-config share the same option pattern:
- `-p, --profile-db` (required)
- `-g, --gene-db-id` (required)
- `-r, --reference-genome-id` (required)
- `-s, --scope` (default: `all` for genome, `all:all` for gene)
- `-c, --min-cov` (default: 5)
- `-l, --min-gene-compare-len` (default: 200)
- `-t, --stb-file-loc` (required)
- `-a, --current-comp-table`
- `-o, --output-file` (required)
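As a rough illustration of how the required options fit together, the sketch below prints a `build-genome-comparison-config` command line without running it. The function name is hypothetical, the command is assembled only from the option names listed above, and every value is a caller-supplied placeholder. The expected format of each value is not documented here, so check `--help` before relying on it.

```shell
# Print (without running) a build-genome-comparison-config command
# assembled from the required options listed above.
# Arguments: profile DB, gene DB id, reference genome id,
#            STB file location, output file. All are placeholders.
print_genome_config_cmd() {
    printf 'zipstrain utilities build-genome-comparison-config'
    printf ' --profile-db %s' "$1"
    printf ' --gene-db-id %s' "$2"
    printf ' --reference-genome-id %s' "$3"
    printf ' --stb-file-loc %s' "$4"
    printf ' --output-file %s\n' "$5"
}
```

The printed line is not shell-quoted, so values containing spaces would need quoting before the command is pasted into a terminal.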
zipstrain utilities to-complete-table
zipstrain utilities to-complete-table \
--genome-comparison-object genome_comp.json \
--output-file remaining_pairs.csv
Utilities
Utility Commands At A Glance
| Command | Purpose |
|---|---|
| `zipstrain utilities build-null-model` | Build sequencing-error null model |
| `zipstrain utilities merge_parquet` | Merge parquet files |
| `zipstrain utilities process_mpileup` | Convert mpileup stream to parquet |
| `zipstrain utilities make_bed` | Build bed chunks from fasta |
| `zipstrain utilities get_genome_lengths` | Genome lengths from STB + BED |
| `zipstrain utilities genome_breadth_matrix` | Per-genome breadth output |
| `zipstrain utilities collect_breadth_tables` | Merge breadth tables |
| `zipstrain utilities strain_heterogeneity` | Strain heterogeneity metrics |
| `zipstrain utilities build-profile-db` | Build profile DB parquet |
| `zipstrain utilities build-genome-db` | Build local genome reference bundle from abundance table |
| `zipstrain utilities presence-profile` | Presence profile from coverage + read locations |
| `zipstrain utilities process-read-locs` | Process read-location stream |
| `zipstrain utilities generate_stb` | Create scaffold-to-genome map from genome files |
| `zipstrain utilities gene-range-table` | Create gene range table |
| `zipstrain utilities gene-loc-table` | Create gene-location table for scaffold list |
| `zipstrain test` | Validate local installation/dependencies |
zipstrain utilities build-genome-db
zipstrain utilities build-genome-db \
--tool sylph \
--abundance-table sylph_abundance.tsv \
--cache-dir genome_cache \
--output-dir .
Important options:
- `--download-retries` (default: 3)
- `--retry-backoff-seconds` (default: 1.0)
- `--download-workers` (default: 4)
Other Utility Commands
Use --help on each command for full option details:
zipstrain utilities build-null-model --help
zipstrain utilities merge_parquet --help
zipstrain utilities process_mpileup --help
zipstrain utilities make_bed --help
zipstrain utilities get_genome_lengths --help
zipstrain utilities genome_breadth_matrix --help
zipstrain utilities collect_breadth_tables --help
zipstrain utilities strain_heterogeneity --help
zipstrain utilities build-profile-db --help
zipstrain utilities presence-profile --help
zipstrain utilities process-read-locs --help
zipstrain utilities generate_stb --help
zipstrain utilities gene-range-table --help
zipstrain utilities gene-loc-table --help
zipstrain test --help
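To review several of these help pages offline, the help text can be saved to one file per command. The function below is a minimal sketch and not part of ZipStrain. `dump_utility_help` and the `ZIPSTRAIN_BIN` override are hypothetical names introduced here, and the function covers only `zipstrain utilities` subcommands.

```shell
# Hypothetical helper: save the --help text of each named
# `zipstrain utilities` subcommand to <out_dir>/<command>.txt.
# ZIPSTRAIN_BIN is an assumed override for the executable name.
dump_utility_help() {
    out_dir=$1
    shift
    mkdir -p "$out_dir" || return 1
    for cmd in "$@"; do
        "${ZIPSTRAIN_BIN:-zipstrain}" utilities "$cmd" --help \
            > "$out_dir/$cmd.txt" || return 1
    done
}

# Example use:
# dump_utility_help help_text merge_parquet make_bed generate_stb
```

The function stops at the first subcommand whose help call fails, so a misspelled command name surfaces right away.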