Synopsis: Wombac rapidly finds core genome SNPs from a set of samples and produces an alignment of those SNPs, which can be used to build a phylogenomic tree. It can handle hundreds of samples and makes efficient use of multiple CPUs on a single system. Computations can be reused when new samples are added and a new tree is built, which saves a lot of time. Wombac only looks for substitution SNPs, not indels, and it may miss some SNPs, but it will find enough to build high-resolution trees.
Input: Wombac needs a reference genome in FASTA format (can be in multiple contigs) and a series of samples. A sample can be any of:
- a folder containing FASTQ short reads: eg. R1.fq.gz R2.fq.gz
- a multi-FASTA file: eg. contigs.fa or NC_273461.fna
- a .tar.gz file containing FASTA contig files: eg. Ecoli_K12mut.contig.tar.gz (from EBI/NCBI)
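The three sample forms listed above can be told apart from the path alone. The following is a minimal sketch of that idea and is not Wombac's own detection logic: the function name `classify_sample` and the returned labels are hypothetical, and it assumes a FASTA file can be recognised by a leading `>` character.

```python
import os


def classify_sample(path: str) -> str:
    """Guess which of the three documented sample forms a path is.

    Illustrative only: the labels are made up for this sketch and
    Wombac itself may decide differently.
    """
    # A folder is assumed to hold FASTQ short reads (e.g. R1.fq.gz R2.fq.gz).
    if os.path.isdir(path):
        return "reads-folder"
    # A .tar.gz archive is assumed to hold FASTA contig files.
    if path.endswith(".tar.gz"):
        return "contigs-tarball"
    # A regular file whose first byte is '>' is treated as (multi-)FASTA.
    if os.path.isfile(path):
        with open(path, "rb") as handle:
            if handle.read(1) == b">":
                return "multi-fasta"
    return "unknown"
```

Called on the inputs from a run such as `UPEC/`, `EHEC.contigs.fa` and `K12mut.contigs.tar.gz`, this would sort them into the folder, multi-FASTA and tarball cases respectively, provided those paths exist as described.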
Output: Wombac produces standards-compliant output files: BAM, VCF (per sample) and an overall .ALN (FASTA aligned core SNPs).
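Because the overall alignment is plain aligned FASTA, it can be inspected with a few lines of code before building a tree. The sketch below is an assumption-laden illustration, not part of Wombac: `read_fasta` and `snp_distances` are hypothetical helper names, and it assumes every sequence in the alignment has the same length (one column per core SNP).

```python
from itertools import combinations


def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into {name: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sample name.
            words = line[1:].split()
            name = words[0] if words else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def snp_distances(alignment):
    """Count differing columns for every pair of aligned sequences."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("sequences in an alignment must have equal length")
    return {
        (a, b): sum(x != y for x, y in zip(alignment[a], alignment[b]))
        for a, b in combinations(sorted(alignment), 2)
    }
```

A possible use, assuming the output folder from a run is `Tree`: `with open("Tree/core.aln") as fh: print(snp_distances(read_fasta(fh)))`.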
Download: wombac-2.0.tar.gz - 27 Jan 2015 - GitHub
% ls -R
K12.fna
EcPoo.fasta
EHEC.contigs.fa
UPEC/R1.fq.gz UPEC/R2.fq.gz
EPEC/R1.fastq EPEC/R2.fastq
APEC/s_1_sequence.txt
K12mut.contigs.tar.gz

% wombac --outdir Tree --ref K12.fna --run EcPoo.fasta EHEC.contigs.fa UPEC/ EPEC/ APEC/ K12mut.contigs.tar.gz
(wait a while)

% SplitsTree -i Tree/core.aln
(play with tree)