14 August 2012:

(tested with Nesoni 0.80)


A short-read assembler for antigenic variant sequences in bacteria.

The input should be high-throughput sequencing reads of DNA obtained
by PCR amplification of a small region (several kilobases) that is
believed to undergo antigenic variation.


Requirements

- Python 2, version 2.6 or higher
  (some components work much faster in PyPy)

- nesoni

- matplotlib 
  (only needed for some tools)

Example usage

mkdir output

nesoni clip: output/negative --match 1 --length 50 pairs: data/negative_R?.fastq.gz

pypy new: output/negative --k 50

pypy load: output/negative \
    interleaved: output/negative_paired.fq.gz \
    reads: output/negative_single.fq.gz

nesoni clip: output/positive --match 1 --length 50 pairs: data/positive_R?.fastq.gz

pypy new: output/positive --k 50

pypy load: output/positive \
    interleaved: output/positive_paired.fq.gz \
    reads: output/positive_single.fq.gz

pypy assemble: output/assembly \
    samples: output/negative output/positive

pypy select: output/selection output/assembly.fa data/reference.fa \
    samples: output/negative output/positive

python validate: output/validation output/selection.fa \
    samples: output/negative output/positive

python quantify: output/quantification output/selection.fa \
    samples: output/negative output/positive

# If you are not interested in SNPs:

pypy assemble: --skip-snps yes \
    output/assembly-nosnp \
    samples: output/negative output/positive

pypy select: --snp-weight 0.0 \
    output/selection-nosnp output/assembly-nosnp.fa data/reference.fa \
    samples: output/negative output/positive
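
The clip, new and load steps in the example are the same for every sample; only the sample name changes. As a rough sketch, the shell function below prints those three commands for a given sample name so they can be reviewed or pasted. The function name sample_commands is hypothetical and not part of this software. The command text is copied from the example above, and the file naming (data/NAME_R?.fastq.gz) is assumed to follow the same pattern for other samples.

```shell
#!/bin/sh
# Hypothetical helper (not part of the tool): print the per-sample
# commands from the example for one sample name. Nothing is executed.
sample_commands() {
    sample="$1"
    # The "?" stays literal here because the strings are double-quoted.
    printf '%s\n' \
        "nesoni clip: output/$sample --match 1 --length 50 pairs: data/${sample}_R?.fastq.gz" \
        "pypy new: output/$sample --k 50" \
        "pypy load: output/$sample interleaved: output/${sample}_paired.fq.gz reads: output/${sample}_single.fq.gz"
}

# Print the commands for both samples used in the example.
for sample in negative positive; do
    sample_commands "$sample"
done
```

To handle additional samples, add their names to the for loop, assuming their read files follow the same naming pattern.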