Metagenomics classification and clinical diagnosis of brain infections

Florian Breitwieser
STAMPS 2016

Summary

Metagenomics sequencing for the diagnosis of neuropathological infections
centrifuge: Novel classification engine for microbial sequences
pavian: Interface for analyzing metagenomics data (‘PAthogen VIsualization ANd more’)

Sequencing for diagnosis of infections



  • > 50% of infections remain undiagnosed
  • Encephalitis
    • about 20k cases / year in the U.S.
    • mortality rate > 5%
  • infectious causes: viral, bacterial, fungal, parasitic
Sequencing can enable fast and ‘unbiased’ identification of pathogens.

Potentially: Find unknown unknowns (zoonotic pathogens).

Sequencing diagnosis of neuropathological infections

  • Ten patients with suspected neuropathological infections
  • Negative results with standard methods
  • Brain or spinal cord biopsies

  • Whole metagenome sequencing on a MiSeq
  • Computational pipeline to analyze and compare samples

Metagenomics classification

Existing classifiers have at least one of these drawbacks:
  • long runtime, or
  • a very large index (50-100 GB, e.g. Kraken, CLARK, kallisto), or
  • only part of each genome indexed (e.g. MetaPhlAn2, mOTU)
Trade-offs: speed, sensitivity, memory requirements.

Aim: a fast and sensitive engine that can run on a desktop.

Centrifuge for microbial classification

  • full genomes, compressed on the species level
  • based on FM index
  • small database: < 4GB for all bacterial genomes
  • sensitivity and precision comparable to best programs
  • very fast

https://github.com/infphilo/centrifuge
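
To illustrate what "based on FM index" means, here is a toy sketch of FM-index construction and backward search in Python. It is not Centrifuge's implementation: the function names `build_fm_index` and `find` are hypothetical, the suffix array is built by naive sorting, and nothing is compressed, so it only suits short example strings.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM index for `text` (naive construction, illustration only)."""
    text += "$"  # sentinel; sorts before A, C, G, T
    # Suffix array by naive sorting of all suffixes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # C[c]: number of characters in the text that are strictly smaller than c.
    counts = Counter(text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def find(pattern, sa, C, occ):
    """Backward search: return the sorted start positions of `pattern`."""
    lo, hi = 0, len(sa)  # half-open range of matching suffix-array rows
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    index = build_fm_index("GATTACAGATTACA")
    print(find("ATTA", *index))
```

The search cost grows with the pattern length rather than the text length, which is one reason FM-index-based tools can query large genome collections quickly.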

Pan-Genome compression


Many species have several completed genomes:
  • 268 Salmonella enterica genomes
  • 134 Escherichia coli genomes
  • 112 Mycobacterium tuberculosis genomes