Front Microbiol. 2016 Mar 31:7:428.
doi: 10.3389/fmicb.2016.00428. eCollection 2016.

MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes

Jonathan Verneau et al. Front Microbiol.

Abstract

The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such large datasets are difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a 'dark matter.' We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set, and applied this tool to the detection of sequences matching virophages or large and giant DNA viruses of the proposed order Megavirales. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and from 5 virophages were collected. The pipeline, named MG-Digger, was implemented as scripts written in Python. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate hundreds of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome.
The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the presence and prevalence of giant viruses in the environment and the human body.
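
The "best match identification" step described in the abstract can be illustrated with a minimal Python sketch. This is not the MG-Digger code, which is not reproduced here. It assumes that the BLAST searches were saved in the default 12-column tabular format of BLAST+ (`-outfmt 6`). The function names, the 1e-5 e-value cutoff, the subject identifiers, and the `FAMILY_OF` lookup table are all hypothetical and serve only as an example.

```python
import csv
import io
from collections import Counter

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-5):
    """Keep, for each read, the hit with the highest bit score.

    Hits whose e-value exceeds max_evalue (an assumed cutoff) are ignored.
    Returns {read_id: (subject_id, evalue, bitscore)}.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        evalue = float(rec["evalue"])
        bitscore = float(rec["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(rec["qseqid"])
        if current is None or bitscore > current[2]:
            best[rec["qseqid"]] = (rec["sseqid"], evalue, bitscore)
    return best


def count_by_family(best, family_of):
    """Count reads whose best hit belongs to a family listed in family_of."""
    counts = Counter()
    for subject_id, _evalue, _bitscore in best.values():
        family = family_of.get(subject_id)
        if family is not None:
            counts[family] += 1
    return counts


# Hypothetical subject identifiers and family assignments, for illustration only.
FAMILY_OF = {"mimivirus_gp1": "Mimiviridae", "mimivirus_gp2": "Mimiviridae"}

EXAMPLE = "\n".join([
    "read1\tmimivirus_gp1\t85.0\t100\t15\t0\t1\t100\t1\t100\t1e-30\t150",
    "read1\tbacteria_x\t70.0\t100\t30\t0\t1\t100\t1\t100\t1e-10\t80",
    "read2\tmimivirus_gp2\t60.0\t50\t20\t0\t1\t50\t1\t50\t1e-3\t40",
    "read3\tbacteria_y\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t200",
    "read3\tmimivirus_gp2\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t120",
])

hits = best_hits(EXAMPLE)
family_counts = count_by_family(hits, FAMILY_OF)
```

In this toy input, only `read1` would be counted as giant virus-related: `read2` fails the assumed e-value cutoff, and the best hit of `read3` is a non-viral sequence. Ranking by bit score is one common way to define a best hit; ranking by e-value is another, and the abstract does not say which criterion MG-Digger applies.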

Keywords: Megavirales; bioinformatics; giant virus; metagenomes; mimivirus; pipeline.

Figures

FIGURE 1. Flowchart of the MG-Digger tool.

FIGURE 2. Screenshot of the MG-Digger graphical user interface.

FIGURE 3. Distribution of Megavirales member- and virophage-related reads identified by MG-Digger in metagenomes generated from human sera and sewage (A; Loh et al., 2009), plasma from patients with liver diseases (B; Law et al., 2013), and water from the Indian Ocean (C; Williamson et al., 2012).

FIGURE 4. Comparison of the MG-Digger and METAVIR 2 results for the environmental metagenome dataset generated in the study by Williamson et al. (2012). (A) Venn diagram of the number of reads identified by the two tools as having a Megavirales member or virophage sequence as best hit. (B) Distribution of hits according to the viral families found as best hit by the two tools.
