MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes
- PMID: 27065984
- PMCID: PMC4814491
- DOI: 10.3389/fmicb.2016.00428
Abstract
The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data are difficult and time-consuming. Analyses show that environmental and human metagenomes include a significant amount of non-annotated sequences, a form of 'dark matter.' We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set, and applied it to the detection of sequences matching virophages or large and giant DNA viruses of the proposed order Megavirales. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and from 5 virophages were collected. The pipeline, named MG-Digger, was implemented as a set of Python scripts. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger annotated hundreds of metagenome sequences as best matching those of giant viruses. These sequences were most often similar to phycodnavirus or mimivirus sequences, but also included reads related to the more recently described pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing the selected reads with their best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome.
The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the presence and prevalence of giant viruses in the environment and the human body.
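The read-selection step described in the abstract (BLAST hits filtered by adjustable thresholds, then reported with a best match) can be illustrated with a minimal sketch. This is not the MG-Digger code: the function name `best_hits`, the default thresholds, and the example sequence names are hypothetical. It assumes BLAST results in the standard 12-column tabular format (`-outfmt 6`) with metagenome reads in the `qseqid` column; if the search is run the other way round, the grouping column would need to be swapped.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5, min_identity=0.0):
    """Return {read_id: hit} keeping the highest-bitscore hit per read.

    The thresholds are illustrative defaults (assumptions), not the
    values used by the MG-Digger authors.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST6_COLUMNS, row))
        # Apply the adjustable filters.
        if float(hit["evalue"]) > max_evalue:
            continue
        if float(hit["pident"]) < min_identity:
            continue
        # Keep only the best-scoring hit for each read.
        current = best.get(hit["qseqid"])
        if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
            best[hit["qseqid"]] = hit
    return best


# Hypothetical example data: two hits for read_1, one weak hit for read_2.
example = (
    "read_1\tmimivirus_polB\t85.0\t100\t15\t0\t1\t100\t200\t299\t1e-30\t120.0\n"
    "read_1\tphycodnavirus_polB\t70.0\t100\t30\t0\t1\t100\t50\t149\t1e-10\t60.0\n"
    "read_2\tpithovirus_mcp\t40.0\t90\t54\t0\t1\t90\t10\t99\t0.5\t25.0\n"
)

hits = best_hits(io.StringIO(example))
for read_id, hit in hits.items():
    print(read_id, hit["sseqid"], hit["evalue"])
```

In this example only `read_1` is retained, with `mimivirus_polB` as its best match; `read_2` is discarded because its e-value exceeds the threshold. Tightening `max_evalue` or raising `min_identity` trades sensitivity for fewer false positives.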
Keywords: Megavirales; bioinformatics; giant virus; metagenomes; mimivirus; pipeline.
Similar articles
- A New Zamilon-like Virophage Partial Genome Assembled from a Bioreactor Metagenome. Front Microbiol. 2015 Nov 27;6:1308. doi: 10.3389/fmicb.2015.01308. PMID: 26640459.
- Giant Viruses of Amoebas: An Update. Front Microbiol. 2016 Mar 22;7:349. doi: 10.3389/fmicb.2016.00349. PMID: 27047465. Review.
- Giant virus-related sequences in the 5300-year-old Ötzi mummy metagenome. Virus Genes. 2021 Apr;57(2):222-227. doi: 10.1007/s11262-021-01823-2. PMID: 33566217.
- Pithovirus sibericum, a new bona fide member of the "Fourth TRUC" club. Front Microbiol. 2015 Aug 4;6:722. doi: 10.3389/fmicb.2015.00722. PMID: 26300849.
- Mimivirus: leading the way in the discovery of giant viruses of amoebae. Nat Rev Microbiol. 2017 Apr;15(4):243-254. doi: 10.1038/nrmicro.2016.197. PMID: 28239153. Review.
Cited by
- Giant virus biology and diversity in the era of genome-resolved metagenomics. Nat Rev Microbiol. 2022 Dec;20(12):721-736. doi: 10.1038/s41579-022-00754-5. PMID: 35902763. Review.
- Computational Tools for the Analysis of Uncultivated Phage Genomes. Microbiol Mol Biol Rev. 2022 Jun 15;86(2):e0000421. doi: 10.1128/mmbr.00004-21. PMID: 35311574. Review.
- Morphological and Genomic Features of the New Klosneuvirinae Isolate Fadolivirus IHUMI-VV54. Front Microbiol. 2021 Sep 21;12:719703. doi: 10.3389/fmicb.2021.719703. PMID: 34621250.
- ViralRecall-A Flexible Command-Line Tool for the Detection of Giant Virus Signatures in 'Omic Data. Viruses. 2021 Jan 20;13(2):150. doi: 10.3390/v13020150. PMID: 33498458.
- Fifteen Marseilleviruses Newly Isolated From Three Water Samples in Japan Reveal Local Diversity of Marseilleviridae. Front Microbiol. 2019 May 24;10:1152. doi: 10.3389/fmicb.2019.01152. PMID: 31178850.