A Deep learning based approach for simultaneous host extraction and multi-class classification

Date

2024

Publisher

IEEE

Abstract

Understanding the behavioral patterns and functions of microorganisms in different environments is crucial to clarifying their impact on human health, environmental sustainability, and related areas. Metagenomics, enabled by advances in computational techniques, analyzes genetic material directly from environmental samples and removes the need to culture individual organisms. However, the heterogeneity of metagenomic datasets poses computational challenges. One major problem is that host DNA can overshadow microbial DNA, which degrades the quality of downstream analysis. In addition, existing tools are often optimized for specific microorganisms, which creates a need for multi-class classification tools. The proposed tool is a CNN-based approach that addresses these challenges by separating host sequences and classifying microbial samples into five classes: bacteria, fungi, archaea, protozoa, and viruses. It also allows users to fine-tune the model with a new host, if needed, and optimize host extraction. In our evaluation, the proposed tool outperformed previously published approaches.
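
As a rough illustration of the kind of input and output such a classifier works with, the sketch below one-hot encodes a DNA read into a fixed-size matrix and defines a six-way label set (host plus the five microbial classes named above). The abstract does not state how sequences are encoded or how host reads are labeled, so the encoding, the fixed read length, and the names `CLASSES`, `one_hot_encode`, and `is_host` are all assumptions made for illustration, not details of the proposed tool.

```python
# Illustrative sketch only: the encoding and label names are assumptions,
# not details taken from the paper.

# Hypothetical label set: one host class plus the five microbial classes
# named in the abstract.
CLASSES = ["host", "bacteria", "fungi", "archaea", "protozoa", "viruses"]

# One-hot columns in A, C, G, T order; any other symbol (e.g. N)
# maps to an all-zero row.
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def one_hot_encode(read, length):
    """Return a length x 4 matrix (list of lists) for a DNA read.

    Reads longer than `length` are truncated; shorter reads are
    padded with all-zero rows.
    """
    matrix = []
    for base in read.upper()[:length]:
        row = [0, 0, 0, 0]
        index = BASE_INDEX.get(base)
        if index is not None:
            row[index] = 1
        matrix.append(row)
    # Pad so that every read has the same shape.
    while len(matrix) < length:
        matrix.append([0, 0, 0, 0])
    return matrix


def is_host(predicted_class):
    """Host reads would be filtered out; the rest keep their microbial label."""
    return predicted_class == "host"
```

A matrix of this shape is a common input format for a 1D convolutional network, and treating the host as one extra class is one simple way to combine host extraction with multi-class classification in a single model. Whether the proposed tool does either is not stated in the abstract.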
