Title
Efficient and Interpretable Machine Learning Algorithms for Predictive Analyses in Metagenomic Data

Author
Rahman, Mohammad Arifur

Subject
Artificial intelligence,Bioinformatics,Computer science,Epidemiology,Genetics,Microbiology

Classification

Library
Center and Library of Islamic Studies in European Languages

Location
Province: Qom, City: Qom

Library contact: 32910706-025

NATIONAL BIBLIOGRAPHY NUMBER

Number
TLpq2476542411

LANGUAGE OF THE ITEM

Language of Text, Soundtrack, etc.
English

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper
Efficient and Interpretable Machine Learning Algorithms for Predictive Analyses in Metagenomic Data
General Material Designation
[Thesis]
First Statement of Responsibility
Rahman, Mohammad Arifur
Subsequent Statement of Responsibility
Rangwala, Huzefa

PUBLICATION, DISTRIBUTION, ETC.

Name of Publisher, Distributor, etc.
George Mason University
Date of Publication, Distribution, etc.
2020

PHYSICAL DESCRIPTION

Specific Material Designation and Extent of Item
164

DISSERTATION (THESIS) NOTE

Dissertation or thesis details and type of degree
Ph.D.
Body granting the degree
George Mason University
Text preceding or following the note
2020

SUMMARY OR ABSTRACT

Text of Note
Advances in DNA sequencing technologies have enabled the direct investigation of the microbiome. The microbiome refers to all the microorganisms, such as bacteria and viruses, present as a community in a host. Researchers and clinicians have begun studying the role of these microorganisms in human health and disease. Most existing approaches first identify the microbial abundance in a sample using sequence databases of known microorganisms and then use the abundance values as features for predicting diseases such as liver cirrhosis and type 2 diabetes. Taxonomic profiling and abundance quantification are computationally expensive, introduce bias into subsequent predictions, and ignore a large amount of the data produced by Next Generation Sequencing (NGS) technologies. Moreover, most microbes have not been laboratory-cultured and thus remain unknown, and existing approaches do not account for novel and unknown microorganisms. The lack of efficient analytical methods that overcome these limitations impedes the identification of the presence and functions of microbial organisms within different clinical and environmental samples. Hence, there is a need to develop scalable analytical algorithms for large-scale DNA sequence data, i.e., metagenomic data, to discover the microbiome, perform taxonomic profiling, quantify species abundance, and predict diseases. In this thesis, I develop Multiple Instance Learning (MIL) based algorithms to predict diseases from large-scale metagenomic data. MIL is a supervised classification approach that considers a single sample as a group of relevant data instances rather than as one single instance. In addition to predicting diseases, our proposed approaches can identify the individual microbial DNA sequences that are indicative of the diseases.
We hypothesize that an optimized solution to the MIL formulation of the problem will predict diseases more accurately than existing approaches by utilizing the available DNA sequence data and avoiding the inherent bias from the microbial profiling process. To ensure that the proposed algorithms can scale to the large volume of input sequences obtained from a metagenomic sample, we propose efficient canopy-based clustering solutions that can be integrated within the prediction pipeline. We evaluate the proposed algorithms on several clinical benchmarks and show improved prediction performance in terms of identifying clinical phenotypes, reporting interpretable results for clinicians, and ensuring scalable implementations.

TOPICAL NAME USED AS SUBJECT

Artificial intelligence
Bioinformatics
Computer science
Epidemiology
Genetics
Microbiology

PERSONAL NAME - PRIMARY RESPONSIBILITY

Rahman, Mohammad Arifur
Rangwala, Huzefa

ELECTRONIC LOCATION AND ACCESS

Electronic name
Read the full text
