Title

Homology Based Sequence Alignment and Annotation Algorithms

Author

Amin, Mohammad Ruhul

Subject

Artificial intelligence,Bioinformatics,Computer science

Classification

Library

Center and Library of Islamic Studies in European Languages

Location

Province: Qom, City: Qom

Library contact: 32910706-025

NATIONAL BIBLIOGRAPHY NUMBER

Number

TL51620

LANGUAGE OF THE ITEM

Language of Text, Soundtrack, etc.

English

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper

Homology Based Sequence Alignment and Annotation Algorithms

General Material Designation

[Thesis]

First Statement of Responsibility

Amin, Mohammad Ruhul

Subsequent Statement of Responsibility

Skiena, Steven

PUBLICATION, DISTRIBUTION, ETC.

Name of Publisher, Distributor, etc.

State University of New York at Stony Brook

Date of Publication, Distribution, etc.

2019

GENERAL NOTES

Text of Note

97 p.

DISSERTATION (THESIS) NOTE

Dissertation or thesis details and type of degree

Ph.D.

Body granting the degree

State University of New York at Stony Brook

Text preceding or following the note

2019

SUMMARY OR ABSTRACT

Text of Note

Research in bioinformatics is driven by the need to analyze and interpret biological sequences. Analysis of biological sequences begins with alignment, while their interpretation in terms of biological function begins with annotation. With the rapid development of high-throughput genome sequencing techniques, alignment and annotation methods are also evolving. In this thesis, we discuss the shortcomings of current alignment and annotation methods and present novel algorithms with improved results. The Oxford Nanopore single-molecule sequencing technique generates long reads with higher sequencing error rates. Popular alignment algorithms each have drawbacks on such reads: LAST and BLAST take considerable processing time to align long reads at high sensitivity, BWA-MEM has the smallest average alignment length, and GraphMap aligns many random strings with moderate accuracy. We introduce a novel open-source read alignment tool, called NanoBLASTer, that includes several enhancements to maintain high sensitivity and high performance in the presence of high error rates. The advent of large-scale genome sequencing has proven a tremendous boost to research in the life sciences. However, published genomes have been shown to be very uneven in both sequence and annotation quality, with both declining dramatically as we enter the long tail of non-model organisms. We present methods to identify massive numbers of prokaryotic sequence annotation errors in public databases and demonstrate that homology and pattern matching techniques can be deployed to correct them. In summary, we have re-annotated 12,495 16S rRNA 3' ends, increasing the total number of prokaryotes with 16S rRNAs containing anti-Shine-Dalgarno (antiSD) sequences from 8,153 to 20,648, and increasing the number of organisms known to lack an antiSD from 15 to 128. Finally, we present DeepAnnotator, a deep learning method to solve the problem of genome annotation on a large scale. DeepAnnotator uses a recurrent neural network with long short-term memory (LSTM) to predict the start, stop, and coding sequences of a gene; a downstream algorithm then combines these scores to annotate genome sequences. DeepAnnotator establishes a generalized computational approach for genome annotation using deep learning and achieves an F-score of 94%.
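
The abstract does not describe how the pattern-matching step for 16S rRNA 3' ends is carried out. As a rough illustration only, and not the method used in the thesis, the idea of checking a 3' tail for an anti-Shine-Dalgarno core can be sketched as below. The function name `find_anti_sd`, the `tail_length` default of 20, and the choice of the bare `CCTCC` core as the search pattern are all assumptions made for this sketch.

```python
from typing import Optional

# Canonical anti-Shine-Dalgarno core (CCUCC), written in the DNA alphabet.
# Using only this exact 5-mer is a simplifying assumption of the sketch.
ANTI_SD_CORE = "CCTCC"


def find_anti_sd(rrna_16s: str, tail_length: int = 20) -> Optional[int]:
    """Return the 0-based index of the last anti-SD core found within the
    final `tail_length` bases of the sequence, or None if it is absent."""
    # Normalize case and accept RNA input by mapping U to T.
    seq = rrna_16s.upper().replace("U", "T")
    tail_start = max(0, len(seq) - tail_length)
    # rfind with a start offset only reports matches inside the 3' tail.
    pos = seq.rfind(ANTI_SD_CORE, tail_start)
    return pos if pos != -1 else None


if __name__ == "__main__":
    complete = "AAGTCGTAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTTA"
    truncated = "AAGTCGTAACAAGGTAACCGTAGG"  # 3' end cut off before the motif
    print(find_anti_sd(complete) is not None)
    print(find_anti_sd(truncated) is None)
```

A sequence whose annotated 3' end stops short of the motif returns `None` here, which is the kind of case a re-annotation pass would flag for closer inspection. An exact-match search like this cannot distinguish a truncated annotation from an organism that genuinely lacks the motif.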

UNCONTROLLED SUBJECT TERMS

Subject Term

Artificial intelligence

Subject Term

Bioinformatics

Subject Term

Computer science

PERSONAL NAME - PRIMARY RESPONSIBILITY

Amin, Mohammad Ruhul

PERSONAL NAME - SECONDARY RESPONSIBILITY

Skiena, Steven

CORPORATE BODY NAME - SECONDARY RESPONSIBILITY

State University of New York at Stony Brook

ELECTRONIC LOCATION AND ACCESS

Electronic name

[Thesis]

276903
