• Home
  • Advanced Search
  • Directory of Libraries
  • About lib.ir
  • Contact Us
  • History

عنوان
Power, Performance and Scalability for Big Data Query Languages:

پدید آورنده
Wang, Jin

موضوع

رده

کتابخانه
Center and Library of Islamic Studies in European Languages

محل استقرار
استان: Qom ـ شهر: Qom

Center and Library of Islamic Studies in European Languages

تماس با کتابخانه : 32910706-025

NATIONAL BIBLIOGRAPHY NUMBER

Number
TL8m50s7jz

LANGUAGE OF THE ITEM

.Language of Text, Soundtrack etc
انگلیسی

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper
Power, Performance and Scalability for Big Data Query Languages:
General Material Designation
[Thesis]
First Statement of Responsibility
Wang, Jin
Title Proper by Another Author
The Machine Learning Challenge
Subsequent Statement of Responsibility
Zaniolo, Carlo

.PUBLICATION, DISTRIBUTION, ETC

Name of Publisher, Distributor, etc.
UCLA
Date of Publication, Distribution, etc.
2020

DISSERTATION (THESIS) NOTE

Body granting the degree
UCLA
Text preceding or following the note
2020

SUMMARY OR ABSTRACT

Text of Note
In the Big Data era, there is a resurgence of interest in using Datalog to express data analysis applications that require recursive computations. However, the use of non-monotonic aggregates in recursion raises difficult semantic issues. Recent theoretical advances like monotonic aggregation and Pre-Mapability (PreM) provide the formal semantics for the usage of aggregates in recursive Datalog rules enabling the expression of a wide spectrum of advanced analytical tasks, such as graph analysis, data mining, machine learning and stream processing. In this dissertation, we explore opportunities and issues created by these advances, including the expressiveness of Datalog in advanced applications and their optimization to achieve superior performance and scalability. Firstly, we find that Datalog serves as an efficient query language that simplifies the writing of machine learning applications and provides a unified environment for their development and deployment on multiple platforms. Following this route, we propose a declarative machine learning framework of tested effectiveness on top of Apache Spark. We present an in-depth theoretical analysis that shows how key ML algorithms can be expressed and efficiently implemented by recursive Datalog programs that use aggregates in recursion, whereby achieving both formal and efficient operational semantics. We also present the compilation and optimization techniques we developed to support the complex recursive queries required by ML applications in distributed share-nothing architectures. Next we share some theoretical results to show that programs computing any aggregates on sets of facts of predictable cardinality are equivalent to stratified programs where the pre-computation of cardinality of the set is followed by a stratum where recursive rules only use monotonic constructs. Finally, we investigate how to improve the parallelism of semi-naive evaluation of recursive Datalog programs on shared-memory multi-core machines, and discuss the prototype system we have developed and the high performance levels it delivers.

PERSONAL NAME - PRIMARY RESPONSIBILITY

Wang, Jin

PERSONAL NAME - SECONDARY RESPONSIBILITY

Zaniolo, Carlo

CORPORATE BODY NAME - SECONDARY RESPONSIBILITY

UCLA

ELECTRONIC LOCATION AND ACCESS

Electronic name
 مطالعه متن کتاب 

p

[Thesis]
276903

a
Y

Proposal/Bug Report

Warning! Enter The Information Carefully
Send Cancel
This website is managed by Dar Al-Hadith Scientific-Cultural Institute and Computer Research Center of Islamic Sciences (also known as Noor)
Libraries are responsible for the validity of information, and the spiritual rights of information are reserved for them
Best Searcher - The 5th Digital Media Festival