Title

Accelerators for convolutional neural networks

Author

/ Arslan Munir, Joonho Kong, Mahmood Azhar Qureshi

Subject

Neural networks (Computer science),a04

Classification

Library

Library of College of Science University of Tehran

Location

Province: Tehran - City: Tehran

Library contact: 61112616-66495290-021

INTERNATIONAL STANDARD BOOK NUMBER

Number (ISBN)

9781394171880

NATIONAL BIBLIOGRAPHY NUMBER

Number

E4287

LANGUAGE OF THE ITEM

Language of Text, Soundtrack etc.

English

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper

Accelerators for convolutional neural networks

General Material Designation

[Electronic book]

First Statement of Responsibility

/ Arslan Munir, Joonho Kong, Mahmood Azhar Qureshi

PUBLICATION, DISTRIBUTION, ETC.

Place of Publication, Distribution, etc.

Hoboken, New Jersey

Name of Publisher, Distributor, etc.

: John Wiley & Sons, Inc.,

Date of Publication, Distribution, etc.

, 2024.

PHYSICAL DESCRIPTION

Specific Material Designation and Extent of Item

1 online resource

CONTENTS NOTE

Text of Note

About the Authors xiii -- Preface xv -- Part I Overview 1 -- 1 Introduction 3 -- 1.1 History and Applications 5 -- 1.2 Pitfalls of High-Accuracy DNNs/CNNs 6 -- 1.2.1 Compute and Energy Bottleneck 6 -- 1.2.2 Sparsity Considerations 9 -- 1.3 Chapter Summary 11 -- 2 Overview of Convolutional Neural Networks 13 -- 2.1 Deep Neural Network Architecture 13 -- 2.2 Convolutional Neural Network Architecture 15 -- 2.3 Popular CNN Models 26 -- 2.4 Popular CNN Datasets 30 -- 2.5 CNN Processing Hardware 31 -- 2.6 Chapter Summary 37 -- Part II Compressive Coding for CNNs 39 -- 3 Contemporary Advances in Compressive Coding for CNNs 41 -- 3.1 Background of Compressive Coding 41 -- 3.2 Compressive Coding for CNNs 43 -- 3.3 Lossy Compression for CNNs 43 -- 3.4 Lossless Compression for CNNs 44 -- 3.5 Recent Advancements in Compressive Coding for CNNs 48 -- 3.6 Chapter Summary 50 -- 4 Lossless Input Feature Map Compression 51 -- 4.1 Two-Step Input Feature Map Compression Technique 52 -- 4.2 Evaluation 55 -- 4.3 Chapter Summary 57 -- 5 Arithmetic Coding and Decoding for 5-Bit CNN Weights 59 -- 5.1 Architecture and Design Overview 60 -- 5.2 Algorithm Overview 63 -- 5.3 Weight Decoding Algorithm 67 -- 5.4 Encoding and Decoding Examples 69 -- 5.5 Evaluation Methodology 74 -- 5.6 Evaluation Results 75 -- 5.7 Chapter Summary 84 -- Part III Dense CNN Accelerators 85 -- 6 Contemporary Dense CNN Accelerators 87 -- 6.1 Background on Dense CNN Accelerators 87 -- 6.2 Representation of the CNN Weights and Feature Maps in Dense Format 87 -- 6.3 Popular Architectures for Dense CNN Accelerators 89 -- 6.4 Recent Advancements in Dense CNN Accelerators 92 -- 6.5 Chapter Summary 93 -- 7 iMAC: Image-to-Column and General Matrix Multiplication-Based Dense CNN Accelerator 95 -- 7.1 Background and Motivation 95 -- 7.2 Architecture 97 -- 7.3 Implementation 99 -- 7.4 Chapter Summary 100 -- 8 NeuroMAX: A Dense CNN Accelerator 101 -- 8.1 Related Work 102 -- 8.2 Log Mapping 103 -- 8.3 Hardware Architecture 105 --
8.4 Data Flow and Processing 108 -- 8.5 Implementation and Results 118 -- 8.6 Chapter Summary 124 -- Part IV Sparse CNN Accelerators 125 -- 9 Contemporary Sparse CNN Accelerators 127 -- 9.1 Background of Sparsity in CNN Models 127 -- 9.2 Background of Sparse CNN Accelerators 128 -- 9.3 Recent Advancements in Sparse CNN Accelerators 131 -- 9.4 Chapter Summary 133 -- 10 CNN Accelerator for In Situ Decompression and Convolution of Sparse Input Feature Maps 135 -- 10.1 Overview 135 -- 10.2 Hardware Design Overview 135 -- 10.3 Design Optimization Techniques Utilized in the Hardware Accelerator 140 -- 10.4 FPGA Implementation 141 -- 10.5 Evaluation Results 143 -- 10.6 Chapter Summary 149 -- 11 Sparse-PE: A Sparse CNN Accelerator 151 -- 11.1 Related Work 155 -- 11.2 Sparse-PE 156 -- 11.3 Implementation and Results 174 -- 11.4 Chapter Summary 184 -- 12 Phantom: A High-Performance Computational Core for Sparse CNNs 185 -- 12.1 Related Work 189 -- 12.2 Phantom 190 -- 12.3 Phantom-2D 201 -- 12.4 Experiments and Results 209 -- 12.5 Chapter Summary 218 -- Part V HW/SW Co-Design and Co-Scheduling for CNN Acceleration 221 -- 13 State-of-the-Art in HW/SW Co-Design and Co-Scheduling for CNN Acceleration 223 -- 13.1 HW/SW Co-Design 223 -- 13.2 HW/SW Co-Scheduling 228 -- 13.3 Chapter Summary 230 -- 14 Hardware/Software Co-Design for CNN Acceleration 231 -- 14.1 Background of iMAC Accelerator 231 -- 14.2 Software Partition for iMAC Accelerator 232 -- 14.3 Experimental Evaluations 235 -- 14.4 Chapter Summary 237 -- 15 CPU-Accelerator Co-Scheduling for CNN Acceleration 239 -- 15.1 Background and Preliminaries 240 -- 15.2 CNN Acceleration with CPU-Accelerator Co-Scheduling 242 -- 15.3 Experimental Results 251 -- 15.4 Chapter Summary 257 -- 16 Conclusions 259 -- References 265 -- Index 285

SUMMARY OR ABSTRACT

Text of Note

Accelerators for Convolutional Neural Networks: a comprehensive and thorough resource exploring different types of convolutional neural networks and complementary accelerators.

Accelerators for Convolutional Neural Networks provides basic deep learning knowledge and instructive content for building convolutional neural network (CNN) accelerators, aimed at Internet of Things (IoT) and edge computing practitioners. It elucidates compressive coding for CNNs, presents a two-step lossless compression method for input feature maps, discusses an arithmetic coding-based lossless weight compression method and the design of an associated decoding method, describes contemporary sparse CNNs that consider sparsity in both weights and activation maps, and discusses hardware/software co-design and co-scheduling techniques that can lead to better optimization and utilization of the available hardware resources for CNN acceleration. The first part of the book provides an overview of CNNs along with the composition and parameters of different contemporary CNN models. Later chapters focus on compressive coding for CNNs and the design of dense CNN accelerators. The book also provides directions for future research and development for CNN accelerators.
Other sample topics covered in Accelerators for Convolutional Neural Networks include: how to apply arithmetic coding and decoding with range scaling for lossless compression of 5-bit CNN weights, to deploy CNNs in extremely resource-constrained systems; state-of-the-art research surrounding dense CNN accelerators, which are mostly based on systolic arrays or parallel multiply-accumulate (MAC) arrays; the iMAC dense CNN accelerator, which combines image-to-column (im2col) and general matrix multiplication (GEMM) hardware acceleration; a multi-threaded, low-cost, log-based processing element (PE) core, instances of which are stacked in a spatial grid to form the NeuroMAX dense accelerator; and Sparse-PE, a multi-threaded and flexible CNN PE core that exploits sparsity in both weights and activation maps, instances of which can be stacked in a spatial grid to form sparse CNN accelerators.

For researchers in AI, computer vision, computer architecture, and embedded systems, along with graduate and senior undergraduate students in related programs of study, Accelerators for Convolutional Neural Networks is an essential resource for understanding the many facets of the subject and relevant applications.

TYPE OF ELECTRONIC RESOURCE NOTE

Text of Note

PDF file.

TOPICAL NAME USED AS SUBJECT

Entry Element

Neural networks (Computer science)

a04

PERSONAL NAME - PRIMARY RESPONSIBILITY

Munir, Arslan

PERSONAL NAME - ALTERNATIVE RESPONSIBILITY

Kong, Joonho

Qureshi, Mahmood Azhar

ORIGINATING SOURCE

Country

Iran

Agency

University of Tehran. Library of College of Science

ELECTRONIC LOCATION AND ACCESS

Date and Hour of Consultation and Access

UT_SCI_BL_DB_1004434_0001.pdf

278840
