Parallel Computer Architecture

Credits: 3

Objective

  • To understand the principles of parallel computer architecture

  • To understand the design of parallel computer systems including modern parallel architectures

  • To assess the communication and computing possibilities of parallel system architecture and to predict the performance of parallel applications

 

Unit – I Fundamentals of Computer Design

Defining Computer Architecture – Trends in Technology – Trends in Power in Integrated Circuits – Trends in Cost – Dependability – Measuring, Reporting and Summarizing Performance – Quantitative Principles of Computer Design – Basic and Intermediate Concepts of Pipelining – Pipeline Hazards – Pipelining Implementation Issues

 

Unit – II Instruction-Level Parallelism and Its Exploitation

Instruction-Level Parallelism: Concepts and Challenges – Basic Compiler Techniques for Exposing ILP – Reducing Branch Costs with Prediction – Overcoming Data Hazards with Dynamic Scheduling – Dynamic Scheduling: Algorithm and Examples – Hardware-Based Speculation – Exploiting ILP Using Multiple Issue and Static Scheduling – Exploiting ILP Using Dynamic Scheduling, Multiple Issue and Speculation – Studies of the Limitations of ILP – Limitations on ILP for Realizable Processors – Hardware versus Software Speculation – Using ILP Support to Exploit Thread-Level Parallelism

 

Unit – III Data-Level and Thread-Level Parallelism

Vector Architecture – SIMD Instruction Set Extensions for Multimedia – Graphics Processing Units – Detecting and Enhancing Loop-Level Parallelism – Centralized Shared-Memory Architectures – Performance of Shared-Memory Multiprocessors – Distributed Shared Memory and Directory Based Coherence – Basics of Synchronization – Models of Memory Consistency – Programming Models and Workloads for Warehouse-Scale Computers – Computer Architecture of Warehouse-Scale Computers – Physical Infrastructure and Costs of Warehouse-Scale Computers

 

Unit – IV Memory Hierarchy Design

Cache Performance – Six Basic Cache Optimizations – Virtual Memory – Protection and Examples of Virtual Memory – Ten Advanced Optimizations of Cache Performance – Memory Technology and Optimizations – Protection: Virtual Memory and Virtual Machines – The Design of Memory Hierarchies

 

Unit – V Storage Systems & Case Studies

Advanced Topics in Disk Storage – Definition and Examples of Real Faults and Failures – I/O Performance, Reliability Measures and Benchmarks – Designing and Evaluating an I/O System – The Internet Archive Cluster

Case Studies / Lab Exercises: Intel i3, i5 and i7 processor cores, NVIDIA GPUs, AMD and ARM processor cores – Simulators: gem5, CACTI, Simics and Multi2Sim – Intel software development tools

Outcome

  • Students will be familiar with the representation of data, addressing modes, and instruction sets

  • Students will be able to understand parallelism in terms of both a single processor and multiple processors

  • Students will gain technical know-how of parallel hardware constructs, including instruction-level parallelism, for multi-core processor design

 

Text Books

  1. John L. Hennessy, David A. Patterson, "Computer Architecture: A Quantitative Approach", Elsevier, 5th Edition, 2012.

  2. Kai Hwang, Naresh Jotwani, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", Tata McGraw Hill, 2nd Edition, 2010.