Unit – I
Introduction: The need for parallelism, Forms of parallelism (SISD, SIMD, MISD, MIMD), Moore's Law and Multi-cores, Fundamentals of Parallel Computers, Communication architecture, Message passing architecture, Data parallel architecture, Dataflow architecture, Systolic architecture, Performance Issues.
Unit – II
Large Cache Design: Shared vs. Private Caches, Centralized vs. Distributed Shared Caches, Snooping-based cache coherence protocol, Directory-based cache coherence protocol, Uniform Cache Access, Non-Uniform Cache Access, D-NUCA, S-NUCA, Inclusion, Exclusion, Difference between transaction and transactional memory, STM, HTM.
Unit – III
Graphics Processing Unit: GPUs as Parallel Computers, Architecture of a modern GPU, Evolution of Graphics Pipelines, GPGPUs, Scalable GPUs, Architectural characteristics of Future Systems, Implications of Technology and Architecture for users, Vector addition, Applications of GPU.
Unit – IV
Introduction to Parallel Programming: Strategies, Mechanisms, Performance theory, Parallel Programming Patterns: Nesting pattern, Parallel Control Pattern, Parallel Data Management, Map: Scaled Vector, Mandelbrot, Collectives: Reduce, Fusing Map and Reduce, Scan, Fusing Map and Scan, Data Reorganization: Gather, Scatter, Pack, Stencil and Recurrence, Fork-Join, Pipeline.
Unit – V
Parallel Programming Languages: Distributed Memory Programming with MPI: Trapezoidal rule in MPI, I/O handling, MPI derived datatypes, Collective Communication, Shared Memory Programming with Pthreads: Condition Variables, Read-write locks, Cache handling, Shared Memory Programming with OpenMP: Parallel for directive, Scheduling loops, Thread Safety, CUDA: Parallel programming in CUDA C, Thread management, Constant memory and Events, Graphics Interoperability, Atomics, Streams.