CSC 406 - Architecture of Parallel Computers
Catalog Description:
The need for parallel and massively parallel computers. Taxonomy of parallel computer architecture, and programming models for parallel architectures. Example parallel algorithms. Shared-memory vs. distributed-memory architectures. Correctness and performance issues. Cache coherence and memory consistency. Bus-based and scalable directory-based multiprocessors. Interconnection-network topologies and switch design. Brief overview of advanced topics such as multiprocessor prefetching and speculative parallel execution. Credit is not allowed for more than one course in this set: ECE 406, ECE 506, CSC 406.
- Lecture: 3 hours
Learning Outcomes:
- Explain why parallel architectures are needed, and where and how they are used.
- Compare and contrast different parallel programming models: shared memory (e.g., OpenMP) vs. message passing.
- Describe and apply parallel programming constructs in software: locks, barriers, and point-to-point synchronization.
- Describe and evaluate multiple parallelization techniques: loop level, task level, and algorithm level.
- Identify correctness issues in parallel programs related to variable scope, synchronization points, and computation ordering.
- Describe common performance bottlenecks related to loop transformations, thread scheduling, locality, page allocation, and false sharing.
- Describe the memory hierarchy and organization of a parallel computer, in particular cache organization, write policy (write-through vs. write-back), and replacement policy.
- Describe a bus-based multiprocessor architecture.
- Describe cache coherence protocols on bus-based machines and evaluate their latency and bandwidth trade-offs.
- Describe hardware support for synchronization primitives.
- Define memory consistency.
- Describe and evaluate directory-based cache coherence on distributed shared-memory machines.
- Describe common interconnection networks in use today.
Topics:
- Overview of parallel computing and programming
- Shared-memory parallel programming
- Physical and logical cache organization
- Cache-coherence problem
- Cache-coherence solutions
- Advanced cache-coherence aspects and challenges (e.g., in CXL and Gen-Z)
- Hardware support for locking and barrier implementations
- Memory consistency models
- Advanced memory consistency problems
- Distributed shared-memory machines
- Interconnect topologies and architectures
- Heterogeneous compute architectures
- Emerging near-memory accelerators (e.g., UPMEM)
- Disaggregated memory architectures