High Performance Computer Architecture

Lessons

  1. Module-01 Introduction and Course Outline

    1. What is the Historical perspective of Computers?

      ..
    2. What are the Five Generations of Electronic Computers?

      ..
    3. What are the Elements of Modern Computers and what is Instruction set architecture (ISA)?

      ..
    4. How Computer Architecture differs from Computer Organization?

      ..
    5. What is Moore's Law and its interpretation?

      ..
    6. How to improve the performance of Processor?

      ..
    7. What is Thread-level Parallelism?

      ..
    8. What is Process-Level Parallelism and what are the objectives of this course?

      ..
    prutor.ai
  2. Module-02 Performance

    1. How to measure the performance?

      ..
    2. How to define performance in terms of Time and what is Execution Time?

      ..
    3. What is the Iron Law of Processor Performance?

      ..
    4. How to enhance Processor Performance and what is the example of MIPS?

      ..
    5. How to measure performance using benchmarks?

      ..
    6. What is Amdahl's Law?

      ..
  3. Module-03 Instruction Set Architecture

    1. What is Instruction Set Architecture (ISA)?

      ..
    2. What are the various ISA design choices?

      ..
    3. How different architectures are compared?

      ..
    4. How data transfer takes place?

      ..
    5. What are the different Addressing Modes?

      ..
    6. What is the controversy of RISC/CISC, what are the features of a CISC and RISC processor?

      ..
  4. Module-04 MIPS ISA Processor Part-1

    1. What are the different operands in MIPS?

      ..
    2. What are the different conventions for register usage?

      ..
    3. What are the various data types in MIPS?

      ..
    4. What are the various addressing modes in MIPS?

      ..
    5. What does the MIPS Instruction Set survey say?

      ..
    6. How to compare single and multi-clock cycle design?

      ..
  5. Module-04 MIPS ISA Processor Part-2

    1. What is the Design summary and the basic abstract view of the data path?

      ..
    2. What is the data path for instruction fetching and R-type instruction and how to add different data paths?

      ..
    3. How to combine these data paths?

      ..
    4. What is the Truth Table for Main Control Unit?

      ..
    5. How to add the Unconditional Jump?

      ..
  6. Module-05 Pipelining Introduction

    1. What are the various Pipelining Instructions and its example?

      ..
    2. What is a Synchronous and Asynchronous Pipeline and what are the different Pipeline concepts?

      ..
    3. What is an Ideal Pipeline Speedup and what are the different types of Pipeline?

      ..
    4. What is a Pipelined Fixed Point Multiplier?

      ..
    5. What is a Pipelined Floating Point Adder?

      ..
  7. Module-06 Instruction Pipelining

    1. What is an Instruction Pipeline?

      ..
    2. How to implement Instruction Pipeline and what is the datapath for simple RISC?

      ..
    3. How to implement Pipelining in a RISC-like Processor and Execute?

      ..
    4. How to access memory and what is the CPI for the Multiple-Cycle Implementation?

      ..
    5. What is the CPI for the Multiple-Cycle Implementation and what are Pipeline Registers?

      ..
    6. How Pipeline Registers are depicted and why Pipelining RISC Processors is easy?

      ..
    7. How to add Pipeline Registers and what are the limits of Pipelining?

      ..
  8. Module-07 Pipeline Hazards

    1. What is Speedup and how optimal number of pipelines improve the performance?

      ..
    2. What are Pipeline Hazards?

      ..
    3. What is a Structural Hazard and what are the common methods to eliminate them?

      ..
    4. What is Data Hazard and what are Program and Data Dependences?

      ..
    5. How to detect Data Dependences and what are Name Dependences and its types?

      ..
    6. What is Control Dependence and what are the different aspects of Data Hazard?

      ..
  9. Module-08 Data Hazards

    1. How Data Hazards are classified and what are the different techniques to reduce Data Hazards?

      ..
    2. What is Forwarding and Bypassing technique to reduce data hazard?

      ..
    3. What is Basic Compiler Pipeline Scheduling?

      ..
    4. What is the Loop program to implement Scheduling?

      ..
    5. What is Loop unrolling?

      ..
    6. What is Loop unrolling with Scheduling?

      ..
  10. Module-09 Software Pipelining

    1. What is the concept of Software Pipelining?

      ..
    2. How Static loop unrolling is illustrated using an example?

      ..
    3. What is Software Pipelined Code, what are the limitations of Scalar Pipelines and what are the two paths to Higher ILP?

      ..
    4. What is the need of Dynamic Instruction Scheduling?

      ..
    5. What is Dynamic Instruction Scheduling?

      ..
  11. Module-10 In Quest of Higher ILP Part-1

    1. What is a Higher ILP Processor?

      ..
    2. What are the two paths to Higher ILP and what is VLIW Processor?

      ..
    3. What is the basic VLIW approach?

      ..
    4. What are the different examples of VLIW and Transmeta's Crusoe Processor?

      ..
    5. What is Code Morphing Software and how PC software is translated to VLIW?

      ..
    6. How Dynamic Software Execution takes place and what are the different problems with VLIW?

      ..
  12. Module-10 In Quest of Higher ILP Part-2

    1. What is Instruction pipeline cycle and how to classify ILP Machines and what are the limitations of Scalar Pipelines?

      ..
    2. What are the two Paths to higher ILP and what are the various drawbacks of VLIW?

      ..
    3. What are the Limits on ILP and what is the motivation behind Superscalar processor?

      ..
    4. What are the two Paths to higher ILP and what is the proposal for Superscalar processor?

      ..
    5. What is the Superpipelined Organization?

      ..
    6. How to classify ILP Machines, what is Superpipelined Performance and what are the limitations of Scalar Pipelines?

      ..
  13. Module-11 Dynamic Instruction Scheduling Part-1

    1. What is the need of Dynamic Instruction Scheduling and how to design a Superscalar Pipeline?

      ..
    2. How Dataflow Execution takes place and what are the advantages of Dynamic Scheduling?

      ..
    3. What is Dynamic Instruction Scheduling and Scoreboarding?

      ..
    4. What is Instruction Parallelism and what are the implications of Scoreboard and four Stages of Scoreboard Control?

      ..
    5. What is the Detailed Scoreboard Pipeline Control and how to assess Scoreboarding?

      ..
    6. What is the example to illustrate Scoreboarding?

      ..
  14. Module-11 Dynamic Instruction Scheduling Part-2

    1. What is Dynamic Scheduling and Summary of Scoreboard?

      ..
    2. What is Tomasulo's Algorithm and its example?

      ..
    3. How Tomasulo's Algorithm differs from Scoreboarding and what is Tomasulo's Scheme?

      ..
    4. What are the key innovations in Dynamic Instruction Scheduling and what is Reservation station and Tomasulo's algorithm?

      ..
    5. What are the different stages in Tomasulo's Algorithm and its example?

      ..
    6. How to illustrate Tomasulo's algorithm with the help of an example?

      ..
    7. What are the advantages of Tomasulo's Scheme and its drawbacks?

      ..
  15. Module-12 Control Hazards

    1. What are Control (Branch) Hazards?

      ..
    2. What is the Branch Penalty of 3 Cycles Stall and how to reduce Branch Penalty to 1 cycle and what are Control Instruction Statistics?

      ..
    3. How to Deal with Control Hazards?

      ..
    4. What are Delayed Branches?

      ..
    5. What is the example of Delayed Branches and how to schedule the Branch-Delay Slot?

      ..
    6. What is the performance for Different Alternatives and importance of Stall Reduction?

      ..
  16. Module-13 Branch Prediction Part-1

    1. What is the importance of Stall Reduction?

      ..
    2. What is Control Hazard and Branch Prediction?

      ..
    3. What is the limitation of 1-bit predictor and its example?

      ..
    4. What is a 2-bit Dynamic Branch Prediction Scheme, a 2-bit Predictor and its Prediction Accuracy?

      ..
    5. What is a Correlating Branch Predictor?

      ..
    6. What is the prediction accuracy of Correlating Predictor and its example?

      ..
    7. What is the example for 1-bit Predictor and (1,1) Correlating Predictor?

      ..
  17. Module-13 Branch Prediction Part-2

    1. What are the various Branch Prediction Schemes and what is a 1-bit and 2-bit Branch Predictor?

      ..
    2. What are Tournament Predictors?

      ..
    3. What Fraction of predictions is coming from the local predictor and how the performance comparison of the Predictors is done?

      ..
    4. What is the need of Branch Target Buffers?

      ..
    5. What is Prediction and Address and what are Branch Target Buffers?

      ..
    6. How to combine Target and Prediction Buffers?

      ..
    7. What are Return Address Predictors and what are the misprediction rates for Different Sizes of Return Stack (SPEC CPU95) and what is Branch Folding?

      ..
    8. How predictors are used in Pentium processors and what is the overall summary of Dynamic Branch Prediction?

      ..
  18. Module-14 Dynamic Instruction Scheduling with Branch Prediction

    1. What are the Data-Flow Architectures and what is Dynamic Instruction Scheduling?

      ..
    2. What is the example of Tomasulo's Loop and what are the possible hazards Due to out-of-order Execution?

      ..
    3. What is the example of Loop execution (part - 1)?

      ..
    4. What is the example of Loop execution (part - 2)?

      ..
    5. Why can Tomasulo's Scheme Overlap Iterations of Loops and what are its advantages and drawbacks?

      ..
  19. Module-15 Hardware Based Speculation

    1. What is the Hardware-Based Speculation?

      ..
    2. How to add the speculation to Tomasulo's Scheme and what are the Exceptions, Interrupts and Major Changes Over Tomasulo's Scheme?

      ..
    3. What is the Support for Shadow Execution and what is Reorder Buffer?

      ..
    4. What are the four steps of Speculative Execution and how Tomasulo's algorithm with Reorder Buffer looks like?

      ..
    5. How to avoid Memory Hazards and its examples?

      ..
    6. What is the Multiple Issue without and with Speculation?

      ..
    7. What are the advantages of Speculation and how it differs from Heat Dissipation?

      ..
  20. Module-16 Tutorial - I

    1. How problems are illustrated and solved (Part - 1)?

      ..
    2. How problems are illustrated and solved (Part - 2)?

      ..
    3. How problems are illustrated and solved (Part - 3)?

      ..
    4. How problems are illustrated and solved (Part - 4)?

      ..
  21. Module-17 Hierarchical Memory Organization Part-1

    1. What is Von Neumann Computer Architecture?

      ..
    2. What are the Key Characteristics of Computer Memory Systems (Location, Capacity and Access Methods)?

      ..
    3. What are the Key Characteristics (Performance and Physical Type)?

      ..
    4. What are the Key Characteristics (Organization)?

      ..
    5. What are the Key Characteristics (Storage Capacity and Cost)?

      ..
    6. What is Hierarchical Memory Organization (Part - 1)?

      ..
    7. What is Hierarchical Memory Organization (Part - 2)?

      ..
  22. Module-17 Hierarchical Memory Organization Part-2

    1. What are the basic principles of Cache Memory?

      ..
    2. What are the basic issues related to Cache Memory?

      ..
    3. What is Block Identification?

      ..
    4. What is Direct Mapping (Part - 1)?

      ..
    5. What is Direct Mapping (Part - 2)?

      ..
    6. What is Associative Mapping?

      ..
  23. Module-17 Hierarchical Memory Organization Part-3

    1. What are Mapping Functions and what is Fully Associative Mapping?

      ..
    2. What is Set-Associative Mapping?

      ..
    3. How Size of Tags and Associativity are compared?

      ..
    4. What is the issue of replacement algorithms in Set-Associative Mapping?

      ..
    5. What is the issue of a write in Set-Associative Mapping?

      ..
    6. What is the issue of Block Size in Set-Associative Mapping?

      ..
    7. What is the issue of Block Size in Set-Associative Mapping and what is unified, split memory and caches?

      ..
  24. Module-17 Hierarchical Memory Organization Part-4

    1. Can we look at the Case Study of the Alpha 21264 Cache?

      ..
    2. What is Alpha 21264 Data Cache, Memory System Performance and Average Memory Access Time (AMAT)?

      ..
    3. What is Cache Performance, its example and parameters?

      ..
    4. How to improve Cache Performance (Part - 1)?

      ..
    5. How to improve Cache Performance (Part - 2)?

      ..
    6. How to improve Cache Performance (Part - 3)?

      ..
    7. What is Virtual Cache and how to access Pipelined cache?

      ..
  25. Module-18 Cache Optimization Techniques Part-1

    1. How to reduce Miss rate and how to classify Cache Misses?

      ..
    2. What is 3Cs Absolute Miss Rate (SPEC92)?

      ..
    3. What are the insights of cache, what will be impact of Larger Cache, higher associativity and Larger block size?

      ..
    4. What are Compiler Optimizations, how to reduce Misses by Compiler Optimizations and what are the examples of Merging Arrays and Loop Interchange?

      ..
    5. What is Row-Major, Column-Major and example of Loop Interchange and Loop Fusion?

      ..
    6. What is the example of Blocking and Dense Matrix Multiplication?

      ..
  26. Module-18 Cache Optimization Techniques Part-2

    1. How to reduce Miss Penalty and what are the various definitions of Multi-Level Cache?

      ..
    2. How to compare Global and Local Miss Rates?

      ..
    3. How Write Buffer and Victim Cache reduce Miss Penalty?

      ..
    4. How does the concept of Read Priority over Write on Miss reduce Miss Penalty?

      ..
    5. How to reduce Miss Penalty by Subblock Placement and what is Early Restart and Critical Word First?

      ..
    6. How to reduce Miss Penalty by Non-blocking Caches and what is the Cache performance for Out of Order Processors?

      ..
    7. What is the concept of Hardware, Software and Controlled Prefetching for reducing Miss Penalty?

      ..
    8. What is the overall summary of Cache Optimization?

      ..
  27. Module-19 High Performance Computer Architecture

    1. How the main memory is organized and what are the different types of Semiconductor Memories?

      ..
    2. What is Static RAM (SRAM) Cell?

      ..
    3. How the SRAM chip is organized and what are its Read/Write operations?

      ..
    4. What is Dynamic RAM (DRAM) Cell?

      ..
    5. How the DRAM chip is organized and what are its characteristics?

      ..
    6. What is an EPROM, layout of DRAM pin and what are the various Read Only Memories (ROM)?

      ..
    7. What is an EEPROM and Flash Memory?

      ..
  28. Module-20 Main Memory Optimizations

    1. What is the DRAM Memory Gap or Latency and what is the impact of Higher Bandwidth and Wider Memory?

      ..
    2. How to reduce Miss Penalty and what is Interleaved Memory and Memory Bank?

      ..
    3. What is DIMM, what are the advanced DRAM Organizations and what is FPM and EDO DRAM?

      ..
    4. What is Synchronous DRAM (SDRAM), its example and what is Asynchronous DRAM timing?

      ..
    5. What is Dual Inline Memory Module (DIMM), RAMBUS DRAM (RDRAM) and what is the use of RDRAM?

      ..
  29. Module-21 Virtual Memory Part-1

    1. Why Virtual Memory is needed?

      ..
    2. What are the Motivations for Virtual Memory?

      ..
    3. What is a Cache for Disk and how Cache Memory differs from Virtual Memory?

      ..
    4. What are the design issues in Virtual Memory Design and how Virtual to Physical Address Mapping is done?

      ..
    5. What is the concept of Address Mapping, how address is translated via Page Table and what is the operation of Page Table?

      ..
    6. How address is translated in Paging System and what are Page Faults?

      ..
    7. How to service a Page Fault and what are Page Table Entries?

      ..
  30. Module-21 Virtual Memory Part-2

    1. How to make Address Translation Faster and what is the use of a Translation Lookaside Buffer (TLB)?

      ..
    2. What is TLB and Cache Operation and how to handle Page Faults and TLB misses?

      ..
    3. How to manage Memory?

      ..
    4. What is the Page Table Organization (Forward Mapped or Hierarchical Page Table)?

      ..
    5. What are the advantages of Two-level Page Table, what is an inverted Page Table and its structure?

      ..
    6. What is Segmentation and Segment Table and how the address translation takes place in it?

      ..
    7. What is Combined Paging and Segmentation and how the address translation takes place in it?

      ..
  31. Module-22 Virtual Machines

    1. What is the Memory Address Translation Mechanism in Pentium II and what is Fetch and Placement Policy?

      ..
    2. What are the Basic Replacement Algorithms?

      ..
    3. How Protection is done via Virtual Memory?

      ..
    4. What is Virtual Machine Monitor (VMM)?

      ..
    5. What are the advantages and disadvantages of VM and what a VMM must do?

      ..
    6. What is Processor Virtualization and ISA Support for VMs?

      ..
    7. What is the impact of Virtual Machines over Virtual Memory and what are Process Virtual Machines?

      ..
  32. Module-23 Storage Technology Part-1

    1. What are Magnetic Disks?

      ..
    2. How formatting of Magnetic Disks is done and what are the disk areas?

      ..
    3. What is Constant Bit Density and how to Read/Write in a disk?

      ..
    4. What is Seek Time and Transfer Time?

      ..
    5. What is Average Disk Access Time and Locality?

      ..
    6. What are Optical Disks and how to read data from them?

      ..
    7. What is CD-ROM, DVD-ROM and Magnetic Tapes?

      ..
  33. Module-23 Storage Technology Part-2

    1. What is Flash Memory and what are Solid State Disks?

      ..
    2. How to combine multiple disks together?

      ..
    3. What are RAIDs and what is RAID-0 level?

      ..
    4. What is RAID-1, RAID-2 and RAID-3 level?

      ..
    5. What is RAID-4 and RAID-5 level?

      ..
    6. What is RAID-6 and which RAID Level to choose?

      ..
  34. Module-24 Case Studies Part-1

    1. What is the Historical perspective of Intel Processors?

      ..
    2. What are the features of 80186/80188 and 80286 microprocessors?

      ..
    3. What are the features of 80386 and 80486 microprocessors and their pipelining?

      ..
    4. What are the features and specifications of Intel P5 and P6 family and what is Pentium and its different aspects?

      ..
    5. What are the various features and operations of Intel P6 family?

      ..
    6. What are the various features of Pentium Pro and Pentium II/III?

      ..
  35. Module-24 Case Studies Part-2

    1. What are the different features of Pentium 4 and what is NetBurst Microarchitecture and Instruction Set Architecture?

      ..
    2. How SSE and SSE2, Pentium III and Pentium IV are compared?

      ..
    3. What are the various Caches?

      ..
    4. What is Branch Prediction and Branch Hints?

      ..
    5. What is the concept of Advanced Dynamic Execution?

      ..
    6. What is a System Bus, EPIC, IA-64 and Itanium?

      ..
  36. Module-24 Case Studies Part-3

    1. What is Pentium 2, 3, 4, EPIC and IA-64?

      ..
    2. What is Itanium and main ideas of EPIC and IA-64 Register model?

      ..
    3. What are Register Windows and IA-64 Micro-Architecture?

      ..
    4. What is the General Organization of the IA-64 Architecture, its instruction, Template Field and Stops?

      ..
    5. What is Branch Prediction, IA-64 Solution to Memory Latencies and Speculative Load example?

      ..
    6. What are the other features of IA-64, Itanium Pipeline, functional units and Itanium II?

      ..
  37. Module-25 Multithreading and Multiprocessing

    1. What is a Thread and Process-Level Parallelism and how a process is different from thread?

      ..
    2. What are Single and Multithreaded Processes and How can Threads be Created?

      ..
    3. Can we look at the case for Processor Support for Thread-level Parallelism and its example?

      ..
    4. What is a Taxonomy of Parallel Architectures?

      ..
    5. How MIMD Computers are classified and what is shared and distributed memory?

      ..
  38. Module-26 Simultaneous Multithreading

    1. What is Multithreading within a Single Processor?

      ..
    2. How Multithreading is explained pictorially?

      ..
    3. What is Coarse-Grained Multithreading and its processors?

      ..
    4. What is Fine-Grained Multithreading and Simultaneous Multithreading (SMT)?

      ..
    5. What are the advantages of SMT and its issues?

      ..
    6. What is the block diagram of SMT, its model, caching and Performance Implications?

      ..
  39. Module-27 Symmetric Multiprocessors

    1. How UMA and NUMA Computers are compared and what are the different SMP Organizations, its Pros and Cons?

      ..
    2. Why Multicores are needed and what are the Cache Organizations for Multicores and SMPs?

      ..
    3. What is Cache Coherence problem, its Possible Approaches and Solutions?

      ..
    4. What is Snooping protocol and its categories and Cache coherence problem?

      ..
    5. What is a Snoopy-Cache State Machine-I and Machine-II?

      ..
  40. Module-28 Distributed Memory Multiprocessors

    1. What are the Limitations of SMPs and what is Directory-based Protocol?

      ..
    2. What is the Directory-Based Solution for NUMA computers?

      ..
    3. What is the State Transition Diagram for the Directory?

      ..
    4. What is a Directory State Machine and its example?

      ..
    5. How communication overhead is explained with the help of examples?

      ..
  41. Module-29 Cluster, Grid and Cloud Computing

    1. What is Cluster Computing and motivation behind it?

      ..
    2. What are the components, configurations, their types, Pros and Cons?

      ..
    3. What are Storage Area Networks (SANs), Flat Neighbourhood Networks and what is Beowulf Cluster and its example?

      ..
    4. What are the SMP Clusters and what is the concept of Grid Computing?

      ..
    5. What are Cluster Grids, Desktop Grids (SETI@home) and components of Grid Middleware?

      ..
    6. What is Cloud Computing, its benefits and segments?

      ..