College of Engineering 
Department of Computer Science and Engineering

CSE 820: Advanced Computer Architecture

Supplemental Material: Spring 2008


Readings


Assignments

  1. Assignment 1. Due Thursday, February 12
    Comment on the "Great ISA debate" papers on Angel. Two pages total. Specifically address (but do not limit yourself to):
  2. Assignment 2. Due Thursday, February 19
    How is a TLB miss handled in the presence of a physically-addressed Level 1 cache?
    The dilemma is that to get a new TLB entry one needs correct values in the TLB, but a TLB fault appears to mean that they are not there. You will need to make assumptions -- remember to state them. One page should be sufficient, but overflowing into a second page is allowed.
    1. Begin with the TLB miss and end when the correct TLB value has been loaded. Assume that the desired portion of the page table is resident in memory but not in the cache. For simplicity, assume that there is only an L1 cache and no L2.
    2. At boot time there is a "chicken-and-egg" dilemma: the TLB is empty, yet TLB values are needed immediately. Describe the steps needed to get useful values into the TLB.
  3. Assignment 3. Due Thursday, April 2
    Read "The CRAY-1 Computer System" by R.M. Russell, Communications of the ACM, January 1978.
    You can get access through the MSU Libraries ACM Digital Library account (a proxy is available if you are off campus).
    You may find The CRAY-1 computer system manual a useful resource.
    Analyze the Cray-1 in two pages.
      1. The Cray-1 is a vector processor. What does that mean? Why does that make it appropriate for scientific processing?
      2. Some have called the Cray-1 a RISC processor. What RISC characteristics does it have?
      3. The Cray-1 is a thirty-year-old computer. What features does it have that are found in modern microprocessors?
      4. We have found exception handling to be critical. How are exceptions handled in the Cray-1?
      5. We have seen that loads are critical for fast processing. What does the Cray-1 do to help with loads?
      Be sure to comment on vector vs. scalar speed, the minimum vector length at which vector code is faster than a scalar loop, speed relative to other computers of that time, clock rate, size of register state, memory size, number of functional units, and your general impressions compared to today's CPUs.

  4. Assignment 4. Due Thursday, April 16
    Worth double any previous assignment.
    ACM Special Issue: Beyond silicon: new computing paradigms (MSU electronic resource)
    Read the special issue, then write four or five pages:
  5. Assignment 5. Due Thursday, April 23
    Read "Larrabee: A Many-Core x86 Architecture for Visual Computing" by Seiler et al.
    In the paper the authors state, "we believe that Larrabee is an appropriate platform for the convergence of GPU and CPU applications." Many people in online forums have assumed that Larrabee is designed to serve as a graphics co-processor, but that quote states otherwise. Clearly it has been designed to handle graphics applications, but what about its ability to handle "CPU applications"?
    Write two pages. While I have asked you to focus on the hardware aspects of Larrabee, software cannot be ignored, since Larrabee has moved some GPU tasks from hardware to software.
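
One way to picture the TLB-miss sequence from Assignment 2 is the toy Python model below. It is only a sketch under assumptions that the assignment leaves open: a software-managed TLB (as on MIPS-style machines), a miss handler that runs with translation bypassed so that the page-table walk uses physical addresses, a single-level page table, 4 KiB pages, and 64-byte cache lines. The class `ToyMachine` and all of its names and constants are hypothetical, chosen for illustration.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
LINE_SIZE = 64     # assumed L1 cache line size in bytes
PTE_SIZE = 8       # assumed size of one page-table entry in bytes


class ToyMachine:
    """Hypothetical model: software-managed TLB, physically addressed L1."""

    def __init__(self, page_table_base, mappings):
        # The page table lives in physical memory at page_table_base.
        # mappings: virtual page number -> physical frame number.
        self.page_table_base = page_table_base
        self.memory = {
            page_table_base + vpn * PTE_SIZE: pfn
            for vpn, pfn in mappings.items()
        }
        self.cached_lines = set()  # line numbers currently in the L1 cache
        self.tlb = {}              # empty at boot
        self.log = []              # sequence of events, for inspection

    def load_physical(self, paddr):
        # The L1 cache is indexed by physical address, so this load
        # needs no translation and cannot itself cause a TLB miss.
        line = paddr // LINE_SIZE
        if line in self.cached_lines:
            self.log.append("L1 hit")
        else:
            self.log.append("L1 miss")
            self.cached_lines.add(line)  # fetch the line from memory
        return self.memory[paddr]        # KeyError models an unmapped page

    def tlb_refill(self, vpn):
        # Assumed handler: it runs untranslated and computes the
        # physical address of the PTE from a base register.
        pte_addr = self.page_table_base + vpn * PTE_SIZE
        self.tlb[vpn] = self.load_physical(pte_addr)
        self.log.append("TLB fill")

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.log.append("TLB hit")
        else:
            self.log.append("TLB miss")
            self.tlb_refill(vpn)
        return self.tlb[vpn] * PAGE_SIZE + offset


machine = ToyMachine(page_table_base=0x100000, mappings={5: 42})
paddr = machine.translate(5 * PAGE_SIZE + 12)
print(paddr, machine.log)
```

In this model the boot-time case follows the same path: the TLB starts empty and the first translated access simply takes a miss. That works only because of the stated assumption that the handler and its page-table accesses do not themselves need the TLB. A design with a hardware page-table walker or a virtually mapped page table would need a different argument.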

OLD ASSIGNMENTS

  • Assignment 3. Due Thursday, February 14
    "Limits of Instruction-level Parallelism" by David Wall.
    Read pages 1-35 (more than half the paper is figures).
    Write a page about the paper. In your comments:
    1. Rank the following in order of importance: alias analysis, branch prediction, jump prediction, register renaming, and speculative execution. Briefly explain why you chose that order.
    2. Describe the limits of this study of the limits of ILP, i.e., why it is not the whole story.


  • Assignment 6. Due Thursday, April 10
    "QoS Policies and Architecture for Cache/Memory in CMP Platforms" by Iyer et al.
    "Cache and memory in the many-core era" by Stokes
    The Intel QoS paper describes a way to address contention for shared resources in multi-core processors. Its description of the problem is particularly good. The Ars Technica article provides a summary and some further insight.
    Imagine that you work for AMD. Your boss has asked for your analysis of QoS in this environment. Do you think that Intel will implement QoS? Should AMD implement it?
    In two pages, summarize the critical issues and then argue why QoS should or should not be implemented.