Date: 2023-06-03 06:13

Microprocessor Design Clinic

ELEN 90093

Design Assignment

Background:

Machine learning (AI) is an exciting new field. Many systems are now being made adaptive to contend with data, inputs and situations that have not been encountered during design and training.

A critical component in machine learning algorithms is matrix multiplication. Matrix multiplications can be very time-consuming and taxing for the CPU, where each data element is loaded from memory and processed in the ALU.
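
To make the CPU cost concrete, the following is a minimal software-only sketch of the multiplication, assuming row-major storage and single-precision `float` elements. The name `matmul_ref` and the flat-array layout are illustrative assumptions, not part of the brief. For 32x32 matrices the inner statement runs 32 x 32 x 32 = 32,768 times, and each iteration loads two operands from memory.

```c
#include <stddef.h>

#define N 32  /* matrix dimension given in the design specifications */

/* Software-only reference multiply: c = a * b.
 * All three matrices are flat, row-major arrays of N * N floats (assumed layout).
 * The inputs are const-qualified, so this function never overwrites them. */
void matmul_ref(const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < N; k++) {
                /* one multiply-accumulate, two loads from memory */
                acc += a[i * N + k] * b[k * N + j];
            }
            c[i * N + j] = acc;
        }
    }
}
```

A routine of this kind can also serve as the base-processor baseline when measuring the accelerator's speed-up.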

For this design project, you are asked to design a Matrix Multiplication accelerator and interface it to the RISC-V CPU.

Instructions:

In groups of three (you and 2 of your fellow students), please undertake the following design project.

Presentation: Week of the 12th of June (please contact the Tutor and Lecturer to organise a mutually convenient time)

Report Due Date: June 16th, 5pm

Design Specifications:

1. The accelerator must be able to multiply 2 fixed-size matrices, where each matrix is 32x32 (i.e. 1024 elements).

2. The elements in the input matrices are of type floating point, IEEE 754 compliant, numbers.

3. The output matrix elements are to be floating point values, IEEE 754 compliant.

4. The input matrices passed to the accelerator must be preserved (they must not be overwritten).

5. The matrix values are to be held in main memory.

6. The CPU needs to configure the accelerator for the matrix multiplication it is requested to perform.

7. The accelerator may have its own memory management (e.g., registers or buffers) to load and store the data operands and outputs. Ideally, memory transfers should occur independently of the CPU (hint: DMA).

8. The accelerator may have its own internal memory, but total internal memory should be limited.

9. You should design new R-Format instructions for the CPU to configure and interact with the accelerator and support the functionality of your accelerator.

10. The accelerator should indicate to the CPU that its calculation has finished. The mechanism by which this occurs is your design decision.

11. The accelerator should let the processor know whether the calculation has completed correctly and the result can be depended upon, or whether errors have occurred.

12. Provide test code (using RoCC macros to include your functions in custom instructions) that shows the performance speed-up that your accelerator achieves relative to using the base processor.
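
As one way to think about specification item 9, the sketch below packs an R-format instruction word for the RISC-V custom-0 opcode. It assumes the RoCC convention in which the three funct3 bits act as the xd, xs1 and xs2 flags. The `accel_funct7` command names and values are hypothetical placeholders; the real command set is a design decision for each group.

```c
#include <stdint.h>

#define OPCODE_CUSTOM0 0x0Bu  /* RISC-V custom-0 major opcode (binary 0001011) */

/* Hypothetical funct7 command assignments, for illustration only. */
enum accel_funct7 {
    ACCEL_SET_A  = 0,  /* rs1 holds the base address of input matrix A */
    ACCEL_SET_B  = 1,  /* rs1 holds the base address of input matrix B */
    ACCEL_SET_C  = 2,  /* rs1 holds the base address of the output matrix */
    ACCEL_START  = 3,  /* begin the multiplication */
    ACCEL_STATUS = 4   /* rd receives a done/error status word */
};

/* Pack the fields into a 32-bit word:
 * funct7[31:25] rs2[24:20] rs1[19:15] xd[14] xs1[13] xs2[12] rd[11:7] opcode[6:0] */
uint32_t rocc_encode(uint32_t funct7, uint32_t rd, uint32_t rs1, uint32_t rs2,
                     uint32_t xd, uint32_t xs1, uint32_t xs2)
{
    return ((funct7 & 0x7Fu) << 25) |
           ((rs2 & 0x1Fu) << 20) |
           ((rs1 & 0x1Fu) << 15) |
           ((xd & 1u) << 14) |
           ((xs1 & 1u) << 13) |
           ((xs2 & 1u) << 12) |
           ((rd & 0x1Fu) << 7) |
           OPCODE_CUSTOM0;
}
```

For example, `rocc_encode(ACCEL_STATUS, 10, 0, 0, 1, 0, 0)` yields `0x0800450B`, a status read whose result is written to register x10. In practice the RoCC macros emit these words through inline assembly; the function here only shows where each field sits.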

Optional

1. Modify your design to support arbitrary-size matrices. The size of the matrices is to be determined and/or specified at run-time and can change at each operation.
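
For the optional extension, a software model with run-time dimensions can help pin down which checks the accelerator would need before it starts. The sketch below is one possible model, assuming row-major storage; `matmul_dyn` and the `mm_status` codes are hypothetical names introduced only for this example.

```c
#include <stddef.h>

/* Hypothetical status codes for the software model. */
enum mm_status { MM_OK = 0, MM_ERR_DIM = 1, MM_ERR_NULL = 2 };

/* c (a_rows x b_cols) = a (a_rows x a_cols) * b (b_rows x b_cols).
 * All matrices are flat, row-major arrays. Dimensions are supplied at
 * run-time and may differ on every call. Inputs are never modified. */
int matmul_dyn(const float *a, size_t a_rows, size_t a_cols,
               const float *b, size_t b_rows, size_t b_cols,
               float *c)
{
    if (!a || !b || !c)
        return MM_ERR_NULL;
    /* The inner dimensions must agree, and no dimension may be zero. */
    if (a_rows == 0 || a_cols == 0 || b_cols == 0 || a_cols != b_rows)
        return MM_ERR_DIM;

    for (size_t i = 0; i < a_rows; i++) {
        for (size_t j = 0; j < b_cols; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < a_cols; k++)
                acc += a[i * a_cols + k] * b[k * b_cols + j];
            c[i * b_cols + j] = acc;
        }
    }
    return MM_OK;
}
```

Rejecting mismatched inner dimensions before any work starts mirrors the kind of error the accelerator could report back to the CPU.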

Marks:

Presentation: (50%) (Group Presentation, Conducted in Question and Answer Format)

During the presentation, you will need to justify your design decisions and your choices for your implementation. You will also need to present performance results and the testing you have undertaken to verify the correct performance of your design.

Report: (50%)

Please provide a detailed report outlining your design. The report should include:

1. A description and information supporting your design decisions. Illustrate all the functional blocks of your design and their interaction.

2. Your design schematics, data flow and behavior of the accelerator

3. Test code that exercises / tests your design

4. Performance Analysis of your test design
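
For the performance analysis, a common approach is to read a cycle counter before and after each run and report the ratio. The sketch below assumes a 64-bit RISC-V target on which the `rdcycle` counter is readable from the test program; on any other host it falls back to `clock()`, which counts processor-time ticks rather than cycles. The helper names `read_cycles` and `speedup` are illustrative.

```c
#include <stdint.h>
#include <time.h>

/* Read a monotonically increasing counter.
 * On RV64 this is the cycle CSR (assumed accessible); elsewhere it is clock(). */
static inline uint64_t read_cycles(void)
{
#if defined(__riscv) && (__riscv_xlen == 64)
    uint64_t cycles;
    __asm__ volatile ("rdcycle %0" : "=r"(cycles));
    return cycles;
#else
    return (uint64_t)clock();
#endif
}

/* Speed-up of the accelerated run relative to the baseline run.
 * Returns 0.0 if the accelerated count is zero (measurement not usable). */
double speedup(uint64_t baseline, uint64_t accelerated)
{
    if (accelerated == 0)
        return 0.0;
    return (double)baseline / (double)accelerated;
}
```

A typical use is to take `read_cycles()` immediately before and after the software multiply, repeat around the accelerator call, and pass the two differences to `speedup`.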

Note:

This is a design project. There is no single correct answer. Many answers are correct and valid. The process is one of trading off various aspects of the design components. However, at the completion of your project, the accelerator should be able to correctly multiply the two matrices.
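
When checking correctness against a software result, note that floating-point addition is not associative, so an accelerator that accumulates in a different order may differ from the reference in the last few bits. The sketch below compares two result arrays with a relative tolerance; the function name and the choice of tolerance are assumptions for illustration, not requirements of the brief.

```c
#include <stddef.h>

/* Absolute value without needing the math library. */
static float abs_f(float x) { return x < 0.0f ? -x : x; }

/* Count elements of got[] that differ from ref[] by more than
 * rel_tol * max(|ref|, 1). Exactly equal values (including equal
 * infinities) always pass; any comparison involving NaN fails. */
size_t count_mismatches(const float *got, const float *ref,
                        size_t n, float rel_tol)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (got[i] == ref[i])
            continue;
        float diff = abs_f(got[i] - ref[i]);
        float scale = abs_f(ref[i]) > 1.0f ? abs_f(ref[i]) : 1.0f;
        /* Written as !(<=) so that a NaN difference counts as a mismatch. */
        if (!(diff <= rel_tol * scale))
            bad++;
    }
    return bad;
}
```

A return value of zero over all 1024 output elements would indicate agreement within the chosen tolerance.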

Hint:

Design flow:

1. Implement an IEEE 754 floating-point multiplier and accumulator.

2. Define how data for your input and output matrices are to be stored in memory addressable by your CPU.

3. Consider how you would transfer the data to and from your accelerator without requiring the CPU to perform the transfer (i.e. DMA), and when the calculation can begin.

4. Define how your accelerator is going to interact with the rest of the processor microarchitecture (i.e. design the accelerator interface).

5. Consider the errors that may occur. How are these errors going to be reported? How is the CPU going to know that the calculation has completed successfully?

6. Algorithm: choose an algorithm for your matrix multiplier

7. Custom Instruction: indicate your input and output matrices using source registers (rs1 and rs2) and destination register (rd); list all of the functionalities of your accelerator by using funct7

8. Block diagram: Draw a block diagram to include important modules (registers, FSM, buffers…) and signals (data, handshake, controlling…)

9. FSM: design an FSM for your data flow with states, input signals, state values…

10. Programming: implement your accelerator in the Chisel language based on your block diagram and FSM (refer to workshop 4)

11. Testing: write a test program in C (include your macros) to test the performance of the accelerator.

12. Provide the test code so that we can input 2 matrices and validate the output
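
Steps 1 and 5 of the design flow can be prototyped in software before any hardware is written. The sketch below splits a value into its sign, exponent and fraction fields and flags NaN or infinity, which is one kind of condition an accelerator might report as an error. It assumes single precision (IEEE 754 binary32) and that the C `float` type uses that format; the brief does not state the precision, and the names `f32_fields`, `f32_unpack` and `f32_is_nan_or_inf` are illustrative.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE 754 binary32 value (assumed format of float). */
struct f32_fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t fraction;  /* 23 bits, without the implicit leading 1 */
};

/* Copy the raw bits out of the float and split them into fields. */
struct f32_fields f32_unpack(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* well-defined way to reinterpret bits */
    struct f32_fields f;
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* An all-ones exponent encodes infinity (fraction == 0) or NaN (fraction != 0). */
int f32_is_nan_or_inf(float x)
{
    return f32_unpack(x).exponent == 0xFFu;
}
```

For example, `f32_unpack(-2.5f)` gives sign 1, exponent 128 and fraction `0x200000`, since -2.5 equals -1.25 x 2^1. A hardware multiplier adds the unbiased exponents and multiplies the significands, so having these fields visible makes it easier to compare the hardware's intermediate values against expected ones.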

