
Date: 2024-10-30 08:54

CSE x25 Lab Assignment 3

Welcome to CSE 125/225! Each lab is formed from one or more parts. Each part is

relatively independent and parts can normally be completed in any order. Each part will teach a

concept by implementing multiple modules.

Each lab will be graded on multiple categories: correctness, style/lint, git hygiene, and

demonstration. Correctness will be assessed by our autograder. Lint will be assessed by

Verilator (make lint). Style and hygiene will be graded by the TAs.

To run the test scripts in this lab, run make test from one of the module directories.

This will run (many) tests against your solution in two simulators: Icarus Verilog, and Verilator.

Both will generate waveform files (.fst) in a directory: run/<test_name and parameter

values>/<simulator name>. You will need to run make extraclean after make test

to clean your previous output files.

You may use any waveform viewer you would like. The codespace has Surfer installed.

It is also an excellent web-based viewer. You can also download GTKWave, which is a bit more

finicky.

Each Part will have a demonstration component that must be shown to a TA or instructor

for credit. We may manually grade style/lint after the assignment deadline. Style checking and
linting may feel pedantic, but they are the gold standard in industry.

At any time you can run make help from inside one of the module directories to find all

of the commands you can run.

When you have questions, please ask. Otherwise, happy hardware hacking!

Assignment 3 Repository Link: https://classroom.github.com/a/jgWdhG0R

Part 1

Part 2

Part 3

Part 4

Part 1: Memories as LUTs as Programmable Logic

Déjà vu anyone?

We like to treat FPGAs as a “sea of gates”, but they are not. They are actually made up

of discrete elements, like look-up-tables (LUTs) (Xilinx/AMD, Lattice), multiplexers (Xilinx/AMD),

multipliers (Xilinx/AMD, Lattice) and memories (Xilinx/AMD). The job of the Electronic Design

Automation (EDA) toolchain is to synthesize SystemVerilog into these discrete elements and

then program them.

In this part, the objective is to use actual FPGA primitives to re-create the logic functions

you completed in Lab 1 and 2. In effect, do synthesis by hand.

The following is the instantiation template for an AMD/Xilinx LUT6 module:

module LUT6
  #(parameter [63:0] INIT = 64'h0000000000000000)
  (output O
  ,input I0
  ,input I1
  ,input I2
  ,input I3
  ,input I4
  ,input I5);

The Look-up-Table (LUT) operates by using I0 through I5 as a 6-bit address that indexes the bits in the INIT parameter to produce O. For example, if I5 through I0 have the values {1'b0, 1'b0, 1'b0, 1'b1, 1'b0, 1'b0}, O will be the value at index 4 in INIT (0 in this example). Edit/Update: Remember that index 0 is the least significant bit, so index 0 of the bit string 2'b01 is 1 (not 0).
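The indexing rule above can be sketched behaviorally. This is only an illustration, not the vendor model: the module name lut6_index_demo and the INIT value (bit 4 set) are assumptions chosen so that the lookup is visible.

```systemverilog
// Behavioral illustration of LUT indexing; not the vendor primitive.
// lut6_index_demo and the INIT value below are hypothetical.
module lut6_index_demo;
  localparam [63:0] INIT = 64'h0000_0000_0000_0010;  // only bit 4 is set

  wire  [63:0] init_bits = INIT;
  logic [5:0]  addr;                  // {I5, I4, I3, I2, I1, I0}
  wire         o = init_bits[addr];   // the LUT output is one bit of INIT

  initial begin
    addr = 6'b000100;                 // I2 = 1, all others 0: index 4
    #1 $display("addr=%0d o=%b", addr, o);
    addr = 6'b000000;                 // index 0 is the least significant bit
    #1 $display("addr=%0d o=%b", addr, o);
    $finish;
  end
endmodule
```

With Icarus Verilog this can be run with iverilog -g2012 followed by vvp on the output file. The first lookup reads bit 4 of INIT and the second reads bit 0.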

We do not use the Xilinx LUT6 in this lab; we use the SB_LUT4 on your ICE40 FPGA.

This is the SB_LUT4 definition:

module SB_LUT4 (

output O,

input I0,

input I1,

input I2,

input I3

);

parameter [15:0] LUT_INIT = 0;

Like in the Xilinx example, LUT_INIT is the LUT initialization string. The Look-up-Table (LUT) operates by using I0 through I3 as a 4-bit address that indexes the bits in the LUT_INIT parameter to produce O.
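As a hedged example of turning a truth table into LUT_INIT, here is a 2-input AND, which is not one of the graded modules. The SB_LUT4 module below is a stand-in written for this example so that it runs alone; the and2 and and2_tb names are hypothetical. With I2 and I3 tied to 0, only addresses 0 through 3 are reachable, and AND is 1 only at address 3 (I1 = 1, I0 = 1), so bit 3 of LUT_INIT is set.

```systemverilog
// Stand-in model so this example runs alone. In the lab, delete this
// module and use the SB_LUT4 from the provided folder instead.
module SB_LUT4 (output O, input I0, input I1, input I2, input I3);
  parameter [15:0] LUT_INIT = 0;
  wire [15:0] init_bits = LUT_INIT;
  wire [3:0]  addr      = {I3, I2, I1, I0};
  assign O = init_bits[addr];
endmodule

// Hypothetical 2-input AND built from one LUT.
module and2 (input a_i, input b_i, output c_o);
  SB_LUT4 #(.LUT_INIT(16'h0008)) lut_i
    (.O(c_o), .I0(a_i), .I1(b_i), .I2(1'b0), .I3(1'b0));
endmodule

// Exhaustive self-check over the four input combinations.
module and2_tb;
  logic   a, b;
  wire    c;
  integer i;

  and2 dut (.a_i(a), .b_i(b), .c_o(c));

  initial begin
    for (i = 0; i < 4; i = i + 1) begin
      {b, a} = i[1:0];
      #1;
      if (c !== (a & b)) $display("FAIL a=%b b=%b c=%b", a, b, c);
    end
    $display("and2 check finished");
    $finish;
  end
endmodule
```

The same procedure applies to any function of up to four inputs: write the output for each address from 0 to 15, then read the column from address 15 down to address 0 as the 16-bit constant.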

Please complete the following parts using the primitives dictated. All of the FPGA

primitives are available in the provided folder.

● xor2: Using only the Lattice SB_LUT4 module, create a 2-input Exclusive-Or module.

● xnor2: Using only the Lattice SB_LUT4 module, create a 2-input Exclusive-Nor module.

● mux2: Using only the Lattice SB_LUT4 module, create a 2-input multiplexer module.

● full_add: Using only the Lattice ICESTORM_LC module, create a full_add module.

The documentation for this module is here (Page 2-2). Another good reference is here.

You will need to produce sum_o using the internal Look-up-Table and inputs I1, I2, and

CIN. These inputs also connect to “hardened” carry logic (It looks like a mux in the

diagram). These are the relevant lines in the provided module:

wire mux_cin = CIN_CONST ? CIN_SET : CIN;

assign COUT = CARRY_ENABLE ? (I1_pd && I2_pd) || ((I1_pd || I2_pd) &&

mux_cin) : 1'bx;

LUT_INIT, CIN_SET, and CARRY_ENABLE are the three key parameters to set. All the

remaining parameters can be ignored.

● We are removing this module to simplify Lab 3.

adder: Using only the Xilinx CARRY4 module, create a parameterized adder in adder.sv. You should use a generate for-loop. You will need to handle arbitrary width_p values. This document has a good (English) description of the ports on Page 44.

CARRY4 adds the inputs S[3:0] and DI[3:0], and produces O[3:0]. However, not all adds are 4 bits, so the carry output from each bitwise addition is in CO[3:0]. If you are chaining two CARRY4s together, you will use the MSB of CO (CO[3]) in the first CARRY4 as the input to CI of the second CARRY4. If you are adding two 3-bit values, you will take CO[2] of the first CARRY4 to get the MSB (bit 3) of the addition.

Here is an example of using the CARRY4 (Hopefully, this is a big hint):

CARRY4

CARRY4_i

(.CO(wCarryOut[4*i+4-1:4*i]), // 4-bit carry out

.O(wResult[4*i+4-1:4*i]), // 4-bit carry chain sum

.CI(wCascadeIn[i]), // 1-bit carry cascade input

.CYINIT(1'b0), // 1-bit carry initialization

.DI(wInputB[4*i+4-1:4*i]), // 4-bit carry-MUX data in

.S(wInputA[4*i+4-1:4*i])); // 4-bit carry-MUX select input

● triadder: Sometimes, ripple-carry-adders (chaining the c_o from one Full-Adder to c_i

in the next Full-Adder) are suboptimal. For example, adding three numbers together

requires two ripple carry adders – the longest path through the circuit would be through

all the carry bits. Fortunately, there’s a “faster” approach.

Implement a 3-way adder without using the Verilog + operator more than once. You

should use the full_add module (either the one above, or the one from the previous

lab). The key technique here is to use a 3:2 compressor, and then add the resulting 2-bit

output using an adder. This is a good reference: link

TL;DR: Use the full_add module as a 3:2 compressor, and then add the resulting two values (the sum bits, and the carry bits shifted left by one) to get the final result. This technique generalizes to N inputs if you read further in the link above.

● shift: Using only the Lattice ICESTORM_LC module, create a parameterized shift register identical to the one from Lab 1. The shift register should shift left, and shift d_i into the low-order bit, on the positive edge of clk_i, when enable_i == 1.

The documentation for the ICESTORM_LC module is here (Page 2-2). Another good

reference is here.

These are the key lines from the provided module:

always @(posedge polarized_clk)

if (CEN_pu)

o_reg <= SR_pd ? SET_NORESET : lut_o;

assign O = DFF_ENABLE ? ASYNC_SR ? o_reg_async : o_reg : lut_o;

The key ports for this module are I0, O, CLK, CEN, and SR. The key parameters for this module are: LUT_INIT (How do you use LUT_INIT to pass through I0, unmodified? How do you use it to make a mux, to select the d_i input?), SET_NORESET (related to reset_val_p), and DFF_ENABLE (the parameter for enabling the D-Flip-Flop). NEG_CLK and ASYNC_SR must be left at their default values.
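For the triadder item above, the 3:2 compression identity can be checked in isolation. The sketch below is an assumption-laden illustration, not the graded module: it uses bitwise expressions in place of a row of full_add instances, and csa_demo and the width W are hypothetical. It confirms that the per-bit sum plus the per-bit carry shifted left by one equals a + b + c, using a single + in the datapath.

```systemverilog
// Checks the carry-save (3:2 compressor) identity exhaustively.
// csa_demo and W are hypothetical; bitwise operators stand in for
// one full adder per bit position.
module csa_demo;
  localparam W = 4;

  logic [W-1:0] a, b, c;
  wire  [W-1:0] s  = a ^ b ^ c;                      // full-adder sum per bit
  wire  [W-1:0] cy = (a & b) | (a & c) | (b & c);    // full-adder carry per bit
  wire  [W+1:0] total = {2'b00, s} + {1'b0, cy, 1'b0};  // the only datapath +

  integer i, errors;

  initial begin
    errors = 0;
    for (i = 0; i < (1 << (3*W)); i = i + 1) begin
      {a, b, c} = i[3*W-1:0];
      #1;
      // The + operators here are in the checker only.
      if (total !== (a + b + c)) errors = errors + 1;
    end
    $display("%0d cases checked, %0d mismatches", i, errors);
    $finish;
  end
endmodule
```

No carry ripples between bit positions in the compressor itself, which is why the longest path is shorter than two chained ripple-carry adders.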

Demonstration (All Students):

There is no demonstration for this part.

Part 2: Asynchronous Memories in SystemVerilog

Let’s get some practice with memories. Instead of using the LUTs in the FPGA, let’s use Verilog to describe memories. Since these memories are asynchronous-read, they are synthesized to registers in the actual fabric.

● ram_1r1w_async: Using behavioral SystemVerilog, create a read-priority (aka read-first)

asynchronous memory with 1 write port and 1 read port. It must implement the

parameters width_p, and depth_p. Read priority means that reads get the old write

data when there is an address collision (i.e. the read happens first).

Your asynchronous memory should initialize using the function $readmemh.

When simulating, you will see a warning like this in icarus: FST warning: array

word ram_1r1w_async.ram[10] will conflict with an escaped identifier. This is OK.

● hex2ssd: Using your ram_1r1w_async, create a module that converts a 4-bit

hexadecimal number into a seven-segment display encoding.

Commit your memory initialization file along with your solution.

● kpyd2hex: Using your ram_1r1w_async memory, create a module that converts from a

keypad (Row, Column) output to a hexadecimal value.

kpyd_i is the one-hot encoding of the row value in the high-order bits, and the column value in the low-order bits. I think the Icebreaker PMOD pin definitions are swapped from the Digilent PMOD definitions. My solution treats Column 1 as 0001, corresponding to the column with 1/4/7/0, and Row 1 as 0001, corresponding to the row 1/2/3/A.

Commit your memory initialization file along with your solution. You will need to copy

both .hex files into this directory for it to compile to your FPGA.
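For the memory initialization mentioned in the items above, here is a minimal sketch of $readmemh feeding an asynchronously read array. Everything named here is an assumption made for illustration: the module name, the file name demo.hex, the 8-bit width, the depth of 4, and the data values. The file is written by the testbench itself only so that the example runs alone; in the lab the .hex file is one you commit.

```systemverilog
// Illustration of $readmemh with an asynchronous (combinational) read.
// All names, sizes, and values are hypothetical.
module readmemh_demo;
  localparam width_p = 8;
  localparam depth_p = 4;

  logic [width_p-1:0]         ram [0:depth_p-1];
  logic [$clog2(depth_p)-1:0] rd_addr;
  wire  [width_p-1:0]         rd_data = ram[rd_addr];  // no clock on the read path

  integer fd;

  initial begin
    // Create a small hex file: one word per line, lowest address first.
    fd = $fopen("demo.hex", "w");
    $fdisplay(fd, "de");
    $fdisplay(fd, "ad");
    $fdisplay(fd, "be");
    $fdisplay(fd, "ef");
    $fclose(fd);

    $readmemh("demo.hex", ram);

    rd_addr = 2;
    #1 $display("ram[%0d] = %h", rd_addr, rd_data);
    $finish;
  end
endmodule
```

A lookup table such as a seven-segment encoding follows the same pattern: the address is the value to convert and each line of the file holds the encoded output for that address.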

Demonstration (All Students):

Demonstrate your working Keypad to Seven-Segment Display module on the FPGA by

instantiating your modules in top.sv. Use your keypad to show which button is being pressed on

the seven segment display.

10/23 Notes (Contributed by Raphael):

● You will need to iteratively select columns to determine which column has a button being

pressed. (Do this with a 1-hot shift register/ring counter!)

● You will need to drive the column pins on the keypad, and it will respond with the row

being pressed within that column.

● The kpyd2hex module takes rows and columns as one-hot values (e.g. 00010001). However, the keypad columns are zero-hot with pull-up resistors. Therefore, the rows are also zero-hot (see datasheet for more info). For example, if button “1” is being pressed and we send 1110 to the keypad, it will respond with 1110.

● Finally, the keypad glitches if you send too many requests, so you need to slow the 12MHz clock.

Old Notes:

● You do not need to debounce or edge-detect the buttons.

● You will need to iteratively select columns to determine which column has a button being

pressed. (Do this with a 1-hot shift register!)

● The kpyd2hex module takes rows and columns as one-hot values (e.g. 00010001).

However, the keypad columns are zero-hot with pull-up resistors. Therefore, the rows

are also zero-hot.

● You will need to handle the case where no button is pressed.

● It is safe to assume we will only press one button at a time in a column.

● You can use persistence of vision.
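The column scanning and zero-hot conversion described in the notes above can be sketched as follows. This is a simulation-only illustration under stated assumptions: col_scan_demo is a hypothetical name, the keypad is modeled as a single wire expression with button “1” held down, and the clock slowdown the notes call for is not modeled.

```systemverilog
// One-hot ring counter selects a column; inverting gives the zero-hot
// value for the keypad. All names here are hypothetical.
module col_scan_demo;
  logic clk = 0;
  logic reset;

  logic [3:0] col_onehot;                    // ring counter state
  wire  [3:0] col_n = ~col_onehot;           // zero-hot value sent to the keypad

  // Toy keypad model: button "1" (row 1, column 1) is held down, so the
  // keypad answers 1110 only while column 1 is selected.
  wire  [3:0] row_n = col_n[0] ? 4'b1111 : 4'b1110;

  // One-hot {row, column} value in the format kpyd2hex expects.
  wire  [7:0] kpyd = {~row_n, col_onehot};

  always #5 clk = ~clk;

  always_ff @(posedge clk)
    if (reset) col_onehot <= 4'b0001;
    else       col_onehot <= {col_onehot[2:0], col_onehot[3]};

  initial begin
    reset = 1;
    @(posedge clk); #1;
    reset = 0;
    repeat (4) begin
      $display("col_n=%b row_n=%b kpyd=%b", col_n, row_n, kpyd);
      @(posedge clk); #1;
    end
    $finish;
  end
endmodule
```

When no button in the selected column is pressed, the inverted row value is 0000, which is the "no button pressed" case the notes ask you to handle.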

Part 3: Elastic Pipelines and FIFOs

We are working our way up to pipelines. There are two types of pipelines: inelastic, and

elastic (we will cover these in class). You can always wrap inelastic pipelines to create elastic

ones. In this lab, you will write an inelastic pipeline stage. Next, you will create an elastic

pipeline stage. Finally, you will use your memory (from above) to create a FIFO.

You may use whatever operators and behavioral description you prefer, except you may not use always@(*). You are encouraged to reuse whatever modules you see fit from previous labs or this lab.

● inelastic: Write an inelastic pipeline stage. When en_i is 1, it should save the data.

Otherwise, it should not. When datapath_reset_p == 1, data_o should be reset to 0 if reset_i == 1 at the positive edge of the clock.

You can use /* verilator lint_off WIDTHTRUNC */ around

datapath_reset_p to clear the lint warnings.

Note: This should look a lot like writing a DFF.

● elastic: Write a Mealy elastic pipeline stage. You can think of this as a 1-element FIFO, with a Mealy state machine to improve throughput. The module must be Ready-Valid-& on the input/consumer interface (ready_o and valid_i) and Ready-Valid-& (valid_o and ready_i) on the output/producer interface.

When datapath_reset_p == 1, data_o should be reset to 0 if reset_i == 1 at the positive edge of the clock.

When datapath_gate_p == 1, data_o should only be updated when valid_i == 1. Otherwise, data_o should be updated whenever ready_o == 1. This is a very simple form of “Data Gating”, and is the missing “bit” from the class lecture slides.

10/23 Note: A potentially better way to say the above: When datapath_gate_p == 1, data_o should only be updated when (valid_i & ready_o) == 1. Otherwise, data_o should be updated whenever ready_o == 1.

● fifo_1r1w: Using behavioral SystemVerilog, your ram_1r1w_async module, and any other module you have written, write a First-in-First-Out (FIFO) module. The module must be Ready-Valid-& on the input/consumer interface (ready_o and valid_i) and Ready-Valid-& (valid_o and ready_i) on the output/producer interface. This paper and this Google Doc have good breakdowns of the interface types.
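For the elastic item above, one common way to build a 1-element ready/valid stage is sketched below. It is an assumption, not the reference solution: the names elastic_sketch and elastic_sketch_tb are hypothetical, the datapath_reset_p and datapath_gate_p parameters are left out, and the data register is always gated on (valid_i & ready_o). The Mealy part is that ready_o depends combinationally on ready_i, so a full stage can accept a new word in the same cycle it hands one off.

```systemverilog
// Simplified 1-element ready/valid stage (hypothetical, no parameters
// for reset or gating behavior).
module elastic_sketch #(parameter width_p = 8) (
  input                clk_i,
  input                reset_i,
  input  [width_p-1:0] data_i,
  input                valid_i,
  output               ready_o,
  output               valid_o,
  output [width_p-1:0] data_o,
  input                ready_i
);
  logic               valid_r;
  logic [width_p-1:0] data_r;

  // Accept when empty, or when the consumer takes the held word this cycle.
  assign ready_o = ~valid_r | ready_i;
  assign valid_o = valid_r;
  assign data_o  = data_r;

  always_ff @(posedge clk_i)
    if (reset_i)      valid_r <= 1'b0;
    else if (ready_o) valid_r <= valid_i;

  always_ff @(posedge clk_i)
    if (ready_o & valid_i) data_r <= data_i;
endmodule

module elastic_sketch_tb;
  logic       clk = 0;
  logic       reset = 1;
  logic       valid_i = 0;
  logic       ready_i = 0;
  logic [7:0] data_i = 0;
  wire        ready_o, valid_o;
  wire  [7:0] data_o;

  elastic_sketch #(.width_p(8)) dut (
    .clk_i(clk), .reset_i(reset),
    .data_i(data_i), .valid_i(valid_i), .ready_o(ready_o),
    .valid_o(valid_o), .data_o(data_o), .ready_i(ready_i));

  always #5 clk = ~clk;

  initial begin
    @(posedge clk); #1 reset = 0;
    data_i = 8'hA5; valid_i = 1;        // offer one word to the empty stage
    @(posedge clk); #1 valid_i = 0;     // the word is now held
    $display("held:    valid_o=%b data_o=%h ready_o=%b", valid_o, data_o, ready_o);
    ready_i = 1;                        // consumer takes the word
    @(posedge clk); #1;
    $display("drained: valid_o=%b ready_o=%b", valid_o, ready_o);
    $finish;
  end
endmodule
```

While the word is held and the consumer is not ready, ready_o is low, so the producer is stalled without any data being lost.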

Demonstration (All Students):

Demonstrate your working FIFO by using it to connect between audio input and output

on your FPGA board. You will plug the PMOD I2S2 into PMOD Port B on your board, and then

use 3.5mm cables to connect to the Audio I/O ports to/from your computer/speaker. You should

set your FIFO to a very small depth (e.g. 2) because the Lattice boards do not have memories

that support the ram_1r1w_async pattern. The output must sound the same as the original

audio for credit.

What is the maximum value for depth_p that you can use on your FPGA, before the

toolchains fail to compile?

Part 4: Sinusoid / Fixed Point Representation

Have you ever heard anyone complain about how complicated IEEE 754 floating point

is? The problem is that it’s easy to use (in software), until it isn’t: List of Failures from IEEE 754.

For this reason, floating point arithmetic isn’t used in many safety critical applications. For the

same reasons, floating point numbers aren’t used in signal processing. Fixed point operations

are vastly less complicated than floating point operations, require vastly less area, and are

numerically stable.

Fixed point arithmetic follows the same rules as normal two’s-complement arithmetic. In

that sense, you already know the basics. The difference is that when two fixed-point numbers

are multiplied, the number of fractional digits/bits increases. For example, .5 * .5, which is

representable with one fractional digit, produces .25, which needs two fractional digits to

represent. In fixed point, the fractional digits represent ½ (.5), ¼ (.25), ⅛ (.125), etc. In the

example above, .5 is represented in binary by .1. When you multiply 0.1 and 0.1 (binary), the result needs two fractional bits: 0.01 (binary), or .25 (decimal).

I like to handle fractional bits by declaring the fractional bits in the negative range of the bus. For example, wire [11:-4] foo has 12 integer bits and 4 fractional bits. When foo is multiplied by itself, it produces 24 integer bits and 8 fractional bits, or [23:-8]. However, if a [-1:-4] bus is multiplied by a [11:-4] bus, the result is only a [11:-8] bus.
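The negative-index convention above can be tried directly in simulation. This is a small unsigned sketch with a hypothetical module name and test value; signed operands would additionally need the signed keyword.

```systemverilog
// Fixed-point multiply using negative bit indices for fractional bits.
// fixed_point_demo and the value 1.5 are hypothetical choices.
module fixed_point_demo;
  logic [11:-4] foo;                // 12 integer bits, 4 fractional bits
  wire  [23:-8] prod = foo * foo;   // 24 integer bits, 8 fractional bits

  initial begin
    foo = 16'h0018;                 // 1.5 = 24 / 2**4
    #1;
    // The raw product is 24 * 24 = 576, and 576 / 2**8 = 2.25.
    $display("raw = %0d, value = %f", prod, prod / 256.0);
    $finish;
  end
endmodule
```

The binary point of the product sits at the sum of the operands' fractional widths, which is why the raw result is divided by 2**8 rather than 2**4.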

Here are a few good tutorials:

From Berkeley: https://inst.eecs.berkeley.edu/~cs61c/sp06/handout/fixedpt.html

From UW: https://courses.cs.washington.edu/courses/cse467/08au/labs/l5/fp.pdf

● sinusoid: Using your ram_1r1w_async memory, create a module that generates a sine

wave, turning hexadecimal indices into (signed) 12-bit values. See the demo below for

more information.

Since this is an audio challenge, there is no testbench for this part. If you need

accommodations, please see the instructor. Commit your memory initialization file (in

hex format) along with your solution.
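One possible way to produce a sine table in hex format is a small simulation-only generator, sketched below. The table depth of 16, the amplitude of 2047, the file name sine.hex, and the module name are all assumptions for illustration; the handout does not fix them. Negative samples are written as 12-bit two's complement.

```systemverilog
// Writes one period of a sine wave as 12-bit two's-complement hex words.
// depth, amplitude, and the file name are hypothetical choices.
module sine_hex_gen;
  localparam integer depth     = 16;
  localparam real    pi        = 3.14159265358979;
  localparam real    amplitude = 2047.0;   // fits in signed 12 bits

  integer fd, i, sample;

  initial begin
    fd = $fopen("sine.hex", "w");
    for (i = 0; i < depth; i = i + 1) begin
      // $rtoi truncates toward zero.
      sample = $rtoi(amplitude * $sin(2.0 * pi * i / depth));
      $fdisplay(fd, "%h", sample[11:0]);   // three hex digits per line
    end
    $fclose(fd);
    $finish;
  end
endmodule
```

The resulting file has one word per line, which is the layout $readmemh reads when no addresses are given in the file.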

Demonstration (All Students):

Using your counter module from Lab Assignment 1/2 and your sinusoid module above, play a Tuning-A tone on the speakers in the lab with the PMOD I2S2 module. You need to figure out how to generate a tone at 440 Hz, given that the PLL clock runs at 22.591 MHz, and the I2S2 accepts a Left channel and a Right channel output at approximately 44.1 kHz. The interface to the I2S2 module is Ready-Valid-&. (Note: Do not use the output of your counter as your clock. All of your logic should run at 22.591 MHz.) Implement your solution in sinusoid/top.sv. We will use this link (or similar) to determine if you have succeeded.

The clock frequency in this lab has changed. The signal from the PLL is faster,

22.591MHz, and called clk_o.
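As a starting point for the tone calculation, the ratios between the three frequencies quoted above can be printed from a throwaway simulation. This only does the arithmetic; the module name is hypothetical, and how the ratios map onto your counter and table depth is left to you.

```systemverilog
// Prints the frequency ratios relevant to the 440 Hz tone.
// Values are taken from the handout; 44.1 kHz is approximate there.
module tone_math;
  localparam real clk_hz    = 22.591e6;
  localparam real sample_hz = 44.1e3;
  localparam real tone_hz   = 440.0;

  initial begin
    $display("clock cycles per audio sample : %f", clk_hz / sample_hz);
    $display("audio samples per tone period : %f", sample_hz / tone_hz);
    $display("clock cycles per tone period  : %f", clk_hz / tone_hz);
    $finish;
  end
endmodule
```

None of the ratios is an integer, so any counter-based scheme rounds somewhere; the demo decides by ear whether the rounding is acceptable.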

In both demonstration folders, top.sv instantiates an I2S2-to-AXI-Streaming module,

which drives the I2S2 PMOD. The input and output of this module uses a ready/valid

handshake. The left and right audio channels are separate wires, but you can concatenate them

if you would like (for your FIFO). Drive both, for your sinusoid.

top.sv “works out of the box”. You can test that your setup works by connecting your

computer to the audio input and connecting the audio output into amplified speakers, i.e. those

with a power cable. You will need to instantiate your logic between the interfaces for the demo.

WARNING WARNING WARNING

DO NOT PLUG YOUR HEADPHONES INTO THE AUDIO OUTPUT WHILE THEY ARE IN

YOUR EARS. PLAY MUSIC FIRST, ADJUST VOLUME, THEN PUT IN EARS.

Use make bitstream to build the FPGA bitstream (configuration file) and program the

FPGA. Your FPGA will need to be plugged into a USB port.

Grading:

1. Push your completed assignment to your git repository. Only push your modified files!

2. Submit your assignment through gradescope, and confirm that the autograder runs.

3. Demonstrate each part to a TA.

This lab will be graded on the following criteria. Weights are available in Canvas.

1. Correctness: Is the code in git correct? Does it pass the checks in Gradescope?

2. Lint and Style: Does the solution pass the Verilog Lint Checker run by Gradescope? Are

variable names consistent with what is being taught in class? This may seem pedantic,

but in industry and open source projects this is standard practice.

Hint: use the make lint command to check your code

3. Demonstration: Was the code demonstrated to a TA or instructor before the deadline?

The following will also be considered in your final grade:

4. Language Features: Does the solution use allowed language features? (i.e. Structural vs

Behavioral Verilog). Maximum 50% deduction.

5. Git Hygiene: Does the assignment submission only contain files that are relevant to the

assignment? Please, please, please don’t check in files that aren’t part of the submission.

Maximum 20% deduction.

Finally, modifying any parts of the test/grading infrastructure without permission will

result in a zero on the entire part.

