Abstract#
A single-cycle CPU model was designed and simulated in Verilog with the Quartus software, following a structured behavioral description method. The model executes its test program correctly.
Keywords: Overall, System
Chapter 1 Principles and Model Design#
1.1 Principles Related to the Experiment#
Von Neumann Computer Working Principle#
- The computer consists of five main parts: controller, arithmetic unit, memory, input devices, and output devices.
- Programs and data are stored in memory in binary code form without distinction, with the storage location determined by the address.
- The controller works according to the instruction sequence (program) stored in memory, and the execution of instructions is controlled by a program counter.
- Single-cycle CPU: each instruction is executed within a single clock cycle, and the next instruction begins on the following cycle.
- Instruction set: the collection of all instructions that a computer can execute.
- Instruction cycle: the time required to fetch, decode, and execute an instruction; in general, different instructions may have different instruction cycle lengths.
- Steps for a single-cycle CPU to process an instruction:
  Fetch instruction -> Instruction decode -> Instruction execution -> Memory access -> Result write back
1.2 Model Design#
Structural Block Diagram#
Control Circuit Diagram#
Corresponding Modules and Functions#
InstructionMemory: Instruction memory, retrieves the corresponding instruction from memory based on the input address.
CU: Control unit, analyzes instructions, determines what operations should be performed, and issues control signals to the corresponding components according to the determined timing.
Register: Data register (Figure 1.2.4), temporarily stores the results produced by the ALU and the data transferred to and from memory. Because a register cannot be read and written simultaneously within a single cycle, incoming writes are first latched in a one-level buffer on the clock edge and then committed to the register file.
ALU: Arithmetic Logic Unit (Figure 1.2.5), performs corresponding arithmetic and logical operations based on the input opcode and data.
memRam: Data memory (Figure 1.2.6), stores data.
PC: Program counter, holds the address of the instruction being fetched. After each instruction completes, the PC is either incremented by one or, for a jump instruction, loaded with the target address given by that instruction; the new value is the address of the next instruction.
Chapter 2 Instruction Set Design#
2.1 Opcode Format#
This experiment uses fixed-length encoding: every machine instruction is 16 bits wide, and the top 4 bits hold the opcode. The following opcodes are defined according to actual needs:
`define ADD 4'b0000 // Arithmetic addition
`define INC 4'b0001 // Arithmetic add 1
`define NOT 4'b0010 // Logical NOT
`define AND 4'b0011 // Logical AND
`define OR 4'b0100 // Logical OR
`define SLF 4'b0101 // Data left shift
`define SRF 4'b0110 // Data right shift
`define JMP 4'b0111 // Unconditional jump
`define STO 4'b1000 // Write to memory, store data
`define LAD 4'b1001 // Read from memory, fetch data
`define MOV 4'b1010 // Data transfer
`define HAT 4'b1111 // Halt
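To make the layout concrete, the following is a minimal sketch of how a 16-bit instruction word splits into four 4-bit fields, with the opcode in the top four bits. The module name `instr_fields` and the port names `field2`, `field1`, and `field0` are hypothetical and are not part of the original design.

```verilog
// Hypothetical helper (not part of the original design): splits a 16-bit
// instruction word into its opcode and three 4-bit operand fields.
module instr_fields(
    input  [15:0] instr,
    output [3:0]  op,      // instr[15:12]: opcode, e.g. 4'b0000 for ADD
    output [3:0]  field2,  // instr[11:8]
    output [3:0]  field1,  // instr[7:4]
    output [3:0]  field0   // instr[3:0]
);
    // The concatenation on the left-hand side is filled from the MSB down.
    assign {op, field2, field1, field0} = instr;
endmodule
```

How the three operand fields are interpreted depends on the opcode, as the formats in section 2.2 show.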
2.2 Instructions and Formats#
=> Arithmetic Operation Instructions#
(1) ADD rd, rs, rt
0000 | rd (4 bits) | rs (4 bits) | rt (4 bits) |
---|---|---|---|
Function: rd <- rs + rt (Arithmetic addition)
(2) INC rt, rs
0001 | rt (4 bits) | rs (4 bits) | 0000 (unused) |
---|---|---|---|
Function: rt <- rs + 1 (Arithmetic add 1)
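As a worked example of the two arithmetic formats above, the sketch below assembles `ADD rg2, rg1, rg0` and `INC rg1, rg0` by concatenating their fields and prints the resulting words. The module name `encode_demo` and the local parameter names are assumptions made for illustration; the bit patterns are the ones defined in this document.

```verilog
`timescale 1ns / 1ps
// Illustrative only: encodes two arithmetic instructions by concatenation.
module encode_demo;
    localparam [3:0] ADD = 4'b0000, INC = 4'b0001;
    localparam [3:0] RG0 = 4'b0000, RG1 = 4'b0001, RG2 = 4'b0010;
    reg [15:0] word;

    initial begin
        // ADD rd, rs, rt: opcode | rd | rs | rt
        word = {ADD, RG2, RG1, RG0};
        $display("ADD rg2, rg1, rg0 = %b (0x%h)", word, word); // 0000001000010000 (0x0210)

        // INC rt, rs: opcode | rt | rs | 0000 (unused)
        word = {INC, RG1, RG0, 4'b0000};
        $display("INC rg1, rg0      = %b (0x%h)", word, word); // 0001000100000000 (0x1100)
        $finish;
    end
endmodule
```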
=> Logical Operation Instructions#
(3) NOT rt, rs
0010 | rt (4 bits) | rs (4 bits) | 0000 (unused) |
---|---|---|---|
Function: rt <- ~rs (Bitwise NOT operation)
(4) AND rd, rs, rt
0011 | rd (4 bits) | rs (4 bits) | rt (4 bits) |
---|---|---|---|
Function: rd <- rs & rt (Logical AND operation)
(5) OR rd, rs, rt
0100 | rd (4 bits) | rs (4 bits) | rt (4 bits) |
---|---|---|---|
Function: rd <- rs | rt (Logical OR operation)
=> Shift Instructions#
(6) SLF rd, rs, n
0101 | rd (4 bits) | rs (4 bits) | n (4 bits, shift amount) |
---|---|---|---|
Function: rd <- rs << n (Logical left shift by the immediate n)
(7) SRF rd, rs, n
0110 | rd (4 bits) | rs (4 bits) | n (4 bits, shift amount) |
---|---|---|---|
Function: rd <- rs >> n (Logical right shift by the immediate n)
=> Memory Read/Write Instructions#
(8) STO rt, rs
1000 | rt (3 bits) | 0 (unused) | rs (4 bits) | 0000 (unused) |
---|---|---|---|---|
Function: mem[rt] <- rs (Write the data in register rs to data memory address rt; here rt is a 3-bit memory address, not a register)
(9) LAD rt, rs
1001 | rt (4 bits) | rs (3 bits) | 00000 (unused) |
---|---|---|---|
Function: rt <- mem[rs] (Read the data at data memory address rs into register rt; here rs is a 3-bit memory address, not a register)
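Because the two memory formats use 3-bit address fields at different positions, a short sketch may help. It builds one STO word and one LAD word and then slices out the fields at the bit positions implied by the formats above. The module name `mem_format_demo` and the local parameter names are hypothetical.

```verilog
`timescale 1ns / 1ps
// Illustrative only: shows where the STO and LAD fields sit in the 16-bit word.
module mem_format_demo;
    localparam [3:0] STO = 4'b1000, LAD = 4'b1001;
    localparam [3:0] RG0 = 4'b0000, RG1 = 4'b0001;
    reg [15:0] sto_word, lad_word;

    initial begin
        // STO: opcode | 3-bit memory address | 1 unused bit | source register | 4 unused bits
        sto_word = {STO, 3'b001, 1'b0, RG1, 4'b0000};
        // LAD: opcode | destination register | 3-bit memory address | 5 unused bits
        lad_word = {LAD, RG0, 3'b000, 5'b00000};

        $display("STO word %b: memory address %b, source register %b",
                 sto_word, sto_word[11:9], sto_word[7:4]);
        $display("LAD word %b: destination register %b, memory address %b",
                 lad_word, lad_word[11:8], lad_word[7:5]);
        $finish;
    end
endmodule
```

The memory address occupies bits [11:9] in STO but bits [7:5] in LAD, so the control unit has to select the slice according to the opcode.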
=> Unconditional Jump Instructions#
(10) JMP
0111 | 0000 (unused) | Jump instruction address (8 bits) |
---|---|---|
Function: Jump to the specified instruction address
=> Halt Instruction#
(11) HAT
1111 | 000000000000 (unused) |
---|---|
Function: Halt, the value of PC remains unchanged
Chapter 3 Model Implementation and Testing#
3.1 Verilog Program Design#
headfile.v#
`ifndef HEADFILE_H_
`define HEADFILE_H_ // Include guard: define the symbol so that repeated includes are skipped
`define ADD 4'b0000 // Arithmetic addition
`define INC 4'b0001 // Arithmetic add 1
`define NOT 4'b0010 // Logical NOT
`define AND 4'b0011 // Logical AND
`define OR 4'b0100 // Logical OR
`define SLF 4'b0101 // Data left shift
`define SRF 4'b0110 // Data right shift
`define JMP 4'b0111 // Unconditional jump
`define STO 4'b1000 // Write to memory, store data
`define LAD 4'b1001 // Read from memory, fetch data
`define MOV 4'b1010 // Data transfer
`define HAT 4'b1111 // Halt
`define rg0 4'b0000 // Register 0
`define rg1 4'b0001 // Register 1
`define rg2 4'b0010 // Register 2
`endif
alu.v#
`timescale 1ns / 1ps
`include "headfile.v"
// ALU, performs logical and arithmetic operations
module alu(op, a, b, n, f);
input [3:0] op, n;
input [7:0] a, b;
output [7:0] f;
reg [7:0] f;
always@(*)
begin
case(op)
`ADD: f = a + b;
`INC: f = a + 1;
`NOT: f = ~a;
`AND: f = a & b;
`OR: f = a | b;
`SLF: f = a << n;
`SRF: f = a >> n;
default: f = 8'b00000000;
endcase
end
endmodule
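The ALU can be checked on its own before it is wired into the CPU. The following is a minimal self-checking testbench sketch; the module name `alu_tb` and the task `check` are hypothetical, and the operand values are the ones used later in the test program. It assumes `headfile.v` and `alu.v` are compiled in the same project.

```verilog
`timescale 1ns / 1ps
`include "headfile.v"
// Hypothetical unit test for the alu module (not part of the original design).
module alu_tb;
    reg  [3:0] op, n;
    reg  [7:0] a, b;
    wire [7:0] f;
    integer errors;

    alu dut(.op(op), .a(a), .b(b), .n(n), .f(f));

    // Apply one set of inputs, wait for the combinational output, compare.
    task check;
        input [3:0] t_op;
        input [7:0] t_a, t_b;
        input [3:0] t_n;
        input [7:0] expected;
        begin
            op = t_op; a = t_a; b = t_b; n = t_n;
            #1;
            if (f !== expected) begin
                errors = errors + 1;
                $display("FAIL op=%b a=%b b=%b n=%0d: f=%b, expected %b",
                         t_op, t_a, t_b, t_n, f, expected);
            end
        end
    endtask

    initial begin
        errors = 0;
        check(`ADD, 8'd2,        8'd3, 4'd0, 8'd5);
        check(`INC, 8'd2,        8'd0, 4'd0, 8'd3);
        check(`NOT, 8'b00000010, 8'd0, 4'd0, 8'b11111101);
        check(`AND, 8'b00000011, 8'b00000010, 4'd0, 8'b00000010);
        check(`OR,  8'b00000011, 8'b00000010, 4'd0, 8'b00000011);
        check(`SLF, 8'b11111101, 8'd0, 4'd1, 8'b11111010);
        check(`SRF, 8'b11111101, 8'd0, 4'd1, 8'b01111110);
        check(`JMP, 8'd7,        8'd9, 4'd0, 8'd0); // non-ALU opcode hits the default branch
        if (errors == 0) $display("alu_tb: all checks passed");
        else             $display("alu_tb: %0d check(s) failed", errors);
        $finish;
    end
endmodule
```

The shift cases show that the result is truncated to 8 bits and that the right shift is logical, with zeros shifted in from the left.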
memRam.v#
`timescale 1ns / 1ps
`include "headfile.v"
// Memory
module memRam(data, wren, address, inclock, outclock, q);
parameter wordsize = 8;
parameter memsize = 8;
parameter addr = 3; // 3-bit address line
input [wordsize-1:0] data;
input [addr-1:0] address;
input wren, inclock, outclock;
output [wordsize-1:0] q;
reg [wordsize-1:0] q;
reg [wordsize-1:0] ram [memsize-1:0];
integer i;
initial
begin // Initialization
for(i=0; i<8; i=i+1)
ram[i] = 8'b00000000;
ram[0] = 8'b00000010; // Write 2 to position 0
end
always@(posedge inclock) // Triggered on the rising edge of inclock
begin
if(~wren)
ram[address] = data; // When wren is low, write data to the corresponding address
end
always@(posedge outclock) // Triggered on the rising edge of outclock
begin
if(wren)
q = ram[address]; // When wren is high, read data from the corresponding address
end
endmodule
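Note that `wren` is active-low for writes in this memory: a write happens when `wren` is 0 and a read when `wren` is 1. The sketch below exercises both directions with a single clock driving `inclock` and `outclock`, as the top-level module does. The testbench name `memRam_tb` and the value 42 are assumptions chosen for illustration.

```verilog
`timescale 1ns / 1ps
// Hypothetical unit test for memRam (not part of the original design).
module memRam_tb;
    reg  [7:0] data;
    reg  [2:0] address;
    reg        wren, clk;
    wire [7:0] q;

    memRam dut(.data(data), .wren(wren), .address(address),
               .inclock(clk), .outclock(clk), .q(q));

    always #5 clk = ~clk;

    initial begin
        clk = 0;
        // Write phase: wren low, so the rising edge stores data at address 3.
        wren = 0; address = 3'd3; data = 8'd42;
        @(posedge clk); #1;

        // Read phase: wren high, so the rising edge copies ram[3] to q.
        wren = 1;
        @(posedge clk); #1;
        $display("ram[3] read back as %0d", q); // 42

        // Address 0 still holds the initial value written by the memory itself.
        address = 3'd0;
        @(posedge clk); #1;
        $display("ram[0] read back as %0d", q); // 2
        $finish;
    end
endmodule
```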
Register.v#
`timescale 1ns / 1ps
`include "headfile.v"
// Register
module Register(clk, data, wren, inaddr, outaddr1, outaddr2,
regtoalu1, regtoalu2, regtomemaddr, regtomem,
memtoregwren, memtoregaddr, memtoregdata);
input [7:0] data, memtoregdata; // memtoregdata is driven by the data memory, so it is an input
input [3:0] inaddr, outaddr1, outaddr2, regtomemaddr, memtoregaddr;
input wren, clk, memtoregwren;
output [7:0] regtoalu1, regtoalu2, regtomem;
reg [7:0] regmem [15:0];
reg lwren, lmemtoregwren;
reg [3:0] linaddr, lmemtoregaddr;
reg [7:0] ldata, lmemtoregdata;
integer i;
initial
begin // Initialization
lwren = 1'b0;
lmemtoregwren = 1'b0;
for(i=0; i<16; i=i+1)
regmem[i] = 8'b00000000;
end
always@(posedge clk) // Cache
begin
lwren <= wren;
linaddr <= inaddr;
ldata <= data;
lmemtoregwren <= memtoregwren;
lmemtoregaddr <= memtoregaddr;
lmemtoregdata <= memtoregdata;
end
always@(*)
begin
if(lwren)
regmem[linaddr] <= ldata; // Write data to the corresponding address
if(lmemtoregwren)
regmem[lmemtoregaddr] <= lmemtoregdata;
end
assign regtoalu1 = regmem[outaddr1]; // Fetch value from register
assign regtoalu2 = regmem[outaddr2];
assign regtomem = regmem[regtomemaddr];
endmodule
InstructionMemory.v#
`timescale 1ns / 1ps
`include "headfile.v"
// Instruction storage
module InstructionMemory(A, RD);
input [7:0] A;
output [15:0] RD;
reg [15:0] IM [29:0];
assign RD = IM[A]; // Immediately fetch content based on address
// After successful execution, the data in memory positions 0 to 7 should be: 2, 3, 5, 2, 3, 253, 250, 126
initial begin
IM[0] = {`LAD, `rg0, 3'b000, 5'b00000}; // Read data from memory position 0 into register rg0, rg0 = 2
/*-----------------------------------------------------------------*/
IM[1] = {`INC, `rg1, `rg0, 4'b0000}; // Add 1 to the data in register rg0 and move the result to rg1, rg1 = rg0 + 1 = 2 + 1 = 3
IM[2] = {`STO, 3'b001, 1'b0, `rg1, 4'b0000}; // Store the data in register rg1 into memory position 1, 3
/*-----------------------------------------------------------------*/
IM[3] = {`ADD, `rg2, `rg1, `rg0}; // Add the numbers in registers rg0 and rg1 and store the result in rg2, rg2 = rg0 + rg1 = 2 + 3 = 5
IM[4] = {`STO, 3'b010, 1'b0, `rg2, 4'b0000}; // Store the data in register rg2 into memory position 2, 5
/*-----------------------------------------------------------------*/
IM[5] = {`JMP, 4'b0000, 8'b00000111}; // Jump to instruction 7 (IM[7]), skipping the halt in IM[6]
IM[6] = {`HAT, 12'b000000000000}; // If the jump is unsuccessful, it will halt
/*-----------------------------------------------------------------*/
IM[7] = {`AND, `rg2, `rg1, `rg0}; // Perform AND operation on the numbers in registers rg0 and rg1 and store in rg2, rg2 = rg1 & rg0 = 00000011 & 00000010 = 00000010 (2)
IM[8] = {`STO, 3'b011, 1'b0, `rg2, 4'b0000}; // Store the data in register rg2 into memory position 3, 2
/*-----------------------------------------------------------------*/
IM[9] = {`OR, `rg2, `rg1, `rg0}; // Perform OR operation on the numbers in registers rg0 and rg1 and store in rg2, rg2 = rg1 | rg0 = 00000011 | 00000010 = 00000011 (3)
IM[10] = {`STO, 3'b100, 1'b0, `rg2, 4'b0000}; // Store the data in register rg2 into memory position 4, 3
/*-----------------------------------------------------------------*/
IM[11] = {`NOT, `rg2, `rg0, 4'b0000}; // Perform NOT operation on the data in register rg0 and store the result in rg2, rg2 = ~rg0 = ~00000010 = 11111101 (253)
IM[12] = {`STO, 3'b101, 1'b0, `rg2, 4'b0000}; // Store the data in register rg2 into memory position 5, 253
/*-----------------------------------------------------------------*/
IM[13] = {`SLF, `rg0, `rg2, 4'b0001}; // Left shift the data in rg2 by one position and store the result in rg0, rg0 = rg2 << 1 = 11111101 << 1 = 11111010 (250)
IM[14] = {`STO, 3'b110, 1'b0, `rg0, 4'b0000}; // Store the data in register rg0 into memory position 6, 250
/*-----------------------------------------------------------------*/
IM[15] = {`SRF, `rg1, `rg2, 4'b0001}; // Right shift the data in rg2 by one position and store the result in rg1, rg1 = rg2 >> 1 = 11111101 >> 1 = 01111110 (126)
IM[16] = {`STO, 3'b111, 1'b0, `rg1, 4'b0000}; // Store the data in register rg1 into memory position 7, 126
/*-----------------------------------------------------------------*/
IM[17] = {`HAT, 12'b000000000000}; // Halt
IM[18] = 16'b0000000000000000;
IM[19] = 16'b0000000000000000;
IM[20] = 16'b0000000000000000;
IM[21] = 16'b0000000000000000;
IM[22] = 16'b0000000000000000;
IM[23] = 16'b0000000000000000;
IM[24] = 16'b0000000000000000;
IM[25] = 16'b0000000000000000;
IM[26] = 16'b0000000000000000;
IM[27] = 16'b0000000000000000;
IM[28] = 16'b0000000000000000;
IM[29] = 16'b0000000000000000;
end
endmodule
CU.v#
`timescale 1ns / 1ps
`include "headfile.v"
// Control the distribution of data under different instructions
module CU(
input [15:0] instr,
output enable,
output reg [3:0] regoutaddr1,
output reg [3:0] regoutaddr2,
output reg [3:0] reginaddr,
output reg [3:0] regtomemaddr,
output reg [3:0] memtoregaddr,
output reg [3:0] aluop,
output reg [3:0] alun,
output reg memwren,
output reg memtoregwren,
output reg [2:0] memaddr,
output reg [7:0] pcnextaddr,
output reg pcnext,
output reg pcflag,
output reg regwren);
wire [3:0] op;
assign op = instr[15:12];
initial
begin
regwren = 1'b0;
memtoregwren <= 1'b0;
memwren = 1'b1;
pcnext = 1'b0;
pcflag = 1'b0;
end
always@(*)
begin
if((op == `ADD)||(op == `AND)||(op == `OR))
begin
aluop <= instr[15:12];
regoutaddr1 <= instr[3:0];
regoutaddr2 <= instr[7:4];
regwren <= 1'b1;
reginaddr <= instr[11:8];
end
else if((op == `SLF)||(op == `SRF))
begin
aluop <= instr[15:12];
alun <= instr[3:0];
regoutaddr1 <= instr[7:4];
regwren <= 1'b1;
reginaddr <= instr[11:8];
end
else if((op == `INC)||(op == `NOT))
begin
aluop <= instr[15:12];
regoutaddr1 <= instr[7:4];
regwren <= 1'b1;
reginaddr <= instr[11:8];
end
else if((op == `STO))
begin
regtomemaddr <= instr[7:4];
memaddr <= instr[11:9];
memwren <= 1'b0;
end
else if((op == `LAD))
begin
memaddr <= instr[7:5];
memwren <= 1'b1;
memtoregaddr <= instr[11:8];
memtoregwren <= 1'b1;
end
else
begin
regwren <= 1'b0;
memtoregwren <= 1'b0;
//memwren <= 1'b1;
end
end
always@(*)
begin
if((op == `JMP))
begin
pcnextaddr <= instr[7:0];
pcnext <= 1'b1;
pcflag <= 1'b1;
end
else
pcnext <= 1'b0;
end
assign enable = ~(op == `HAT);
endmodule
CPU_top.v#
`timescale 1ns / 1ps
// Top-level module for connecting various modules
module CPU_top(
input clk,
input reset,
output [7:0] OPT_PC
);
reg [7:0] PC;
wire [15:0] instr;
wire [7:0] aluout;
wire [3:0] alun;
wire [3:0] aluop;
wire regwren, enable, memwren, memtoregwren, pcnext, pcflag;
wire [2:0] memaddr;
wire [3:0] memtoregaddr;
wire [3:0] Reginaddr;
wire [3:0] Regoutaddr1;
wire [3:0] Regoutaddr2;
wire [3:0] regtomemaddr;
wire [7:0] Registerout1;
wire [7:0] Registerout2;
wire [7:0] memtoregdata;
wire [7:0] regtomem;
wire [7:0] NextPC;
wire [7:0] pcnextaddr;
initial begin
PC = 8'b00000000;
end
InstructionMemory IM(
.A(PC),
.RD(instr)
);
CU m0(
.instr(instr),
.enable(enable),
.regoutaddr1(Regoutaddr1),
.regoutaddr2(Regoutaddr2),
.reginaddr(Reginaddr),
.regtomemaddr(regtomemaddr),
.memtoregaddr(memtoregaddr),
.aluop(aluop),
.alun(alun),
.memwren(memwren),
.memtoregwren(memtoregwren),
.memaddr(memaddr),
.pcnextaddr(pcnextaddr),
.pcnext(pcnext),
.pcflag(pcflag),
.regwren(regwren)
);
Register R(
.clk(clk),
.data(aluout),
.wren(regwren),
.inaddr(Reginaddr),
.outaddr1(Regoutaddr1),
.outaddr2(Regoutaddr2),
.regtoalu1(Registerout1),
.regtoalu2(Registerout2),
.regtomemaddr(regtomemaddr),
.regtomem(regtomem),
.memtoregwren(memtoregwren),
.memtoregaddr(memtoregaddr),
.memtoregdata(memtoregdata)
);
alu A(
.op(aluop),
.a(Registerout1),
.b(Registerout2),
.n(alun),
.f(aluout)
);
memRam M(
.data(regtomem),
.wren(memwren),
.address(memaddr),
.inclock(clk),
.outclock(clk),
.q(memtoregdata)
);
assign NextPC = (pcnext) ? pcnextaddr : (PC + 1'b1); // Determine whether to jump to the specified address or execute the next instruction
always@(posedge clk) // Timing, PC value changes once per cycle
begin
if(reset)
PC <= 0;
else
begin
if(enable)
PC <= NextPC;
else
PC <= PC; // Halt instruction, PC value remains unchanged
end
end
assign OPT_PC = PC;
endmodule
3.2 Test Program#
CPU_test.v#
`timescale 1ns / 1ps
module CPU_test(OPT_PC);
output [7:0] OPT_PC;
reg clk;
reg reset;
CPU_top uut(
.clk(clk),
.reset(reset),
.OPT_PC(OPT_PC)
);
// After successful execution, the data in memory positions 0 to 7 should be: 2, 3, 5, 2, 3, 253, 250, 126
initial begin
clk = 0;
reset = 1; // Initialize CPU
#100;
reset = 0;
$display(" pc: instr : ALUR :rg0:rg1:rg2: m0: m1: m2: m3: m4: m5: m6: m7");
$monitor("%d:%b:%b:%d:%d:%d:%d:%d:%d:%d:%d:%d:%d:%d",
uut.PC, uut.instr, uut.aluout, uut.R.regmem[0], uut.R.regmem[1], uut.R.regmem[2],
uut.M.ram[0], uut.M.ram[1], uut.M.ram[2], uut.M.ram[3],
uut.M.ram[4], uut.M.ram[5], uut.M.ram[6], uut.M.ram[7] );
#2000 $stop;
end
always #50 clk = ~clk;
endmodule
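The testbench above relies on reading the `$monitor` output by eye. As an alternative, the following sketch compares the final memory contents against the expected values stated in this document and prints a pass/fail summary. The module name `CPU_selfcheck` and the `expected` array are hypothetical; the clock period, reset timing, and run length mirror `CPU_test`.

```verilog
`timescale 1ns / 1ps
// Hypothetical self-checking variant of CPU_test (not part of the original design).
module CPU_selfcheck;
    reg clk, reset;
    wire [7:0] OPT_PC;
    reg [7:0] expected [0:7];
    integer i, errors;

    CPU_top uut(.clk(clk), .reset(reset), .OPT_PC(OPT_PC));

    always #50 clk = ~clk;

    initial begin
        // Expected final contents of data memory positions 0 to 7.
        expected[0] = 8'd2;   expected[1] = 8'd3;
        expected[2] = 8'd5;   expected[3] = 8'd2;
        expected[4] = 8'd3;   expected[5] = 8'd253;
        expected[6] = 8'd250; expected[7] = 8'd126;

        clk = 0;
        reset = 1;            // Hold the PC at 0
        #100 reset = 0;
        #2000;                // Same run length as CPU_test

        errors = 0;
        for (i = 0; i < 8; i = i + 1)
            if (uut.M.ram[i] !== expected[i]) begin
                errors = errors + 1;
                $display("ram[%0d] = %0d, expected %0d", i, uut.M.ram[i], expected[i]);
            end
        // The program ends on the halt at instruction 17, where the PC should stay.
        if (OPT_PC !== 8'd17) begin
            errors = errors + 1;
            $display("PC = %0d, expected 17", OPT_PC);
        end

        if (errors == 0) $display("CPU_selfcheck: PASS");
        else             $display("CPU_selfcheck: FAIL, %0d mismatch(es)", errors);
        $stop;
    end
endmodule
```

The comparison uses `!==` so that an unknown (`x`) memory value counts as a mismatch instead of silently passing.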
3.3 Analysis of the Execution Process of the Model#
Simulation Waveform Obtained#
Console Output#
Instruction Analysis#
- Instruction 0: Read data from memory position 0 into register rg0; memory position 0 holds 2, so rg0 = 2
- Instruction 1: Add 1 to the data in register rg0 and store the result in rg1, rg1 = rg0 + 1 = 2 + 1 = 3
- Instruction 2: Store the data in register rg1 into memory position 1, memory position 1 is 3
- Instruction 3: Add the numbers in registers rg0 and rg1 and store the result in rg2, rg2 = rg0 + rg1 = 2 + 3 = 5
- Instruction 4: Store the data in register rg2 into memory position 2, memory position 2 is 5
- Instruction 5: Jump to instruction 7
- Instruction 6: Halt; reached only if the jump is unsuccessful
- Instruction 7: Perform AND operation on the numbers in registers rg0 and rg1 and store in rg2, rg2 = 00000011 & 00000010 = 00000010 (2)
- Instruction 8: Store the data in register rg2 into memory position 3, memory position 3 is 2
- Instruction 9: Perform OR operation on the numbers in registers rg0 and rg1 and store in rg2, rg2 = 00000011 | 00000010 = 00000011 (3)
- Instruction 10: Store the data in register rg2 into memory position 4, memory position 4 is 3
- Instruction 11: Perform NOT operation on the data in register rg0 and store the result in rg2, rg2 = ~00000010 = 11111101 (253)
- Instruction 12: Store the data in register rg2 into memory position 5, memory position 5 is 253
- Instruction 13: Left shift the data in rg2 by one position and store the result in rg0, rg0 = rg2 << 1 = 11111101 << 1 = 11111010 (250)
- Instruction 14: Store the data in register rg0 into memory position 6, memory position 6 is 250
- Instruction 15: Right shift the data in rg2 by one position and store the result in rg1, rg1 = rg2 >> 1 = 11111101 >> 1 = 01111110 (126)
- Instruction 16: Store the data in register rg1 into memory position 7, memory position 7 is 126
- Instruction 17: Halt
Result Analysis#
- Instructions 0-2: Test memory read and arithmetic add 1 (INC), load 2 from memory position 0, calculate 2+1=3, store 3 in memory position 1
- Instructions 3-4: Test addition, calculate 2+3=5, store 5 in memory position 2
- Instructions 5-6: Test jump instruction
- Instructions 7-8: Test AND operation, calculate 00000011 & 00000010, store 2 in memory position 3
- Instructions 9-10: Test OR operation, calculate 00000011 | 00000010, store 3 in memory position 4
- Instructions 11-12: Test NOT operation, calculate ~00000010, store 253 in memory position 5
- Instructions 13-14: Test left shift, store 250 in memory position 6
- Instructions 15-16: Test right shift, store 126 in memory position 7
From the waveform diagram (Figure 3.3.1), it can be seen that the jump from instruction 5 to instruction 7 was successful, and after instruction 17, the value of PC remains unchanged each cycle.
From the console output (Figure 3.3.2), the values in memory positions 0 to 7 are consistent with the analysis of the instructions: 2, 3, 5, 2, 3, 253, 250, 126.