Tuesday, August 6, 2013

Instruction Execution Cycle in CPU



http://en.wikipedia.org/wiki/Instruction_cycle
Instruction cycle
An instruction cycle (sometimes called fetch-and-execute cycle, fetch-decode-execute cycle, or FDX) is the basic operation cycle of a computer. It is the process by which a computer retrieves a program instruction from its memory, determines what actions the instruction requires, and carries out those actions. This cycle is repeated continuously by the central processing unit (CPU), from bootup to when the computer is shut down.
http://upload.wikimedia.org/wikipedia/commons/thumb/5/52/Comp_fetch_execute_cycle.png/220px-Comp_fetch_execute_cycle.png
A diagram of the Fetch Execute Cycle.
Circuits Used
The circuits used in the CPU during the cycle are:
  • Program counter (PC) - an incrementing counter that keeps track of the memory address of the instruction that is to be executed next.
  • Memory address register (MAR) - holds the address of a memory block to be read from or written to.
  • Memory data register (MDR) - a two-way register that holds data fetched from memory (and ready for the CPU to process) or data waiting to be stored in memory
  • Instruction register (IR) - a temporary holding ground for the instruction that has just been fetched from memory
  • Control unit (CU) - decodes the program instruction in the IR, selecting machine resources such as a data source register and a particular arithmetic operation, and coordinates activation of those resources
  • Arithmetic logic unit (ALU) - performs mathematical and logical operations
Each computer's CPU can have different cycles based on different instruction sets, but will be similar to the following cycle:
1. Fetching the instruction
The next instruction is fetched from the memory address that is currently stored in the program counter (PC), and stored in the instruction register (IR). At the end of the fetch operation, the PC points to the next instruction that will be read at the next cycle.
2. Decode the instruction
The decoder interprets the instruction. During this cycle the instruction inside the IR (instruction register) gets decoded.
3. Read the effective address
In the case of a memory instruction (direct or indirect), the execution phase takes place at the next clock pulse. If the instruction has an indirect address, the effective address is read from main memory, and any required data is fetched from main memory and placed into data registers (clock pulse T3). If the instruction is direct, nothing is done at this clock pulse. If this is an I/O instruction or a register instruction, the operation is performed (executed) at this clock pulse.
4. Execute the instruction
The control unit of the CPU passes the decoded information as a sequence of control signals to the relevant function units of the CPU to perform the actions required by the instruction such as reading values from registers, passing them to the ALU to perform mathematical or logic functions on them, and writing the result back to a register. If the ALU is involved, it sends a condition signal back to the CU.
The result generated by the operation is stored in the main memory, or sent to an output device. Based on the condition of any feedback from the ALU, Program Counter may be updated to a different address from which the next instruction will be fetched.
The cycle is then repeated.
Initiating the cycle
The cycle starts immediately when power is applied to the system using an initial PC value that is predefined for the system architecture (in Intel IA-32 CPUs, for instance, the predefined PC value is 0xfffffff0). Typically this address points to instructions in a read-only memory (ROM) which begin the process of loading the operating system. (That loading process is called booting.) [1]
Fetch cycle
Step 1 of the Instruction Cycle is called the Fetch Cycle. These steps are the same for each instruction. The fetch cycle processes the instruction from the instruction word which contains an opcode.
Decode
Step 2 of the Instruction Cycle is called the Decode. The opcode fetched from memory is decoded for the next steps and moved to the appropriate registers...
Read the effective address
Step 3 determines which kind of operation the instruction is. If it is a memory operation, the computer checks in this step whether it is a direct or an indirect memory operation:
  • Direct memory instruction - Nothing is done.
  • Indirect memory instruction - The effective address is read from memory.
If it is an I/O or register instruction, the computer checks its kind and executes the instruction.
Execute cycle
Step 4 of the Instruction Cycle is the Execute Cycle. These steps will change with each instruction.
Data is transferred between the CPU and the I/O module. Next, the arithmetic and logical operations specified by the instruction are executed on the data, as are other operations such as a jump, which loads a different address into the program counter.
The Fetch-Execute cycle in Transfer Notation
MAR <- [PC]
MDR <- [Memory at the MAR address]; PC <- [PC] + 1 (increment the PC for the next cycle at the same time)
IR <- [MDR]

The registers used above, besides the ones described earlier, are the Memory Address Register (MAR) and the Memory Data Register (MDR), which are used (at least conceptually) in the accessing of memory. Often, the MDR is expressed as the MBR (Memory Buffer Register).
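The three transfers of the fetch cycle can be traced in a few lines of Python. This is only a minimal sketch: the function name `fetch`, the dict of registers, and the sample address and instruction word are assumptions made up for illustration, and memory is modeled as a word-addressed dict.

```python
def fetch(regs, memory):
    """Perform one fetch cycle on a dict of registers and a word-addressed memory."""
    regs["MAR"] = regs["PC"]            # MAR <- [PC]
    regs["MDR"] = memory[regs["MAR"]]   # MDR <- [Memory at the MAR address]
    regs["PC"] = regs["PC"] + 1         # PC <- [PC] + 1
    regs["IR"] = regs["MDR"]            # IR <- [MDR]
    return regs


# Hypothetical starting state: one instruction word stored at address 0x10.
regs = fetch({"PC": 0x10, "MAR": 0, "MDR": 0, "IR": 0}, {0x10: 0x1234})
print(hex(regs["IR"]), hex(regs["PC"]))
```

After the call, the IR holds the fetched word and the PC already points at the following address, which is the state the decode step expects.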
Fetch and execute example (written in RTL - Register Transfer Language):
PC=0x5AF, AC=0x7EC3, M[0x5AF]=0x932E, M[0x32E]=0x09AC, M[0x9AC]=0x8B9F.
T0 : AR = 0x5AF (PC)
T1 : IR = 0x932E (M[AR]), PC=0x5B0 (PC + 1)
T2 : DECODE (IR) = ADD opCode, AR=0x32E, I=1 (Indirect instruction)
T3 : AR = 0x9AC (M[AR])
T4 : DR = 0x8B9F (M[AR])
T5 : AC = 0x8B9F + 0x7EC3 = 0x0A62, E = 1 (carry out), SC = 0
Summary: this is an example of an ADD instruction that uses indirect addressing: the address field of the instruction points to a memory word that holds the effective address. The steps T0 to T5 have the following meaning:
T0-T1 : Fetch operation
T2 : Decode operation
T3-T4 : Indirect Memory reference
T5 : Execute ADD operation
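
The trace T0 to T5 can be reproduced with a short Python sketch. The word layout used here is an assumption inferred from the example itself (16-bit words, bit 15 as the indirect flag I, bits 14-12 as the opcode with 001 meaning ADD, bits 11-0 as the address); the function name `run_indirect_add` is hypothetical, and only the ADD case is handled.

```python
def run_indirect_add(pc, ac, memory):
    """Trace T0-T5 for one ADD instruction and return the final register values."""
    ar = pc                            # T0: AR <- PC
    ir = memory[ar]                    # T1: IR <- M[AR]
    pc = pc + 1                        #     PC <- PC + 1
    indirect = (ir >> 15) & 0x1        # T2: decode, bit 15 is the indirect flag I (assumed layout)
    opcode = (ir >> 12) & 0x7          #     bits 14-12 are the opcode (assumed layout)
    ar = ir & 0x0FFF                   #     bits 11-0 are the address field
    if opcode != 0b001:
        raise NotImplementedError("this sketch only handles ADD")
    if indirect:
        ar = memory[ar] & 0x0FFF       # T3: AR <- M[AR], the effective address
    dr = memory[ar]                    # T4: DR <- M[AR]
    total = ac + dr                    # T5: AC <- AC + DR
    ac = total & 0xFFFF                #     keep the low 16 bits in AC
    e = total >> 16                    #     the carry out goes to E
    return {"PC": pc, "AR": ar, "IR": ir, "DR": dr, "AC": ac, "E": e}


memory = {0x5AF: 0x932E, 0x32E: 0x09AC, 0x9AC: 0x8B9F}
result = run_indirect_add(pc=0x5AF, ac=0x7EC3, memory=memory)
print({name: hex(value) for name, value in result.items()})
```

The sum 0x8B9F + 0x7EC3 is 0x10A62, which does not fit in 16 bits, so the accumulator keeps 0x0A62 and the carry bit E is set.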





http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Overall/steps.html

What are the Steps to Execute an Instruction?

© 2003 by Charles C. Lin. All rights reserved.

Six Steps

The main purpose of a CPU is to execute instructions. We've already seen some simple examples of instructions, i.e., add and addi.
The CPU executes the binary representation of the instructions, i.e., machine code.
Since programs can be very large, and since CPUs have limited memory, programs are stored in memory (RAM). However, a CPU does its processing on the CPU itself. So the CPU must copy each instruction from memory into the CPU, and once the instruction is in the CPU, the CPU can execute it.
The PC is used to determine which instruction is executed, and based on this execution, the PC is updated accordingly to the next instruction to be run.
Essentially, a CPU repeatedly fetches instructions and executes them.
The following is a summary of the six steps used to execute a single instruction.
·         Step 1: Fetch instruction
For some reason, the verb "fetch" is always used with instruction. We don't "get an instruction" or "retrieve an instruction". We "fetch an instruction".
To fetch an instruction involves the following steps:
·         CPU must place an address into the MAR.
·         CPU must activate the tri-state buffer so MAR contents are placed on the address bus.
·         CPU sends R/\W = 1 and CE = 1 to memory, to indicate it wants to do a read.
·         Memory eventually puts instruction on the data bus.
·         Memory sends ACK = 1.
·         CPU loads the instruction into the MDR.
·         CPU transfers instruction from MDR to IR.
·         CPU sets CE = 0 to indicate to memory that it's done fetching the instruction.
As you can see, the steps are rather involved. You can speed up this step if you assume instructions are in a fast instruction cache. For now, we won't assume that.
You should go back to the notes on memory if you have forgotten how it works, in particular, if you have forgotten the control signals used by memory.
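The read handshake listed under Step 1 can be sketched in Python. This is a rough software model, not hardware: the class `Memory`, its `drive` method, and the function `fetch_instruction` are hypothetical names, and the memory here answers immediately, whereas real memory raises ACK only after some delay.

```python
class Memory:
    """Toy memory that reacts to the CE and R/W control signals."""

    def __init__(self, contents):
        self.contents = contents
        self.data_bus = None
        self.ack = 0

    def drive(self, address_bus, rw, ce):
        if ce == 1 and rw == 1:
            # Read request: put the word on the data bus, then acknowledge.
            self.data_bus = self.contents[address_bus]
            self.ack = 1
        else:
            # Chip not enabled: release the data bus and drop ACK.
            self.data_bus = None
            self.ack = 0


def fetch_instruction(pc, memory):
    mar = pc                                 # place the address into the MAR
    address_bus = mar                        # tri-state buffer enabled: MAR drives the address bus
    memory.drive(address_bus, rw=1, ce=1)    # R/W = 1 and CE = 1 request a read
    if memory.ack != 1:                      # the CPU waits for ACK = 1
        raise RuntimeError("memory did not acknowledge the read")
    mdr = memory.data_bus                    # load the instruction into the MDR
    ir = mdr                                 # transfer the instruction from MDR to IR
    memory.drive(address_bus, rw=1, ce=0)    # CE = 0: done fetching
    return ir


ram = Memory({8: 0xABCD})
print(hex(fetch_instruction(8, ram)))
```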
·         Step 2: Decode instruction and Fetch Operands
In the second step, the bits used for the opcode (and function, for R-type instructions) are used to determine how the instruction should be executed. This is what is meant by "decoding" the instruction.
Recall that operands are arguments to the assembly instruction.
However, since R-type and I-type instructions both use registers, and those registers are in specific locations of the instruction, we can begin to fetch the values within the registers at the same time we are decoding.
In particular, we're going to do the following:
·         Get IR31-26, the opcode
·         Get IR25-21, which is $rs, the first source register.
·         Get IR20-16, which is $rt, the second source register.
·         Get IR15-11, which is $rd, the destination register.
·         Get IR15-0, the immediate value
·         Get IR5-0, the function code
You'll notice that we're extracting these bits directly from the instruction register.
You'll also notice that we extracted IR15-11 and IR15-0. How can we do both? Well, they're merely wires, so there's no reason you can't get both quantities out.
The key is to realize that sometimes we use IR15-11 and sometimes we use IR15-0. We need to have both of them ready because this is hardware. It's easier to have everything we need, and then figure out what we need, than to decide what we need and try to get it.
In particular, when we fetch the operands (i.e., the registers) we want to send the source and destination registers bits to a device called the register file.
For example, if IR25-21 has value 00111, this means we want register $r7 from the register file. We send 00111 to this circuit, and it returns the contents of that register to us.
We'll be discussing the register file soon.
If we are executing an I-type instruction, then typically, we'll sign-extend (or zero-extend, depending on the instruction) the immediate part (i.e., IR15-0) to 32 bits.
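The field extraction described in this step maps directly onto shifts and masks. The sketch below assumes the standard 32-bit MIPS field layout that the bit ranges above describe; the function names `decode_fields` and `sign_extend_16` are hypothetical, and the sample word is built by hand for add $r2, $r3, $r4 using the standard MIPS function code 0x20 for add.

```python
def decode_fields(ir):
    """Pull every field out of a 32-bit instruction word at once, as the hardware does."""
    return {
        "opcode": (ir >> 26) & 0x3F,   # IR31-26
        "rs": (ir >> 21) & 0x1F,       # IR25-21, first source register
        "rt": (ir >> 16) & 0x1F,       # IR20-16, second source register
        "rd": (ir >> 11) & 0x1F,       # IR15-11, destination register
        "imm": ir & 0xFFFF,            # IR15-0, immediate value
        "funct": ir & 0x3F,            # IR5-0, function code
    }


def sign_extend_16(imm):
    """Extend a 16-bit immediate to a 32-bit pattern by copying its sign bit."""
    return (imm | 0xFFFF0000) if imm & 0x8000 else imm


# add $r2, $r3, $r4: opcode 0, rs = 3, rt = 4, rd = 2, funct = 0x20
word = (3 << 21) | (4 << 16) | (2 << 11) | 0x20
fields = decode_fields(word)
print(fields)
print(hex(sign_extend_16(0xFFFF)))
```

Note that `rd` and `imm` overlap, just as IR15-11 and IR15-0 do: both are extracted every time, and the opcode decides later which one is used.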
·         Step 3: Perform ALU operation
The ALU has two 32-bit data inputs. It has a 32-bit output. The purpose of the ALU is to perform a computation on the two 32-bit data inputs, such as adding the two values. There are some control bits on the ALU. These control bits specify what the ALU should do.
For example, they may specify an addition, or a subtraction, or a bitwise AND.
Where do the input values of the ALU come from?
Recall that an instruction stores information about its operands. In particular, it encodes registers as 5-bit unsigned binary (UB) numbers. These register encodings are sent to the register file as inputs.
The register file then outputs the 32-bit values of these registers. These are then sent as inputs to the ALU.
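As a rough illustration of this data path, the sketch below models the register file as a list of 32 values and the ALU as a function of a control value and two inputs. The control names "add", "sub", and "and" are hypothetical stand-ins for the real control bits, and the register contents are made up.

```python
MASK32 = 0xFFFFFFFF  # results are kept to 32 bits


def alu(control, a, b):
    """Compute one ALU operation on two 32-bit inputs."""
    if control == "add":
        result = a + b
    elif control == "sub":
        result = a - b
    elif control == "and":
        result = a & b
    else:
        raise ValueError(f"unsupported ALU control: {control}")
    return result & MASK32


register_file = [0] * 32      # 32 registers, each selected by a 5-bit number
register_file[3] = 7          # made-up contents of $r3
register_file[4] = 5          # made-up contents of $r4

rs, rt = 0b00011, 0b00100     # 5-bit register encodings taken from the instruction
print(alu("add", register_file[rs], register_file[rt]))
```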
·         Step 4: Access memory
There are only two kinds of instructions that access memory: load and store.
load copies a value from memory to a register. store copies a register value to memory.
Any other instruction skips this step.
·         Step 5: Write back result to register file
At this point, the output of the ALU is written back to the register file. For example, if the instruction was: add $r2, $r3, $r4 then the result of adding the contents of $r3 to the contents of $r4 would be stored back into $r2.
The result could also be due to a load from memory.
Some instructions don't have results to store. For example, branch and jump instructions do not have any results to store.
·         Step 6: Update the PC
Finally, we need to update the program counter. Typically, we perform the following update:
PC <- PC + 4
Recall that PC holds the current address of the instruction to be executed. To update it means to set the value of this register to the next instruction to be executed.
Unless the instruction is a branch or jump, the next instruction to execute is the next instruction in memory. Since each instruction takes up 4 bytes of memory, the next address in memory is PC + 4, which is the address of the current instruction plus 4.
The PC might change to some other address if there is a branch or jump.
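The update rule can be written as a small function. This is a sketch only: the name `next_pc` and its parameters are hypothetical, and how the branch target is computed is left out because the text does not describe it here.

```python
def next_pc(pc, branch_taken=False, branch_target=None):
    """Return the address of the next instruction to execute."""
    if branch_taken:
        return branch_target   # a taken branch or jump replaces the PC
    return pc + 4              # otherwise move on by one 4-byte instruction


print(hex(next_pc(0x00400000)))
print(hex(next_pc(0x00400000, branch_taken=True, branch_target=0x00400100)))
```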
These are the six steps to executing an instruction. Not every instruction goes through every step. However, we label each step so that you can be aware they exist.
Some of these steps may not make much sense now, but hopefully, they'll be clearer once we start implementing the steps in depth.

Six Steps, In Summary

To make it easier to read, the six steps are listed below.
Step | Description
1 | Fetch Instruction from Memory
2 | Decode Instruction and Fetch Operands
3 | Perform ALU Operations
4 | Memory Access (for load/store)
5 | Store ALU result to register file
6 | Update PC





http://courses.cs.vt.edu/csonline/MachineArchitecture/Lessons/CPU/Lesson.html
The heart of a computer is the central processing unit or CPU. This device contains all the circuitry that the computer needs to manipulate data and execute instructions. The CPU is amazingly small given the immense amount of circuitry it contains. We have already seen that the circuits of a computer are made of gates. Gates, however, are also made of another tiny component called a transistor, and a modern CPU has millions and millions of transistors in its circuitry. The image to the right [Intel 2000] shows just how compact a CPU can be. The CPU is a Pentium III processor for mobile PCs.
The CPU is composed of five basic components: RAM, registers, buses, the ALU, and the Control Unit. Each of these components is pictured in the diagram below. The diagram shows a top view of a simple CPU with 16 bytes of RAM. To better understand the basic components of the CPU, we will consider each one in detail.
http://courses.cs.vt.edu/csonline/MachineArchitecture/Lessons/CPU/cpu_circuit.gif
·         RAM: this component is created from combining latches with a decoder. The latches create circuitry that can remember while the decoder creates a way for individual memory locations to be selected.
·         Registers: these components are special memory locations that can be accessed very fast. Three registers are shown: the Instruction Register (IR), the Program Counter (PC), and the Accumulator.
·         Buses: these components are the information highway for the CPU. Buses are bundles of tiny wires that carry data between components. The three most important buses are the address, the data, and the control buses.
·         ALU: this component is the number cruncher of the CPU. The Arithmetic / Logic Unit performs all the mathematical calculations of the CPU. It is composed of complex circuitry similar to the adder presented in the previous lesson. The ALU, however, can add, subtract, multiply, divide, and perform a host of other calculations on binary numbers.
·         Control Unit: this component is responsible for directing the flow of instructions and data within the CPU. The Control Unit is actually built of many other selection circuits such as decoders and multiplexors. In the diagram above, the Decoder and the Multiplexor compose the Control Unit.
In order for a CPU to accomplish meaningful work, it must have two inputs: instructions and data. Instructions tell the CPU what actions need to be performed on the data. We have already seen how data is represented in the computer, but how do we represent instructions? The answer is that we represent instructions with binary codes just like data. In fact, the CPU makes no distinction about whether it is storing instructions or data in RAM. This concept is called the stored-program concept. Brookshear [1997] explains:
"Early computing devices were not known for their flexibility, as the program that each device executed tended to be built into the control unit as a part of the machine...One approach used to gain flexibility in early electronic computers was to design the control units so they could be conveniently rewired. A breakthrough came with the realization that the program, just like data, can be coded and stored in main memory. If the control unit is designed to extract the program from memory, decode the instructions, and execute them, a computer's program can be changed merely by changing the contents of the computer's memory instead of rewiring the control unit. This stored-program concept has become the standard approach used today. To apply it, a machine is designed to recognize certain bit patterns as representing certain instructions. This collection of instructions along with the coding system is called the machine-language because it defines the means by which we communicate algorithms to the machine."
Thus both inputs to the CPU are stored in memory, and the CPU functions by following a cycle of fetching an instruction, decoding it, and executing it. This process is known as the fetch-decode-execute cycle. The cycle begins when an instruction is transferred from memory to the IR along the data bus. In the IR, the unique bit patterns that make up the machine-language are extracted and sent to the Decoder. This component is responsible for the second step of the cycle, that is, recognizing which operation the bit pattern represents and activating the correct circuitry to perform the operation. Sometimes this involves reading data from memory, storing data in memory, or activating the ALU to perform a mathematical operation. Once the operation is performed, the cycle begins again with the next instruction. The CPU always knows where to find the next instruction because the Program Counter holds the address of the current instruction. Each time an instruction is completed, the program counter is advanced by one memory location.
Each machine instruction is composed of two parts: the op-code and the operand. According to Brookshear [1997], "the bit pattern appearing in the op-code field indicates which of the elementary operations, such as STORE or JUMP, is requested by the instruction. The bit patterns found in the operand field provide more detailed information about the operation specified by the op-code. For example, in the case of a STORE operation, the information in the operand field indicates which register contains the data to be stored and which memory cell is to receive the data." The image to the right shows the format of an instruction for our CPU. The first three bits represent the op-code and the final six bits represent the operand. The middle bit distinguishes between operands that are memory addresses and operands that are numbers. When the bit is set to '1', the operand represents a number. A simple set of machine instructions for our CPU is listed in the table below. Notice that all the op-codes are given an English mnemonic to simplify programming. Together these mnemonics are called an assembly language. Programs written in assembly language must be converted to their binary representation before the CPU can understand them. This is usually done by another program called an assembler, hence the name.
http://courses.cs.vt.edu/csonline/MachineArchitecture/Lessons/CPU/instruction_format.gif

Op-code | Mnemonic | Function | Example
001 | LOAD | Load the value of the operand into the Accumulator | LOAD 10
010 | STORE | Store the value of the Accumulator at the address specified by the operand | STORE 8
011 | ADD | Add the value of the operand to the Accumulator | ADD #5
100 | SUB | Subtract the value of the operand from the Accumulator | SUB #1
101 | EQUAL | If the value of the operand equals the value of the Accumulator, skip the next instruction | EQUAL #20
110 | JUMP | Jump to a specified instruction by setting the Program Counter to the value of the operand | JUMP 6
111 | HALT | Stop execution | HALT
A simple machine language
In the machine language above, notice that some of the operands include a # symbol. This symbol tells the CPU that the operand represents a number rather than a memory address. Thus, when the assembler translates an instruction with a # symbol, the resulting machine code will have a '1' in the position of the number bit. Also notice the central role that the Accumulator register plays. Nearly all the operations affect the value of this register since the Accumulator acts as a temporary memory location for storing calculations in progress. With our machine language defined, we are ready to take a look at some simple programs.
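The translation an assembler performs for this machine language can be sketched in a few lines of Python. The function name `assemble` and the choice to encode a missing operand (as in HALT) as zero are assumptions; the op-codes and the 3-bit, 1-bit, 6-bit layout come from the table and instruction format above.

```python
OPCODES = {
    "LOAD": 0b001, "STORE": 0b010, "ADD": 0b011, "SUB": 0b100,
    "EQUAL": 0b101, "JUMP": 0b110, "HALT": 0b111,
}


def assemble(line):
    """Translate one line of assembly into its 10-bit machine code, shown as text."""
    parts = line.split()
    mnemonic = parts[0]
    operand_text = parts[1] if len(parts) > 1 else "0"   # HALT has no operand
    number_bit = 1 if operand_text.startswith("#") else 0
    operand = int(operand_text.lstrip("#"))
    if not 0 <= operand < 64:
        raise ValueError("the operand must fit in six bits")
    return f"{OPCODES[mnemonic]:03b} {number_bit} {operand:06b}"


for source_line in ["LOAD #2", "STORE 13", "HALT"]:
    print(source_line, "->", assemble(source_line))
```

The spaces in the output only separate the op-code, the number bit, and the operand for readability; the machine itself sees ten consecutive bits.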
The first program is called Sum. This program adds the numbers stored in two memory locations. Mathematically, this program represents the formulas x = 2, y = 5, x + y = z, where the variables x, y, and z correspond to the memory locations 13, 14, and 15 respectively. The instructions for the program are listed below.
# | Machine code | Assembly code | Description
0 | 001 1 000010 | LOAD #2 | Load the value 2 into the Accumulator
1 | 010 0 001101 | STORE 13 | Store the value of the Accumulator in memory location 13
2 | 001 1 000101 | LOAD #5 | Load the value 5 into the Accumulator
3 | 010 0 001110 | STORE 14 | Store the value of the Accumulator in memory location 14
4 | 001 0 001101 | LOAD 13 | Load the value of memory location 13 into the Accumulator
5 | 011 0 001110 | ADD 14 | Add the value of memory location 14 to the Accumulator
6 | 010 0 001111 | STORE 15 | Store the value of the Accumulator in memory location 15
7 | 111 0 000000 | HALT | Stop execution
Sum program
The second program is called Count. This program counts up to a number specified by the programmer in the first instruction. Notice that this program incorporates a loop construction by using the JUMP and EQUAL instructions. Every time the value in the Accumulator is incremented, the count is tested to see if it has reached the specified amount.
# | Machine code | Assembly code | Description
0 | 001 1 000101 | LOAD #5 | These two operations set the count value to five
1 | 010 0 001111 | STORE 15 |
2 | 001 1 000000 | LOAD #0 | Initialize the count to zero
3 | 101 0 001111 | EQUAL 15 | Test to see if count is complete; if yes, skip next instruction and go to instruction 5; if no, go to next instruction
4 | 110 1 000110 | JUMP #6 | Set Program Counter to 6
5 | 111 0 000000 | HALT | Stop execution
6 | 011 1 000001 | ADD #1 | Increment the count in the Accumulator
7 | 110 1 000011 | JUMP #3 | Set Program Counter to 3
Count program
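Both programs can be traced with a small fetch-decode-execute simulator. This is a sketch under stated assumptions rather than a model of the actual course CPU: the function name `run` and the step limit are hypothetical, memory is taken to be 16 cells shared by program and data (following the stored-program idea), JUMP is assumed to use the operand field directly as the target, and the Accumulator is an unbounded Python integer.

```python
SUM = ["001 1 000010", "010 0 001101", "001 1 000101", "010 0 001110",
       "001 0 001101", "011 0 001110", "010 0 001111", "111 0 000000"]

COUNT = ["001 1 000101", "010 0 001111", "001 1 000000", "101 0 001111",
         "110 1 000110", "111 0 000000", "011 1 000001", "110 1 000011"]


def run(program, max_steps=1000):
    """Load the program at address 0 and run it until HALT; return (accumulator, memory)."""
    memory = [int(word.replace(" ", ""), 2) for word in program]
    memory += [0] * (16 - len(memory))
    pc, acc = 0, 0
    for _ in range(max_steps):
        ir = memory[pc]                  # fetch the instruction into the IR
        pc += 1                          # advance the Program Counter
        opcode = ir >> 7                 # decode: first three bits
        is_number = (ir >> 6) & 1        # middle bit: 1 means the operand is a number
        operand = ir & 0b111111          # final six bits
        if opcode == 0b111:              # HALT
            break
        if opcode == 0b010:              # STORE: the operand is always an address
            memory[operand] = acc
        elif opcode == 0b110:            # JUMP: operand used directly (assumption)
            pc = operand
        else:
            value = operand if is_number else memory[operand]
            if opcode == 0b001:          # LOAD
                acc = value
            elif opcode == 0b011:        # ADD
                acc += value
            elif opcode == 0b100:        # SUB
                acc -= value
            elif opcode == 0b101:        # EQUAL: skip the next instruction on a match
                if value == acc:
                    pc += 1
            else:
                raise ValueError(f"unknown op-code {opcode:03b}")
    else:
        raise RuntimeError("step limit reached without HALT")
    return acc, memory


sum_acc, sum_memory = run(SUM)
count_acc, count_memory = run(COUNT)
print(sum_memory[13:16], count_acc)
```

For Sum, memory locations 13, 14, and 15 end up holding x, y, and z. For Count, the loop through instructions 3, 4, 6, and 7 repeats until the Accumulator matches the value stored at location 15, at which point EQUAL skips the JUMP and execution reaches HALT.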
References
·         Brookshear, J. G. (1997), Computer Science: An Overview, Fifth Edition, Addison-Wesley, Reading, MA.
·         Intel (2000), "Virtual press kit for 0.18 micron processor launch," http://developer.intel.com/pressroom/kits/events/18micron/photos.htm.


Instruction Execution Cycle


Diagram showing the basics of the instruction execution cycle. Each instruction is fetched from memory, decoded, and then executed.
Once a program is in memory it has to be executed. To do this, each instruction must be looked at, decoded and acted upon in turn until the program is completed. This is achieved by the use of what is termed the 'instruction execution cycle', which is the cycle by which each instruction in turn is processed. However, to ensure that the execution proceeds smoothly, it is also necessary to synchronise the activities of the processor.
To keep the events synchronised, the clock located within the CPU control unit is used. This produces regular pulses on the system bus at a specific frequency, so that each pulse is an equal time following the last. This clock pulse frequency is linked to the clock speed of the processor - the higher the clock speed, the shorter the time between pulses. Actions only occur when a pulse is detected, so that commands can be kept in time with each other across the whole computer unit.
The instruction execution cycle can be clearly divided into three different parts, which will now be looked at in more detail.
Fetch Cycle
The fetch cycle takes the instruction from the memory address held in the program counter, stores it in the instruction register, and moves the program counter on by one so that it points to the next instruction.
Decode Cycle
Here, the control unit checks the instruction that is now stored within the instruction register. It determines which opcode and addressing mode have been used, and as such what actions need to be carried out in order to execute the instruction in question.
Execute Cycle
The actual actions which occur during the execute cycle of an instruction depend on both the instruction itself, and the addressing mode specified to be used to access the data that may be required. However, four main groups of actions do exist, which are discussed in full later on.

http://www.eastaughs.fsnet.co.uk/cpu/execution-cycle.htm
