Computer Organization-Revised For Slau

COMPUTER ORGANIZATION
OUTLINE
1. History of computers
2. Languages and language levels
3. Evolution of computer architecture
4. Processors
5. CPU organization
6. Memory classification
7. Micro-architecture level
8. Improving performance
9. Instruction set
10. Addressing modes
11. Flow control

THE HISTORY OF COMPUTERS
The history of computers reflects man's attempt to create a machine that can store his work and memories and reproduce them on command. In the beginning computers were used for commercial purposes, but they later moved into industry for production and into scientific research.

The Early Computers
Eight early computing devices and inventors are covered here, namely:
- Abacus
- Napier's bones
- Slide rule
- Blaise Pascal
- Leibniz
- Jacquard's loom
- Babbage
- Hollerith

The abacus
The invention of computing devices started thousands of years ago with the abacus. It was first used in China and later moved to Egypt, to Greece and then to Rome. The abacus consisted of a rectangular wooden rack fitted with horizontal wires which had beads strung on them. When these beads were moved according to certain rules, all regular arithmetic calculations could be made.

Napier's bones
In 1614 John Napier, a mathematician from Scotland, developed a new set of logarithms. He also devised a calculating aid based on multiplication tables carved on ivory slides. Napier's bones consisted of bones with numbers painted on them. One could multiply or divide by sliding the bones, and this gave birth to the slide rule.

Prepared by Dr. Mike Conrad 2012
Slide rule
Around 1620 William Oughtred, a British mathematician, invented the first analog computer, which was referred to as the slide rule. This device could multiply and divide by adding and subtracting lengths on its scales. The slide rule was made up of two sets of scales marked logarithmically, so that the distance between numbers got shorter as the numbers got larger. Results were read off by sliding its cursor.
Blaise Pascal
Pascal got the idea of inventing this type of computer as a way of helping his father, who was a tax collector, to compute totals. This computer was the first mechanical adding machine. It was built in 1642. The numbers had to be entered into it using dials, and it could then add or subtract as instructed. It had toothed wheels numbered from 0 to 9. When a wheel completed a turn it passed a small notch and caused the next wheel to rotate automatically.
Leibniz
In 1671 Gottfried Wilhelm Leibniz designed a computer that was later built in 1694. It could add, subtract, multiply and compute square roots. This machine used a shift mechanism in a base ten system. Multiplying a number by ten shifted it one place to the left, while dividing a number by ten shifted it one place to the right.

Jacquard's loom
Joseph Jacquard developed a mechanism for controlling complex weaving patterns. He used a system of punched cards with holes. The cards stored the instructions using the hole and no-hole control method.

Babbage
In 1834 the British mathematician Charles Babbage had a vision of combining mechanical computation and program storage in one machine called the analytical engine. It had five logical units that had similarities to present day computers.
1. The store: This unit held all the variables to be operated upon and the quantities (output) which came out of the operation.
2. The mill: It performed the arithmetic computations. It was similar to the arithmetic unit of the CPU in modern computers.
3. The control: It contained the sequence of operations to be carried out. This function was operated by punched cards, which contained the program or instructions for executing the task at hand.
4. The input: This was the system used to feed data into the computer.
5. The output: It was used to retrieve the results of the operation.

Note:
- The analytical engine used punched cards.
- It used modern day computing principles such as cycles and loops to repeat actions.
- It had input and output devices that were operated by means of number, directive and operational cards.
- It also had a facility for evaluating operations and carrying out conditional transfers.
Hollerith
Dr. Herman Hollerith worked on methods to convert information from punched cards into electrical impulses which could then activate mechanical counters. While at the Massachusetts Institute of Technology he achieved this and developed a machine which used punched cards to store and tabulate information. He called this new invention the Hollerith electrical tabulating system. It was used in the 1890 census, when Hollerith was a statistician for the US Census Bureau. In the previous census it took the bureau seven years to tabulate the results, but with Hollerith's system the 1890 census was completed in a much shorter time. Hollerith established a company to market his invention, the Tabulating Machine Company, which through a later merger became International Business Machines (IBM), the modern day computer manufacturer.
COMPUTER GENERATIONS
Each generation is characterized by dramatic improvements in the technology used to build the computer, the internal organization of the computer, and programming languages.

(a) Zeroth generation (1642-1945): Blaise Pascal built a working calculating machine that could do only subtraction and addition. This was in France. Thirty years later Baron Gottfried Wilhelm von Leibniz built a mechanical machine that could multiply and divide as well. This was in Germany.

(b) First generation (1945-1955): These machines used vacuum tubes for internal operations, which were very bulky and caused tremendous heating problems. In this generation programming was done in machine language. The design of these machines is associated with John von Neumann. Machine language is the fundamental language of a computer and is normally written as strings of the binary digits 1 and 0. This is the only language that a computer understands directly.

Advantage
- It has fast execution speed

Disadvantages
- It is machine dependent
- It is difficult to program
- It is difficult to modify

Note: this generation had limited primary memory and used punched cards for input and output.

LANGUAGE AND LANGUAGE LEVEL
A program is a set of instructions designed to perform a particular task with a computer. A programming language is a language used to write a program. These languages are divided into 3 categories: machine language, assembly language and high level language.
HIGH LEVEL LANGUAGES (HLL)
These are a set of languages that have been devised to be close to natural language. A program expressed in a HLL has to be translated into machine language before the computer can perform the operations specified by it. Examples include C, C++, Java, COBOL, Pascal, Visual Basic and so on.

ADVANTAGES OF HIGH LEVEL LANGUAGES
+ Machine independent
+ Easier to learn
+ Easier to maintain

MACHINE LANGUAGE
This is the fundamental language of a computer and is normally written as strings of binary 1s and 0s. This is the only language understood directly by the computer.

ADVANTAGES
- Execution speed is fast

DISADVANTAGES
- Machine dependent
- Difficult to program
- Difficult to modify

ASSEMBLY LANGUAGE
Assembly languages are low level languages that sit between machine language and high level languages.

COMPILERS AND INTERPRETERS
A compiler is a translator which translates a program written by a user in a HLL into machine instructions understood by a computer. It translates the program as a whole. An interpreter also translates a program written by a user in a HLL into machine instructions understood by a computer, but it translates one line at a time into machine instructions which are immediately executed.

SIX LEVEL COMPUTERS
Level 5: Problem oriented language level
    (translated by a compiler)
Level 4: Assembly language level
    (translated by an assembler)
Level 3: Operating system machine level
    (supported by the operating system)
Level 2: Instruction set architecture level
    (interpretation)
Level 1: Micro-architecture level
    (hardware)
Level 0: Digital logic level
DIGITAL LOGIC LEVEL
This level consists of gates, which are combined to form registers that can hold binary data.

MICRO ARCHITECTURE LEVEL
It is a collection of registers and a circuit called the ALU. The ALU is capable of performing simple arithmetic operations. The registers are connected to the ALU to form a data path through which data flows.

LEVEL 2
It describes the machine's instruction set.

LEVEL 3
The operating system acts at this level. It provides an interface between HLL programs and the hardware.

LEVEL 4
This is the assembly language level. It enables people to write programs in assembly language, and the programs are translated by a program called an assembler.

LEVEL 5
Consists of languages designed to be used by application programmers. Such languages are called high level languages. Programs written in a HLL are generally translated to level 3 or 4 by translators known as compilers or interpreters.

EVOLUTION OF COMPUTER ARCHITECTURE
i. ZEROTH GENERATION (1642 - 1945)
Blaise Pascal built a working calculating machine that could do only subtraction and addition. This was in France. Thirty years later Baron Gottfried Wilhelm von Leibniz built a mechanical machine that could multiply and divide as well. This was in Germany.

ii. FIRST GENERATION (1945 - 1955)
These machines used vacuum tubes, which were very bulky and caused tremendous heating problems. In this generation programming was done in machine language. The design of these machines is associated with John von Neumann.
VON NEUMANN MACHINES
[Figure: the von Neumann machine, showing memory, control unit, ALU with accumulator, input and output]

MEMORY
This consisted of 4096 words, each word holding 40 bits, each bit a 0 or a 1.

ALU
This performs mathematical calculations.

ACCUMULATOR
Instructions could add a word of memory to the accumulator or store the contents of the accumulator in memory. It was used for exchange purposes. These machines did not have floating point arithmetic.

EXAMPLES OF 1ST GENERATION
i. ENIAC - Electronic Numerical Integrator And Computer
ii. EDVAC - Electronic Discrete Variable Automatic Computer

2ND GENERATION (1955 - 1965)
These used transistors, processing time was improved, and programming was done using assembly language.

[Figure: the omnibus, a single bus connecting the CPU, memory, console terminal and other I/O]

The machine could read and write magnetic tapes as well as print output. Examples include the IBM 1400 series, 7094, 660.

3RD GENERATION (1965 - 1980)
They used integrated circuits, and improved auxiliary storage devices like magnetic tapes and magnetic disks were introduced. Programming was done in high level languages like FORTRAN, COBOL, C and C++. Computers were capable of multiprogramming (having several programs in memory at once so that when one program is waiting for I/O to complete, another program can execute). Examples are the IBM 360 series.

4TH GENERATION (1980 - ?)
They use chips with very large scale integration, and several other features were introduced on the software side. Operating systems became more powerful and provided multiprogramming, multiprocessing and data communication facilities. A wide variety of specialized input/output devices were introduced to cater for the specific needs of users.
THE PROCESSOR TYPES
INTEL 80486 (1989-1994)
The 80486 DX is a 32 bit microprocessor with 1.2 million transistors and 29 registers. The 486 DX provides full 32 bit addressing for access to 4 GB of physical RAM and up to 64 TB of virtual memory. It offers twice the performance of the 386 DX. It uses pipelining and also adds 8 KB of cache memory right on the chip. If the needed piece of data is in the cache, the CPU need not waste time fetching it from main memory. It also includes the floating point unit in the CPU itself rather than requiring a separate co-processor chip, although this is not true for all members of the 486 family. The 486 DX is offered in 5 volt and 3 volt versions, the 3 V version being for laptops and other low power mobile computing applications. It is upgradeable, allowing a future CPU using a faster internal clock to be inserted into the existing system rather than purchasing a new one. Intel dubbed this OverDrive technology. Since not all 486 systems can be upgraded, the CPU socket on the motherboard itself must be designed specifically to accept an OverDrive CPU.

In 1991 the 80486SX and 80486DX/50 were released. They both offered 32 bit addressing, a 32 bit data path and 8 KB of on-chip cache memory. The 486SX takes a small step backward from the 486DX by removing the math co-processor and offering slower versions at 16, 20, 25 and 33 MHz. The family was subsequently improved and included more chips like the 80486 DX2/66. The 2 indicates that the chip uses an internal clock that is double the frequency of the system clock; 66 is the internal speed of the CPU in MHz.

PENTIUM (1993-1998)
The 486 series had become well entrenched in everyday desktop computing by 1992, but thereafter Intel changed to the Pentium name due to legal conflicts regarding trademarks. In 1993 the 3.21 million transistor Pentium processor was introduced (dubbed the P5 or P54 series). It retained the 32 bit address bus width of the 486 family. With 32 bits it can directly address 4 GB of RAM and can access up to 64 TB of virtual memory. The 64 bit external data bus can handle twice the data throughput of the 486s. At 60 MHz the Pentium performs at 100 MIPS, and 66 MHz yields 111.6 MIPS (twice the processing power of the 486DX/66).
All versions of the Pentium include an onboard math coprocessor and are intended to be compatible with future OverDrive designs. The original Pentium uses two 8 KB caches, one for instructions and another for data (16 KB total). A dual pipeline technique allows the Pentium to work on more than one instruction per clock cycle. It also includes on-board power management features (similar to the 486 SL line), allowing it to be used effectively in portable computers. Early Pentiums ran at 5 volts, but all models starting at approximately 100 MHz (P54C) use 3.3 V or less. Finally, the Pentium is fully backward compatible with all software written for the 8086/8088 and later CPUs. Intel has released versions of the Pentium up to 200 MHz. Faster versions are unlikely because of more powerful processors such as the Pentium MMX, Pentium Pro, Pentium II/III (and the Pentium 4 line).

Pentium versions have proliferated greatly, but the S-specification rating (the engineering revision level) marked on each Pentium or Pentium MMX processor reveals the key operating characteristics of a particular CPU, so you can tell whether the motherboard is configured properly for a given CPU. The keys include:
1. ES - engineering sample
2. DP - dual processor
3. Mobile - meant for mobile operations
4. VR - voltage reduced
5. VRE - CPU uses 3.4 V to 3.6 V
6. VRT - CPU uses split voltage (2.8 V / 3.3 V)
7. STD - standard part using normal timing and 3.135 V to 3.6 V
PENTIUM PRO
The earlier versions could handle the workloads of their time, but there came a need to cater for emerging operating systems and business systems. The Pentium Pro (dubbed the P6) evolved as a processor optimized for high end desktop workstations and network servers. Pentium Pro processors range from 150 MHz to 200 MHz and can handle multiprocessing in systems of up to 4 CPUs. The Pentium Pro uses dynamic execution to improve its performance and employs 2 separate 8 KB level 1 (L1) caches. It uses up to 1 MB of onboard level 2 (L2) cache. This maximizes the Pentium Pro's performance without relying on the motherboard to supply L2 cache. By using S-specifications you can tell how to configure a Pentium Pro.

Note
1. The L2 cache revision refers to the silicon revision of the 256 KB or 512 KB on-chip L2 cache.
2. These are engineering samples only.
3. The VID pins are functional but not tested on these parts.
4. VID pins are not supported on these parts.
5. These sample parts are equipped with a pre-production 512 KB L2 cache.
Current = 11.9 A, primary voltage = 3.5 V + 5%, etc.

Note that dynamic execution uses multiple branch prediction to predict the flow of the program through several branches (accelerating the flow of work to the processor). A data flow analysis then creates an optimized (reordered) schedule of instructions by analyzing the relationships between instructions. Finally, speculative execution carries out the instructions speculatively (assuming the execution order to be correct) on the basis of this optimized schedule. Dynamic execution keeps the processor's superscalar execution engines busy and boosts overall performance.

PENTIUM II (1997 - current)
Generally the Pentium II is a combination of the best features of the Pentium Pro and the Pentium MMX. It also employs dynamic execution. Two Pentium II OverDrive processors have been produced for upgrading Pentium Pro (Socket 8) processors. One replaces the 150-180 MHz Pentium Pros (60 MHz bus speed) and provides a performance increase to 300 MHz. The other OverDrive replaces the 166-200 MHz (66 MHz bus speed) Pentium Pro processors and increases performance to 333 MHz. The integrated on-die L2 cache design of the Socket 8 package style also provides a performance increase by allowing the L2 cache to operate at full core speed. The OverDrives are generally rare today.
THE AMD CPU
Advanced Micro Devices (AMD) has become the single biggest competitor of Intel. It is known for providing well designed and highly compatible alternative processors to the PC industry. It is actually pushing a bit ahead in terms of processor performance and operating speeds in some benchmarks. The AMD 486 series was AMD's answer to Intel's 486 clock doubling and tripling OverDrive processors. They incorporate write back cache and enhanced power management features including 3 V operation, system management mode (SMM) and clock control (appealing for Energy Star compliant green desktop systems and portable PCs). It is available as the Am486DX4 at 75/100/120 MHz. Others include the Am5x86, K5 series, K6 series, K6-2 and K6-3, Athlon and Duron.

VIA/CYRIX CPUs
Cyrix was also competing against AMD. It released the 486SLC and later the 486DX4, and later still the Cyrix 5x86, which competed seriously with the AMD 5x86. Though it tried to establish itself behind Intel and AMD, it was unable to overcome the technology and performance gap that plagued some of its later processor offerings. In 1999 the Cyrix name, assets etc. were purchased by VIA Technologies. VIA also purchased Centaur (established in 1995), which gave VIA important access to the market. The development of the Centaur core formed the foundation for the actual public release of the Cyrix III processor.

The 6x86 series (1995-1999), dubbed the M1 with a later version called the M1R, was an answer to the Intel Pentium:
- Optimized for both 16 bit and 32 bit software
- Super 7 form factor
- Super pipelined integer unit and on-board floating point unit (FPU)
- 16 KB unified write back cache
- Uses P-rating
- 3 million transistors
ORGANIZATION OF A SIMPLE COMPUTER
[Figure: a CPU (control unit, arithmetic logic unit, registers) connected by a bus to main memory and to I/O devices such as a disk and a printer]

A digital computer consists of a system of interconnected processors, memories and I/O devices as seen above.

Central Processing Unit: This is the brain of the computer. Its function is to execute programs stored in main memory by fetching their instructions, examining them and then executing them one after the other. The components are connected by a bus. A bus is a collection of parallel wires for transmitting data and control signals. Buses can be external to the CPU, connecting it to the memory and I/O devices, and also internal to the CPU. The CPU is made of several distinct parts as listed below:
1. CONTROL UNIT: It is responsible for fetching instructions from main memory and determining their type.
2. ALU: It performs operations such as addition, subtraction, multiplication, division and Boolean operations.
3. MEMORY: The CPU contains a small, high speed memory used to store temporary results and certain control information. This memory is made of a number of registers, and the most important registers are:
   (i) Program counter (PC) - points to the next instruction to be fetched for execution.
   (ii) Instruction register (IR) - holds the instruction that is currently being executed.
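CPU ORGANIZATION
DATA PATH OF A TYPICAL VON NEUMANN CPU
[Figure: registers feed operands A and B over the ALU input bus into the ALU input registers; the ALU computes A+B into the ALU output register, which is written back to the registers]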
The data path consists of registers (typically 1 to 32), the ALU and several buses connecting the pieces. The registers feed into two ALU input registers labeled A and B. These registers hold the ALU inputs while the ALU is computing. The ALU itself performs addition, subtraction and other simple operations on its inputs, thus yielding a result in the output register. This output can be stored back into a register, and later on the register can be written into main memory.

INSTRUCTION EXECUTION
The CPU executes each instruction in a series of small steps as follows.
1. Fetch the next instruction from main memory into the instruction register.
2. Change the program counter to point to the following instruction.
3. Determine the type of instruction just fetched.
4. If the instruction uses a word in memory, determine where it is.
5. Fetch the word, if needed, into a CPU register.
6. Execute the instruction.
7. Go to step 1 to begin executing the next instruction.
The above sequence of steps is frequently referred to as the fetch-decode-execute cycle.

INSTRUCTION LEVEL PARALLELISM
This is the execution of two or more instructions at once. In order for the CPU to execute two or more instructions at once, it uses a method called PIPELINING. This involves dividing instruction execution into many parts, each one handled by a dedicated piece of hardware, all of which can run in parallel.

A FIVE STAGE PIPELINE
[Figure: S1 instruction fetch unit, S2 instruction decode unit, S3 operand fetch unit, S4 instruction execution unit, S5 write back unit]

The figure above illustrates a pipeline with 5 units (stages).
Stage 1. Fetches the instruction from memory and places it in a buffer until it is needed.
Stage 2. Decodes the instruction, determining its type and what operands it needs.
Stage 3. Locates and fetches the operands, either from registers or from memory.
Stage 4. Does the work of carrying out the instruction, typically by running the operands through the data path.
Stage 5. Writes the result back to the proper register.

SUPER SCALAR ARCHITECTURES
[Figure: a dual five-stage pipeline with a common instruction fetch unit (S1) feeding two pipelines, each with an instruction decode unit (S2), operand fetch unit (S3), instruction execution unit (S4) and write back unit (S5)]
The figure above shows a design of a dual pipeline CPU, where a single instruction fetch unit fetches pairs of instructions together and puts each one in its own pipeline. Pipelines were introduced in the Intel 486. It had one pipeline, while the Pentium had two pipelines of five stages each (as in the figure above).
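The benefit of pipelining can be sketched with a small timing model. The following is a minimal sketch, not a model of any real CPU: it assumes every stage takes exactly one clock cycle and that there are no stalls or dependencies between instructions. The function names and the `pipelines` parameter are hypothetical and chosen only for illustration.

```python
# Stage names follow the five-stage pipeline described in the text.
STAGES = [
    "S1 instruction fetch",
    "S2 instruction decode",
    "S3 operand fetch",
    "S4 instruction execution",
    "S5 write back",
]


def pipeline_schedule(num_instructions):
    """Map each clock cycle to the (instruction, stage) pairs active in it.

    Assumption: one cycle per stage and no stalls, so instruction i
    (counting from 1) occupies stage s (counting from 0) in cycle i + s.
    """
    schedule = {}
    for instr in range(1, num_instructions + 1):
        for stage_index, stage_name in enumerate(STAGES):
            cycle = instr + stage_index
            schedule.setdefault(cycle, []).append((instr, stage_name))
    return schedule


def total_cycles(num_instructions, pipelines=1):
    """Cycles needed to finish all instructions in an ideal pipeline.

    With `pipelines` parallel pipelines, that many instructions are
    issued together each cycle (assumed independent of each other).
    """
    if num_instructions == 0:
        return 0
    groups = -(-num_instructions // pipelines)  # ceiling division
    return len(STAGES) + groups - 1


def unpipelined_cycles(num_instructions):
    """Cycles needed if each instruction must finish before the next starts."""
    return num_instructions * len(STAGES)
```

Under these assumptions, four instructions need 20 cycles without pipelining, 8 cycles on a single five-stage pipeline and 6 cycles on a dual pipeline. Real processors fall short of these figures whenever one instruction depends on the result of another.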
PROCESSOR LEVEL PARALLELISM
Multiprocessors
[Figure: a single bus multiprocessor, with four CPUs and a shared memory attached to one bus]

A multiprocessor is a system with more than one CPU sharing a common memory. Since each CPU can read or write any part of memory, they must co-ordinate to avoid getting in each other's way. A multiprocessor can be implemented with a single bus that has multiple CPUs and one memory all plugged into it, as above. This multiprocessor may suffer conflicts when a large number of fast processors constantly try to access memory over the same bus. To reduce this contention and improve performance, the design below can be considered.
[Figure: a multiprocessor with four CPUs, each with its own local memory, and a shared memory attached to a common bus]

Each processor is given a local memory of its own, not accessible to the others. This memory can be used for program code and all those data items that need not be shared. Access to this private memory does not use the main bus, greatly reducing bus traffic.

MULTI COMPUTERS
These are systems that consist of a large number of inter-connected computers, each having its own private memory but no common memory. The CPUs in a multicomputer communicate by sending each other messages. Messages from one computer to another often must pass through one or more intermediate computers or switches to get from source to destination.

MEMORIES CLASSIFICATION
I. Primary memory (main memory: RAM and ROM)
II. Secondary memory

PRIMARY MEMORY
Memory is that part of a computer where programs and data are stored. It is the storage from which processors can read and write information. The basic unit of memory is the binary digit (bit). A bit may contain a 0 or a 1.

MEMORY ADDRESSES
Memories consist of a number of cells (locations), each of which can store a piece of information. Each cell has a number, called its address, by which programs can refer to it. All cells in a memory contain the same number of bits. The number of bits per cell for some historical commercial computers has included 1, 8, 12, 16, 18, 24, 27, 32, 36, 48, 60 and 64. An 8 bit cell is called a byte.

CACHE MEMORY
The processing speed of a CPU is much faster than the speed at which main memory is able to provide information to it. As a result, when the CPU issues a memory request it will not get the information it needs for many CPU cycles. A cache memory is a small, fast memory that is able to provide operands to the processor at the speed at which it can process them. The most heavily used words (data) are kept in the cache. When the CPU needs a word it first looks in the cache; only if the word is not there does the CPU go to main memory. The cache is logically between the CPU and main memory.
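The cache lookup described above can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `SimpleCache`, the tiny capacity and the least-recently-used replacement rule are illustrative choices, since the text does not say which words are evicted when the cache is full.

```python
from collections import OrderedDict


class SimpleCache:
    """A toy cache sitting logically between the CPU and main memory."""

    def __init__(self, main_memory, capacity=4):
        self.main_memory = main_memory  # list of words indexed by address
        self.capacity = capacity        # number of words the cache can hold
        self.lines = OrderedDict()      # address -> word, oldest use first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Return the word at `address`, looking in the cache first."""
        if address in self.lines:
            # Cache hit: no main memory access is needed.
            self.hits += 1
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]
        # Cache miss: only now does the CPU go to main memory.
        self.misses += 1
        word = self.main_memory[address]
        self.lines[address] = word
        if len(self.lines) > self.capacity:
            # Assumed policy: evict the least recently used word.
            self.lines.popitem(last=False)
        return word
```

Reading the same few addresses repeatedly produces mostly hits, which is why keeping the most heavily used words in the cache hides much of the main memory delay.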
[Figure: the CPU and cache connected over a bus to main memory]

SECONDARY MEMORY
This is the external memory used to store large amounts of data. Examples of secondary memory include floppy disks, CD-ROMs (compact disk ROM) and DVDs (digital versatile disks, used to record graphics and movies). Compact disks also come as CD-R (recordable) and CD-RW (rewritable). Recordable disks are written once and for all, i.e. the data cannot be deleted, while rewritable disks can be erased when you want to. CDs store data in the range of about 700 MB.

INPUT / OUTPUT TERMINALS
Computer terminals consist of the following parts: keyboard, monitor, mouse etc.

KEYBOARD
On PCs, when a key is depressed an interrupt handler (a piece of software that is part of the operating system) is started. The interrupt handler reads a hardware register inside the keyboard controller to get the number of the key that was just depressed. When a key is released a second interrupt is caused.
Cathode Ray Tube (CRT) MONITORS
[Figure: cross-section of a CRT, showing the electron gun, grid, vertical deflection plate, vacuum, screen and the spot on the screen]

A monitor is a box containing a CRT (cathode ray tube) and its power supplies. The CRT contains a gun that can shoot an electron beam against a fluorescent screen near the front of the tube as shown above. Color monitors have 3 electron guns, one each for red, green and blue. To produce a pattern of dots on the screen, a grid is present inside the CRT. When a positive voltage is applied to the grid, the electrons are accelerated, causing the beam to hit the screen and make it glow. When a negative voltage is applied to the grid, the electrons are repelled, so they do not pass through the grid and the screen does not glow.

PRINTERS
(i) MATRIX PRINTER
In a dot matrix printer the print head contains between seven and twenty four electromagnetically activatable needles and is scanned across each print line. A letter such as A can be printed using seven needles. The print quality can be increased by increasing the number of needles and having the dots overlap, for example by printing the letter A using 24 needles that produce overlapping dots.

Usually the print head makes multiple passes over each scan line in order to produce overlapping dots. Therefore higher quality printing goes hand in hand with slower printing rates. Matrix printers are cheap and highly reliable but slow, noisy and poor at graphics. They have 3 main uses in systems:
(i) They are popular for printing on large preprinted forms (filling in already typed transcripts by adding names and marks).
(ii) They are good at printing on small pieces of paper such as cash register receipts, ATM and credit card transaction slips, airline boarding passes etc.
(iii) They are used for printing on multi-part continuous forms with carbon paper embedded between the copies.

INK JET PRINTERS
Ink jet printers have a movable print head which holds an ink cartridge. The head is swept horizontally across the paper while ink is sprayed from its tiny nozzles. Inside each nozzle an ink droplet is electrically heated to the boiling point until it explodes. The only place the ink can go is out of the front of the nozzle onto the paper. The nozzle is then cooled and the resulting vacuum sucks in another droplet of ink. The speed of an ink jet printer is limited by how fast the boil-cool cycle can be repeated. Ink jet printers are low cost home printers, quiet and of good quality. However, they use expensive ink cartridges and produce ink-soaked output (a lot of ink goes into the paper and takes long to dry).

LASER PRINTER
A laser printer has a rotating precision drum. At the start of each page cycle it is charged up to 1000 volts and coated with a photosensitive material. The light from a laser is scanned along the length of the drum much like the electron beam in a CRT. Laser printers are expensive business printers, quiet, and produce good quality graphics. They also use expensive toner cartridges.

INSTRUCTION EXECUTION
Executing an instruction may require locating operands in memory, reading them and storing results back into memory.

INSTRUCTION FORMAT
A machine instruction has several fields. The instruction format is the size and arrangement of these fields. The two major fields of an instruction are:
(i) Opcode (operation code), which specifies the operation to be performed, and
(ii) Operand addresses, which specify the locations of the operands used.
On any given machine there may be several instruction lengths and an even greater variation in instruction formats.

INSTRUCTION ADDRESS FORMAT
The address format is the part of the instruction format which deals with specifying the addresses of the operands. Several different methods may be used, as follows.
(i) Three address format
Each instruction specifies the addresses of two operands and gives a further address for the result of the operation, as shown below:
Opcode | Address 1 | Address 2 | Address 3
(for example, with the opcode SUM)
(ii) Two address format
Each instruction specifies the addresses of the two operands; the result of the operation replaces one of the two operands. This is an improvement on the 3-address format in terms of storage space.
Opcode | Address 1 | Address 2
(iii) One and a half address format
One operand is held in a special register or accumulator, having first been fetched and placed there. The instruction specifies the address of the other operand and the number of the accumulator. This gives a further shortening of instructions, since fewer bits are needed to specify the accumulator than are needed to specify an operand address.
Opcode | Accumulator | Address
(iv) Between accumulators
Opcode | Accumulator | Accumulator
A one and a half address format may be used to fetch or load operands into the accumulators from main store, but thereafter instructions may just specify the numbers of the accumulators that are to be used for the operands and the result.

TYPES OF INSTRUCTIONS
There are many ways of grouping or classifying the members of an instruction set. The following is the basic classification of an instruction set:
i. Arithmetic and logic operations
ii. Transfer of control or branch instructions
iii. Load, fetch and store instructions
iv. Input/output instructions
v. Memory reference instructions
vi. Processor reference instructions
(i) ARITHMETIC AND LOGIC OPERATION These are instruction which use the operation of the ALU. They include the following arithmetic operations (+, -, *, ), incremental / decrement operations (++ and --), / increasing / decreasing content of an address/ accumulator by 1. Negation and complemeting (produce ones or twos complements). Arithmetic shift (moving bits in a register either left or right in order to multiply or divide). For example
Before shifting:                 0 0 1 1 0 0 1
After shifting bits to the left: 0 1 1 0 0 1 0
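The shift example above can be reproduced in a few lines of Python. This is only a sketch: the 7-bit register width and the starting value 0011001 are assumptions chosen so that the shifted result matches the pattern 0110010 shown above, and the function names are illustrative.

```python
def shift_left(value, width=7):
    """Shift one place left inside a fixed-width register; the top bit is lost."""
    mask = (1 << width) - 1
    return (value << 1) & mask


def arithmetic_shift_right(value, width=7):
    """Shift one place right, copying the sign (top) bit so the sign is preserved."""
    sign_bit = value & (1 << (width - 1))
    return (value >> 1) | sign_bit


before = 0b0011001            # 25 (assumed starting value)
after = shift_left(before)    # one left shift doubles the value
print(format(before, "07b"), "->", format(after, "07b"))
```

Shifting left by one place multiplies the value by two (25 becomes 50), and shifting right by one place divides it by two, provided no significant bit is pushed out of the register.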
LOGICAL OPERATIONS
Boolean operations (AND, OR).

2. TRANSFER OF CONTROL AND BRANCH INSTRUCTIONS
These are instructions which change the sequence in which instructions are obeyed. Their execution causes a jump to an instruction held elsewhere. The two main types of jump instruction are:
i) Unconditional transfer of control
These are instructions either to jump to a specific address or to skip the next instruction.
ii) Conditional transfer of control
These jumps occur only if the result of an operation has a particular value, e.g. jump to a specific address if the accumulator contains zero, or skip the next instruction if the accumulator contains zero.

3. LOAD, FETCH AND STORE
These instructions cause the transfer of data between an accumulator and memory, e.g. load the contents of a specific address into a specified accumulator.

4. INPUT/OUTPUT
These instructions implement the transfer of data between peripherals and memory, or between peripherals and accumulators, e.g. checking whether a terminal keyboard has data ready for transfer, or moving a character from the keyboard into the accumulator.

5. MEMORY REFERENCE
These are operations which require access to memory during their execution.

6. PROCESSOR REFERENCE
These do not require memory access and do not involve input/output (e.g. processor halt and reset).
INSTRUCTION EXECUTION
[Figure: the fetch-execute cycle, showing the status register, ALU, accumulator, control unit, SCR (with +1 increment), opcode decoder, CIR, MAR, MDR and main memory.]
Instructions are executed in two phases, fetching and executing. This process is called the fetch-execute cycle, and the diagram above illustrates it.

FIRST PHASE: FETCHING THE REQUIRED INSTRUCTION
The following steps are carried out in fetching the instruction to be executed:
(1) The memory address register (MAR) is loaded with the address of the instruction to be performed. This address is copied from the sequence control register (SCR).
(2) The SCR is then increased by 1 so that it is ready to be referenced for the next fetch.
(3) The fetch is completed by loading the instruction into the current instruction register (CIR) via the memory data register (MDR).

SECOND PHASE: EXECUTION OF THE INSTRUCTION
The opcode, or function part, of the instruction is decoded (interpreted) by the control unit. How the instruction is then executed depends upon the type of instruction.
i) INSTRUCTIONS THAT DO NOT REQUIRE MEMORY REFERENCES
The control unit sends command signals to the appropriate devices in the desired sequence until the execution is complete.
ii) INSTRUCTIONS FOR TRANSFER OF CONTROL
These are instructions that break the normal sequence of execution (e.g. if/else control). A transfer of control overwrites the SCR with the address to jump to; a conditional transfer instruction leads to the SCR being overwritten only if the specified condition is met. A special status register is altered by the ALU when a specific condition occurs, and the control unit uses the content of the status register to determine what control action is to be performed next.
iii) INSTRUCTIONS FOR LOAD AND STORE
The appropriate operand address is copied from the CIR into the MAR. Data then passes from the specified storage location into the accumulator via the MDR in the case of a load, or in the reverse direction in the case of a store.
iv) INSTRUCTIONS REQUIRING MEMORY REFERENCE
The appropriate address is copied from the CIR to the MAR. Data then passes into the ALU from the specified location via the MDR. The ALU receives the data and, for example, adds it to the content of the accumulator, which may then be replaced by the result.

IMAGINARY MACHINE LANGUAGE
Remembering the binary code of each instruction or function is troublesome. The following aids are often used by programmers:
i) Symbolic and mnemonic codes
Here, groups of letters or symbols are used by programmers when writing programs. These codes are usually mnemonic, which means that their sounds suggest their meaning. For example, LDA may stand for "load accumulator", and LDA N means load the accumulator with the content of address N.
ii) Binary codes
These may be converted to octal or hexadecimal to make them more manageable and easier to remember.
iii) A special program called an assembler
This may be provided to convert a program written in symbolic and mnemonic code into machine code. The mnemonic and symbolic program is called the source program for the assembler, and the machine code produced by the assembler is called the object program.
The number of bits in an instruction which are allowed for the function code determines the number of possible codes. If two bits are allowed, then four function codes are possible; in general, n bits allow 2^n function codes.
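The fetch-execute steps, the mnemonic codes and the two-bit function code described above can be tied together in a small simulation. The sketch below is a hypothetical imaginary machine, not any real instruction set: the mnemonics ADD, STA and HLT, the opcode values, the 4-bit address field and the sample data are all assumptions made for illustration (only LDA appears in the text).

```python
# Hypothetical 2-bit function codes: 2 bits give 2**2 = 4 possible codes.
OPCODES = {"LDA": 0b00, "ADD": 0b01, "STA": 0b10, "HLT": 0b11}
ADDRESS_BITS = 4  # assumed: 4-bit address field, so 16 memory words


def assemble(source):
    """Tiny assembler: turn (mnemonic, address) pairs into machine words."""
    return [(OPCODES[mnemonic] << ADDRESS_BITS) | address
            for mnemonic, address in source]


def run(memory):
    """Repeat the fetch-execute cycle until a HLT instruction is decoded."""
    scr = 0  # sequence control register: address of the next instruction
    acc = 0  # accumulator
    while True:
        # Fetch phase
        mar = scr              # (1) MAR is loaded from the SCR
        scr += 1               # (2) SCR is increased by 1
        mdr = memory[mar]      # (3) the instruction arrives in the MDR ...
        cir = mdr              #     ... and is copied into the CIR
        # Execute phase: decode the function code and the address part
        opcode = cir >> ADDRESS_BITS
        address = cir & ((1 << ADDRESS_BITS) - 1)
        if opcode == OPCODES["LDA"]:      # load: memory -> accumulator
            acc = memory[address]
        elif opcode == OPCODES["ADD"]:    # memory reference via the ALU
            acc = acc + memory[address]
        elif opcode == OPCODES["STA"]:    # store: accumulator -> memory
            memory[address] = acc
        else:                             # HLT: processor reference
            return memory


source_program = [("LDA", 13), ("ADD", 14), ("STA", 15), ("HLT", 0)]
memory = assemble(source_program) + [0] * 12   # 16 words in total
memory[13], memory[14] = 20, 22                # sample operands
final = run(memory)
print(final[15])
```

Here the list of mnemonic pairs plays the role of the source program and the list of integers returned by `assemble` plays the role of the object program.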
ADDRESSING MODES
While an instruction is executing, various methods may be used to locate the operands and their contents. The main methods of addressing are:
i. IMMEDIATE ADDRESSING
ii. REGISTER ADDRESSING
iii. REGISTER INDIRECT ADDRESSING
iv. INDEXED ADDRESSING

IMMEDIATE ADDRESSING
The address part of the instruction contains the operand itself rather than an address or other information describing where the operand is. Such an operand is called an immediate operand because it is automatically fetched from memory at the same time as the instruction itself; hence it is immediately available for use. This works only for constants, i.e. operands whose exact values are known when the program is written.

REGISTER ADDRESSING
The instruction names a register, rather than a memory location, that holds the operand. The compiler determines which variables will be accessed most often (for example a loop index) and puts these variables in registers instead of memory locations. Registers have fast access and short addresses. This addressing mode is the most common one on most computers.

REGISTER INDIRECT ADDRESSING
In this mode the operand comes from memory or goes to memory, but its address is not hardwired into the instruction, as it is in direct addressing. Instead the address is contained in a register. An address used in this way is called a pointer. The advantage of this addressing mode is that it can reference memory without paying the price of having a full memory address in the instruction.

INDEXED ADDRESSING
In this mode the required address is obtained by adding the contents of the address part of the instruction to the number stored in a special register called the index register. The need to do an addition makes this method slower, but once the address is calculated the operand can be accessed in one step.

INTERRUPTS
These are changes in the flow of control, caused by other modules (i.e. input/output and memory), that interrupt the normal processing of the CPU.
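Returning briefly to the four addressing modes described above, the difference between them is simply where the operand is looked up. The following sketch makes that concrete; the register names, memory contents and the `operand` function are hypothetical and exist only for illustration.

```python
memory = [0] * 16
memory[5] = 70                  # an operand stored at address 5
memory[8:11] = [10, 20, 30]     # a small table starting at address 8
registers = {"R1": 99, "R2": 5, "INDEX": 2}


def operand(mode, field):
    """Return the operand selected by the instruction's address field."""
    if mode == "immediate":
        return field                         # the field is the operand itself
    if mode == "register":
        return registers[field]              # the operand is in the named register
    if mode == "register_indirect":
        return memory[registers[field]]      # the register holds a pointer
    if mode == "indexed":
        # address part of the instruction + contents of the index register
        return memory[field + registers["INDEX"]]
    raise ValueError(f"unknown addressing mode: {mode}")


print(operand("immediate", 7))
print(operand("register", "R1"))
print(operand("register_indirect", "R2"))
print(operand("indexed", 8))
```

Only the last two modes touch memory, and only indexed addressing needs an addition before the access, which is why it is the slowest of the four.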
Interrupts are provided primarily as a way to improve processing efficiency. For example, most external devices are much slower than the processor. Suppose the processor is transferring data to a printer: after each write the processor has to pause and remain idle until the printer catches up. This pause may last for thousands of instruction cycles that could otherwise be used to execute other instructions. Clearly this is a wasteful use of the processor. With interrupts, the processor can engage itself in executing other instructions while an I/O operation is in progress.

TYPES OF INTERRUPTS
1. PROGRAM INTERRUPTS
These are generated by some condition that occurs as a result of an instruction execution, for example arithmetic overflow, division by zero, or a reference outside a user's allowed memory space.

2. TIMER INTERRUPTS
These are generated by the timer within the processor. They allow the operating system to perform certain functions on a regular basis (for example, scanning drives for viruses or cleaning disks at fixed intervals, such as every 7 days).

3. I/O INTERRUPTS
These are generated by an I/O controller to signal normal completion of an operation or to signal a variety of error conditions.

4. HARDWARE FAILURE INTERRUPTS
These are generated by failures such as a power failure or a memory error.

INTERRUPTS AND THE INSTRUCTION CYCLE
With interrupts the processor can be engaged in executing other instructions while an I/O operation is in progress. The user's program reaches a point at which it makes a system call in the form of a write call. The I/O program is then invoked to perform the write operation; after a few instructions have been executed by the CPU, control returns to the user program. Meanwhile the external device is busy accepting data from the computer's memory and printing it. The I/O operation is conducted concurrently with the execution of instructions in the user program. When the external device becomes ready to accept more data from the processor, the I/O module for that device sends an interrupt request signal to the processor. The processor responds by suspending operation of the current program, branching off to a program that services that particular I/O device, known as an interrupt handler, and resuming the original execution after the device is serviced. The user does not need to write any special code to accommodate interrupts. The processor and the operating system are responsible for suspending the user program and then resuming it at the same point.
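The suspend, service and resume behaviour described above can be sketched as a short simulation. This is a simplified model under stated assumptions: instructions are just names, interrupts are checked once after each user instruction, the handler runs to completion without being interrupted itself, and the function and parameter names are invented for this example.

```python
def run_with_interrupts(program, interrupt_after, handler):
    """Trace a user program whose execution is interrupted at given points.

    program         : list of instruction names in the user program
    interrupt_after : set of instruction counts after which a device
                      raises an interrupt request (an assumed input)
    handler         : list of instruction names in the interrupt handler
    """
    trace = []
    pc = 0          # program counter of the user program
    executed = 0    # number of user instructions executed so far
    while pc < len(program):
        trace.append(("user", program[pc]))     # fetch and execute
        pc += 1
        executed += 1
        if executed in interrupt_after:         # check for a pending interrupt
            saved_pc = pc                       # save where to resume
            for instruction in handler:         # service the device
                trace.append(("handler", instruction))
            pc = saved_pc                       # resume at the same point
    return trace


trace = run_with_interrupts(["A", "B", "C"], {2}, ["SERVICE"])
for entry in trace:
    print(entry)
```

The user program contains no code for the interrupt: the handler's instructions simply appear in the trace between B and C, and the program carries on from C afterwards.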
INSTRUCTION CYCLE WITH INTERRUPTS
[Figure: instruction cycle with interrupts. Start -> fetch cycle (fetch next instruction) -> execute cycle (execute instruction, or halt) -> interrupt cycle (check for interrupt; process interrupt) -> back to the fetch cycle.]
To accommodate interrupts, an interrupt cycle is added to the instruction cycle as shown above. In the interrupt cycle the processor checks whether any interrupt has occurred. This is indicated by the presence of an interrupt signal. If no interrupts are pending, the processor proceeds to the fetch cycle and fetches the next instruction of the current program. If an interrupt is pending, the processor does the following:
a) It suspends execution of the program currently being executed and saves its context. This means saving the address of the next instruction to be executed and any other data relevant to the processor's current activity (i.e. the current program is incomplete, so the processor will have to come back and complete it).
b) It sets the program counter to the starting address of an interrupt handler routine (IHR).
The processor now proceeds to the fetch cycle and fetches the first instruction of the interrupt handler program, which will service the interrupt. The interrupt handler program is generally part of the operating system. Typically this program determines the nature of the interrupt and performs whatever actions are needed.

INPUT/OUTPUT FUNCTIONS
An I/O module can exchange data directly with the CPU. Just as the CPU can initiate a read or write operation with memory, designating the address of a specific location, the CPU can also read data from or write data to an I/O module. In this case the CPU identifies a specific device that is controlled by a particular I/O module, and the instruction sequence is executed directly by the CPU. In some cases the I/O module exchanges data directly with memory. In such a case the CPU is not involved in the transfer itself, so that I/O transfers occur without tying up the CPU. During such a transfer the I/O module issues read/write commands to memory, relieving the CPU of responsibility for the exchange. This operation is known as Direct Memory Access (DMA).

INTERCONNECTION STRUCTURES
[Figure: computer modules. Memory (N words, addresses 0 to N-1): read, write, address and data in; data out. I/O module (M ports): internal data and external data in and out; interrupt signal out. CPU: instructions, data and interrupt signals in; control signals and data out.]
* Internal data: data within the computer system, e.g. from the CPU, memory or I/O.
* External data: data from outside the computer system, e.g. e-mail, news.
A computer consists of a set of components or modules of three basic types, i.e. CPU, memory and I/O, that communicate with each other. In effect a computer is a network of basic modules, so there must be paths connecting the modules together. The collection of paths connecting the various modules is called the interconnection structure. The design of this structure depends on the exchanges that must be made between modules. The figure above suggests the types of exchange needed by indicating the major forms of input and output for each module type.

MEMORY
Typically a memory consists of N words of equal length. Each word is assigned a unique numerical address (0, 1, ..., N-1). A word of data can be read from or written into memory. The nature of the operation is indicated by read and write control signals. The location for the operation is specified by an address.

I/O MODULE
From an internal (to the computer system) point of view, I/O is functionally similar to memory. There are two operations, read and write. Further, an I/O module may control more than one external device. We can refer to each of the interfaces to an external device as a port and give each port a unique address, for example 0, 1, ..., M-1. In addition, there are external data paths for the input and output of data with an external device. Also, an I/O module may be able to send interrupt signals to the CPU.

THE CENTRAL PROCESSING UNIT (CPU)
The CPU reads in instructions and data, writes out data after processing, and uses control signals to control the overall operation of the system. It also receives interrupt signals.

In general the interconnection structure must support the following types of transfer:
a) Memory to CPU: the CPU reads an instruction or a unit of data from memory.
b) CPU to memory: the CPU writes a unit of data to memory.
c) I/O to or from memory: an I/O module is allowed to exchange data directly with memory, without going through the CPU, using Direct Memory Access.
BUS INTERCONNECTION
A bus is a communication pathway connecting two or more devices. A key characteristic of a bus is that it is a shared transmission medium. Multiple devices connect to the bus, and a signal transmitted by any one device is available for reception by all other devices attached to the bus. If two devices transmit during the same period, their signals will overlap and become distorted; thus only one device can successfully transmit at a time. In many cases a bus actually consists of multiple communication pathways (lines). Each line is capable of transmitting signals representing binary 1 and 0. Over time, a sequence of bits can be transmitted across a single line. Taken together, several lines of a bus can be used to transmit bits simultaneously (in parallel). For example, an 8-bit unit of data can be transmitted over eight bus lines. Computer systems contain a number of different buses that provide pathways between components at various levels of the computer system. The most common interconnection structures are based on the use of one or more system buses.

BUS STRUCTURES
A system bus typically consists of 50 to 100 separate lines. Each line is assigned a particular meaning or function. Although there are many different bus designs, on any bus the lines can be classified into three functional groups: data, address and control lines. In addition there may be power distribution lines that supply power to the attached modules.
[Figure: bus interconnection scheme. The CPU, memory modules and I/O modules are all attached to the control bus, the address bus and the data bus.]

THE DATA LINES
These provide a path for moving data between system modules. Collectively these lines are called the data bus. The data bus typically consists of 8, 16 or 32 separate lines, the number of lines being referred to as the width of the bus. The number of lines determines how many bits can be transferred at a time and is thus a key factor in overall system performance. For example, if the data bus is 8 bits wide and each instruction is 16 bits long, then the CPU must access the memory module twice during each instruction cycle.

The operation of the bus is as follows. If one module wishes to send data to another, it must do two things:
i) Obtain the use of the bus
ii) Transfer the data via the bus
If one module wishes to request data from another module, it must:
i) Obtain the use of the bus
ii) Transfer a request to the other module over the appropriate control and address lines
It must then wait for the second module to send the data.

THE ADDRESS LINES
These are used to designate (locate) the source or destination of the data on the data bus. For example, if the CPU wishes to read a word of data from memory, it puts the address of the desired word on the address lines. The width of the address bus determines the maximum possible memory capacity of the system. The address bus is unidirectional: addresses travel in one direction only, typically from the CPU to memory and the I/O modules.

THE CONTROL LINES
These are used to control access to, and use of, the data and address lines. Since the data and address lines are shared by all components, there must be a means of controlling their use. Control signals transmit both command and timing information between system modules. Timing signals indicate the validity of data and address information. Command signals specify operations to be performed. Typical control lines include:
i) Memory write: causes data on the bus to be written into the addressed location.
ii) Memory read: causes data from the addressed location to be placed on the bus.
iii) I/O read: causes data from the addressed I/O port to be placed on the bus.
iv) I/O write: causes data on the bus to be output to the addressed I/O port.
v) Transfer ACK: indicates that data has been accepted from or placed on the bus.
vi) Bus request: indicates that a module needs to gain access to the bus.
vii) Bus grant: indicates that a requesting module has been granted control of the bus.
viii) Interrupt request: indicates that an interrupt is pending.
ix) Interrupt acknowledge: acknowledges that the pending interrupt has been recognized.
x) Clock: used to synchronize operations (placing operations in order so that each is performed after the other).
xi) Reset: initializes all modules.
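The two bus-width relationships stated under the data lines and the address lines can be checked with a little arithmetic. The function names below are illustrative, and the calculation assumes one addressable location per distinct address and whole-bus transfers.

```python
def memory_accesses(instruction_bits, data_bus_bits):
    """Number of bus transfers needed to fetch one instruction (rounded up)."""
    return -(-instruction_bits // data_bus_bits)   # ceiling division


def addressable_locations(address_bus_bits):
    """Each extra address line doubles the number of distinct addresses."""
    return 2 ** address_bus_bits


print(memory_accesses(16, 8))        # 16-bit instruction over an 8-bit data bus
print(addressable_locations(16))     # locations reachable with 16 address lines
```

With an 8-bit data bus a 16-bit instruction needs two memory accesses, matching the example in the text, while widening the data bus to 16 or 32 bits brings this down to one.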
Physically the system bus is a number of parallel electrical conductors. These conductors are metal lines etched in a card or board (printed circuit board). The bus extends across all of the system components, each of which taps into some or all of the bus lines. A small computer system may be acquired and then expanded later; that is to say, more memory or more I/O can be added by adding more boards or cards. If a component on a board fails, the board can easily be removed and replaced.

TYPES OF BUS SYSTEMS
i) SINGLE BUS
ii) MULTIPLE BUS

i) SINGLE BUS
If a great number of devices are connected to a bus, performance will suffer. The two main causes of this are:
a) The more devices attached to the bus, the greater the propagation delay. This delay determines the time it takes for devices to coordinate the use of the bus. When control of the bus passes from one device to another frequently, these propagation delays can noticeably affect performance.
b) The bus may become a bottleneck as the aggregate data transfer demand approaches the capacity of the bus. This problem can be countered to some extent by using wider buses, e.g. increasing the data bus from 32 to 64 bits. However, since the data rates generated by attached devices (graphics, video) are growing rapidly, a single bus cannot keep up with them.

ii) MULTIPLE BUS
As computer systems developed to handle many devices, a single bus could no longer keep pace, hence the introduction of multiple bus systems. Most computers today use multiple buses. Below is a description of the hierarchy of a multiple bus system.
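[Figure: traditional multiple-bus hierarchy. The processor connects over a local bus to the cache and to a local I/O controller. The cache connects to the system bus, to which main memory is attached. An expansion bus interface links the system bus to the expansion bus, which serves the network, SCSI (small computer system interface), modem and serial devices.]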
There is a local bus that connects the processor to the cache memory, and it may support one or more local devices. The use of a cache structure insulates the processor from the need to access main memory frequently. Hence main memory can be moved off the local bus onto a system bus. In this way I/O transfers to and from main memory across the system bus do not interfere with the processor's activities. It is possible to connect I/O controllers directly to the system bus. A more efficient solution is to use one or more expansion buses for this purpose. An expansion bus interface buffers data transfers between the system bus and the I/O controllers on the expansion bus. This arrangement allows the system to support a wide variety of I/O devices and at the same time insulates memory-to-processor traffic from I/O traffic. This traditional bus design is reasonably efficient but begins to break down as higher and higher performance is seen in I/O devices. In response to these growing demands, a common approach taken by industry is to build a high speed bus that is closely integrated with the rest of the system, requiring only a bridge between the processor's bus and the high speed bus.

MEZZANINE ARCHITECTURE (HIGH SPEED BUS DESIGN)
[Figure: mezzanine architecture. The processor connects over a local bus to a cache/bridge, which also connects to main memory and to the high speed bus. SCSI, graphics, video and LAN devices attach to the high speed bus. An expansion bus interface links the high speed bus to the expansion bus, which serves fax, modem and serial devices.]
The high speed bus supports the connection of LANs (Local Area Networks) at 100 megabits per second, as well as video and graphics workstation controllers. The arrangement was designed to handle high capacity I/O devices. Lower speed devices are still supported by the expansion bus, with an interface buffering traffic between the expansion bus and the high speed bus. The advantage of this arrangement is that the high speed bus brings high demand devices into closer integration with the processor and at the same time is independent of the processor.
The high speed bus can also support LANs such as FDDI (Fibre Distributed Data Interface).

ELEMENTS OF BUS DESIGN
Although a variety of different bus implementations exist, there are a few basic parameters or design elements that serve to classify and differentiate buses:
1. Bus arbitration (centralized and distributed)
2. Bus type (dedicated and multiplexed)
3. Bus timing (synchronous and asynchronous)
4. Bus width

1. BUS TYPE
Buses are classified according to the type of bus system used. Under this category there are two types, dedicated and multiplexed.
a) Dedicated bus
Lines are permanently assigned either to one function or to a physical subset of computer components. An example of functional dedication is the use of separate dedicated address and data lines, which is common on many buses. Physical dedication refers to the use of multiple buses, each of which connects only a subset of modules. A typical example is the use of an I/O bus to connect all the I/O modules; this bus is then connected to the main bus through an I/O adapter. The advantage of physical dedication is high throughput and high speed, because fewer devices share each bus, so there is less contention and less propagation delay. The disadvantage is that it requires more space on the board, hence increased size and cost.
b) Multiplexed bus
A multiplexed bus is one where address and data information are transmitted on the same lines, using an address valid control line. At the beginning of a data transfer, the address is placed on the bus and the address valid line is activated (set to 1). Each module then has a specified period of time to copy the address and determine whether it is the addressed module. The address is then removed from the bus, and the same bus connections are used for the subsequent read or write data transfer. The advantage of a multiplexed bus is the use of fewer lines, which saves space and reduces cost. The disadvantage is that more circuitry must be provided to control the synchronization of bus use, and the system of data transfer becomes more complex. Performance may also decline because addresses and data must take turns on the same lines, and every module spends time copying and checking each address even when it is not the module being addressed.

2. METHOD OF ARBITRATION
In all systems, more than one module may need control of the bus. For example, an I/O module may need to read or write directly to memory without sending the data through the CPU. Since only one unit at a time can successfully transmit over the bus, some method of arbitration is needed. There are basically two methods of arbitration, centralized and distributed.
a) CENTRALIZED ARBITRATION
In the centralized scheme, a single hardware device, referred to as a bus controller or arbiter, is responsible for allocating time on the bus. The device may be a separate module or part of the CPU.
[Figure: centralized arbitration. The CPU, main memory (MM) and I/O modules share one bus, with a single arbiter allocating it.]
b) DISTRIBUTED ARBITRATION
In the distributed scheme there is no central controller. Each module contains access control logic, and the modules act together to share the bus.
[Figure: distributed arbitration. The CPU, main memory (MM) and I/O modules each have their own arbiter.]
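The contrast between the two arbitration schemes can be illustrated with a small sketch. It assumes a fixed priority order among the three modules shown in the figures, which is only one of many possible policies; the priority order and function names are hypothetical.

```python
PRIORITY = ["CPU", "MM", "I/O"]   # assumed fixed priority, highest first


def central_arbiter(requests):
    """Centralized: one arbiter sees every request and grants the bus once."""
    for module in PRIORITY:
        if module in requests:
            return module
    return None                    # nobody asked for the bus


def wins_bus(module, requests):
    """Distributed: every module runs this same logic on the shared request lines."""
    if module not in requests:
        return False
    higher_priority = PRIORITY[:PRIORITY.index(module)]
    return not any(other in requests for other in higher_priority)


requests = {"MM", "I/O"}
granted = central_arbiter(requests)
claimed = [m for m in PRIORITY if wins_bus(m, requests)]
print(granted, claimed)
```

Both schemes end with exactly one module driving the bus; the difference is whether the decision is made in one place or reached independently by every module applying the same rule.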
With both methods of arbitration, the purpose is to designate one device, either the CPU or an I/O module, as master. The master may then initiate a data transfer (e.g. a read or write) with some other device, which acts as slave for this particular exchange.

3. BUS WIDTH
Buses can also be classified according to size. The bus size is the number of wires in the bus. The width of the data bus determines the number of bits of data transferred at a time. The width of the address bus has an impact on system capacity: the wider the address bus, the greater the range of locations that can be addressed. The sum of all these widths gives the number of wires in the bus structure, which is the overall bus width.

4. DATA TRANSFER TYPE
A bus supports various data transfer types, i.e. read-modify-write, read-after-write and block transfer operations. All buses support both write (master to slave) and read (slave to master) transfers. In the case of a multiplexed address/data bus, the bus is first used for specifying the address and then for transferring the data. For a read operation there is typically a wait while the data is being fetched from the slave to be put onto the bus. For either a read or a write there may also be a delay if it is necessary to go through arbitration to gain control of the bus for the remainder of the operation. In the case of dedicated address and data buses, the address is put on the address bus and remains there while the data is put on the data bus. For a write operation, the master puts the data on the data bus as soon as the address has stabilized and the slave has recognized its address. For a read operation, the slave puts the data on the data bus as soon as it has recognized its address and fetched the data.

A read-modify-write operation is simply a read followed immediately by a write to the same address. The address is broadcast only once, at the beginning of the operation. The whole operation is indivisible, in order to prevent access to the data element by other potential bus masters. A read-after-write operation consists of a write followed immediately by a read from the same address. This operation may be performed for checking purposes. Some bus systems also support block data transfer. In this case one address cycle is followed by n data cycles. The first data item is transferred to or from the specified address; the remaining data items are transferred to or from subsequent addresses.
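The saving offered by read-modify-write and block transfers comes from sending the address only once. The sketch below counts bus cycles for a toy model; the `Bus` class, its method names and the cycle labels are hypothetical and ignore timing, arbitration and wait states.

```python
class Bus:
    """Toy bus with one slave memory and a log of address and data cycles."""

    def __init__(self, size=16):
        self.memory = [0] * size
        self.cycles = []

    def read_modify_write(self, address, modify):
        """Read then write the same location with a single address cycle."""
        self.cycles.append(("address", address))
        value = self.memory[address]
        self.cycles.append(("data-read", value))
        new_value = modify(value)
        self.memory[address] = new_value
        self.cycles.append(("data-write", new_value))
        return new_value

    def block_write(self, start, values):
        """One address cycle followed by n data cycles to consecutive addresses."""
        self.cycles.append(("address", start))
        for offset, value in enumerate(values):
            self.memory[start + offset] = value
            self.cycles.append(("data-write", value))


bus = Bus()
bus.block_write(4, [1, 2, 3])                          # 1 address + 3 data cycles
result = bus.read_modify_write(4, lambda v: v + 10)    # 1 address + 2 data cycles
address_cycles = sum(1 for kind, _ in bus.cycles if kind == "address")
print(result, address_cycles, len(bus.cycles))
```

Writing the same three words as separate transfers would need three address cycles instead of one, which is the overhead a block transfer avoids.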
LECTURE: THE MOTHERBOARD
The motherboard is the central board in a Personal Computer (PC), to which everything else ultimately connects. It is the physical and logical backbone of the entire system. In some systems it is the whole computer, while others have additional circuit boards. The circuitry located on the motherboard defines the computer, its capabilities, limitations, personality and performance. The motherboard holds the most vital electronic components that define the PC. These electronic components carry out most of the functions of the machine, e.g. running programs and making calculations. A motherboard contains the following:
1. Sockets/slots for the CPU and memory, including sockets for extra Random Access Memory (RAM). A PC has a base (starting) memory of 640 KB, but if that is not enough the motherboard allows you to slot in additional RAM. Memory modules are of two kinds: Single Inline Memory Module (SIMM) and Dual Inline Memory Module (DIMM).
2. Expansion slots to add to the computer's capabilities. Common slots include Peripheral Component Interconnect (PCI), Industry Standard Architecture (ISA), Enhanced ISA (EISA) and Micro Channel Architecture (MCA). MCA was never widely adopted, which is why it is not seen today.
3. Ports to which devices like a mouse or modem connect. Universal Serial Bus (USB) is a faster and more versatile connector.
4. Connectors. Integrated Drive Electronics (IDE) connectors are used to connect peripherals that carry large amounts of data; examples include the hard disk, Compact Disc-Read Only Memory (CD-ROM) and Digital Versatile Disc (DVD) drives. There are primary and secondary connectors. There are also power supply connectors such as AT and XT.
5. Complementary Metal Oxide Semiconductor / Basic Input Output System (CMOS/BIOS), which checks the basic parts of the system, for example: is the memory OK? Are the drives functioning well?
6. Buses, which are electronic pathways that carry information to different components. Interfaces are the glue that sticks everything together.
A PC is a modular device because it consists of a number of standard components like video cards and disk drives (3½ or 5¼ inch). This makes troubleshooting easy.

Note
(i) The Accelerated Graphics Port (AGP) is also an expansion slot, for video and graphics ONLY. The graphics card it hosts has its own processor and memory. The system clock is also found on the motherboard.
(ii) USB and FireWire devices connect in a daisy chain, so that one wire can connect to a hub which has many more connectors or devices. FireWire and USB differ mainly in speed, and which is faster depends on the versions compared (for example, USB 2.0 at 480 Mbit/s against FireWire 400 at 400 Mbit/s). You are advised not to buy a system without USB.

MOTHERBOARD FORM FACTORS
In simplest terms, the form factor is little more than the dimensions of the board and its mounting hole positions, as well as the general layout and placement of key components such as the CPU, memory modules, expansion slots and input/output (I/O) ports. There are three major form factors to consider: AT, ATX and NLX. It is important to understand that form factors do not influence performance directly. For example, a baby AT motherboard and an NLX motherboard can offer exactly the same performance characteristics. Form factor is most important in system assembly and in access for service or upgrading.

AT style motherboards
These are typically available in two variations: the baby AT and the full AT, which is larger. The variations simply affect the overall dimensions of the motherboard. You can identify the AT style by three characteristics. First, look at the power connectors where the power supply attaches: the AT uses two sets of 6-pin in-line connectors, usually designated P8 and P9. Second, the CPU is usually positioned in line with one or more of the ISA bus slots (almost always obstructing full length ISA cards). Third, the I/O ports of an AT motherboard (such as COM ports, LPT ports, PS/2 ports and USB ports) are often spread out along the back panel of the chassis.
ATX style motherboards These are the result of the 1st serious industry push to standardize the dimension, device layout and connection schemes of a PC motherboard such as the Intel D850GB ATX socket 432 (Pentium IV) motherboards. As with an AT layout, an ATX motherboard is distinguished by 3 dimensions. (i) All I/O port connectors such as (USB, LPT, COM ports) are concentrated into a single I/O panel located at the rear of the motherboard. (ii) The ATX uses a 20- pin power connection from the power supply (and perhaps a supplemental 8- pin connection) (iii) The CPU is located clear and away from all expansion bus slots- eliminating any interference with full- slot expansion cards. ATX motherboards can be found supporting all current CPU types (socket / super 7, socket 360, sockets A , socket 432, slots, 1,2,A CPUs) NlX style motherboards While ATX motherboards represented a good effort at standardization, they will retain all the assembly problems of AT style motherboard- namely that the motherboard is cumbersome to install and time consuming to upgrade or replace. The NLX such as the Intel, JN440BX NLX, overcame this disadvantage by making the motherboard a replaceable (also referred to as dockable) device. Prepared by Dr. Mike Conrad 2012 Page 38
All expansion slots and connection headers (such as the speaker connector, power switch connector, etc.) are moved to a riser card. The NLX motherboard itself then plugs into the riser card. Note the long card-edge connector along the right side of the motherboard that interfaces with the riser. In this fashion, the motherboard can quickly and easily be removed from the system to change jumpers, add memory, or install a replacement motherboard.

NOTE: The CPU can be identified on a motherboard as the largest chip of all, with many pins. Be careful when placing it on the motherboard, because any disturbance in the alignment of the pins may damage the processor. Fit it into its socket exactly as shown by the manufacturer.

MEMORY ADDRESSABLE BY THE CPU (MEMORY-RAM)
Memory addressable by the processor is the largest amount of memory a chip can use. All computers must have main memory. Main memory stores the programs and data that the computer is working on at the moment. The CPU needs to be able to access programs and data in nanoseconds rather than milliseconds, as is the case with secondary memory such as hard disk and CD-ROM. Main memory is also known as RAM (Random Access Memory), meaning that data can be written to or read from it as long as the power is on.

Virtual memory is a part of the hard disk that is borrowed and used as main memory. If a program is too large to fit in main memory, the operating system sets aside virtual memory and stores the remaining part of the program there.
TYPES OF RAM
The types of RAM described here refer to how RAM is packaged. There are three packages: DIP (Dual Inline Package), SIMM and DIMM. If a bank has eight chips, parity checking is absent; if there are nine, the ninth holds the parity bits. Parity only tells you that an error has occurred; it does not correct it. Instead, it hangs the PC. Some memory improves on this by including Error Correcting Code (ECC), which both detects and corrects errors.

RAM is commonly found in the form of dynamic RAM (DRAM), with the DRAM chips fixed on a circuit board. The number of DRAM chips mounted on a side of the circuit board varies with the type of RAM. Sometimes there are eight and sometimes nine. With nine DRAM chips, one is a parity chip (used for error testing).

If DRAM chips are mounted on only one side of a memory module, the RAM is typically packaged as a Single Inline Memory Module (SIMM).
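The parity check described under TYPES OF RAM can be illustrated with a minimal sketch. It assumes even parity over one byte (the notes do not say which convention is used), and the function names `parity_bit`, `store` and `check` are hypothetical. It shows why a ninth bit can detect a single flipped bit but cannot say which bit flipped, and why an even number of flipped bits goes unnoticed.

```python
def parity_bit(byte: int) -> int:
    """Return the even-parity bit for an 8-bit value.

    The bit is 1 when the byte contains an odd number of 1 bits,
    so that data plus parity always holds an even number of 1s.
    (Even parity is an assumption made for this sketch.)
    """
    return bin(byte & 0xFF).count("1") % 2


def store(byte: int) -> tuple[int, int]:
    """Model the eight data chips plus the ninth (parity) chip."""
    return (byte & 0xFF, parity_bit(byte))


def check(stored: tuple[int, int]) -> bool:
    """Return True when the stored parity still matches the data."""
    data, parity = stored
    return parity_bit(data) == parity


word = store(0b10110010)

# One flipped data bit is detected, but the check cannot tell which bit.
one_bit_error = (word[0] ^ 0b00000001, word[1])

# Two flipped bits cancel out and pass the check unnoticed.
two_bit_error = (word[0] ^ 0b00000011, word[1])

print(check(word), check(one_bit_error), check(two_bit_error))
```

Correcting the error, as ECC memory does, needs more check bits than the single parity bit modelled here.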
SIMMs come in 30-pin and 72-pin types. A 72-pin SIMM has a wider data path, so it can do more work with fewer modules. If DRAM chips are mounted on both sides of a memory module, the RAM is typically packaged as a Dual Inline Memory Module (DIMM), which is a bit wider than a SIMM.
[Figure: DIMM edge connector, 168 pins]

DIMMs have 168 pins, unlike SIMMs, and most DIMMs are Synchronous DRAMs (SDRAM). SDRAM memory chips deliver data in very high speed bursts using a clocked interface. They support bus speeds of 100 MHz and more.

MEMORY SPEEDS
CPUs need to request data and programs from memory at speeds that memory can respond to. Because memory could not keep up with CPU speed, IBM (International Business Machines) added what is called a wait state. During a wait state the processor suspends whatever it is doing for one or more clock cycles to give the memory circuit a chance to catch up. The number of wait states required in a system depends on the speed of the CPU in relation to the speed of the memory. With one wait state a PC operates at only two thirds (2/3) of its potential speed, because a memory operation that would take two clock cycles takes three. Two wait states cut performance to one half (1/2).

Memory chips do not connect directly to the CPU's address lines. Instead, special circuits making up a memory controller translate the binary address sent to the memory address register into the form needed to identify the required memory location and retrieve the data there.

To read memory, the microprocessor activates the address lines corresponding to the address of the wanted memory during one clock cycle. This action acts as a request to the memory controller to find the needed data. During the next clock cycle the memory controller puts the bits contained in the desired storage unit on the CPU data bus. This operation takes two cycles because the memory controller cannot be sure the data is valid until the end of the second clock cycle. Consequently, all memory operations take at least two clock cycles.

Writing to memory works in a similar way. The CPU first sends out the address to write to, and the memory controller finds the proper location. Then the CPU sends out the data to be written. The minimum time required is again two cycles of the CPU clock.
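The wait-state arithmetic above can be sketched in a few lines. This is a simplified model, not a description of any particular chipset: it assumes every memory operation needs the two-cycle minimum stated in the text, that one clock period is 1000 / MHz nanoseconds, and that the memory rating is simply rounded up to whole clock cycles. The names `relative_speed` and `wait_states_needed` are hypothetical.

```python
from fractions import Fraction

BASE_CYCLES = 2  # minimum clock cycles per memory operation, as stated in the text


def relative_speed(wait_states: int) -> Fraction:
    """Fraction of potential speed left once wait states are added.

    A memory operation takes BASE_CYCLES + wait_states cycles instead of
    BASE_CYCLES, so speed scales by the ratio of the two.
    """
    return Fraction(BASE_CYCLES, BASE_CYCLES + wait_states)


def wait_states_needed(cpu_mhz: int, memory_ns: int) -> int:
    """Wait states required so the memory has time to respond.

    One clock period is 1000 / cpu_mhz nanoseconds, so the memory needs
    ceil(memory_ns * cpu_mhz / 1000) cycles (integer ceiling division).
    Anything beyond the two-cycle minimum must be covered by wait states.
    """
    cycles = -(-memory_ns * cpu_mhz // 1000)
    return max(0, cycles - BASE_CYCLES)


print(relative_speed(1))            # one wait state
print(relative_speed(2))            # two wait states
print(wait_states_needed(25, 80))   # 80 ns memory on a 25 MHz CPU
print(wait_states_needed(25, 100))  # slower 100 ns memory on the same CPU
```

Under these assumptions an 80 ns chip fits the two-cycle budget of a 25 MHz CPU exactly, while a 100 ns chip forces one wait state.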
Memory chip speeds are rated by time in nanoseconds (ns), whereas CPU speeds are rated in megahertz (MHz). These two measurements are reciprocal: one clock cycle lasts 1000 / f ns, where f is the clock speed in MHz. At a speed of 1 MHz one clock cycle is 1000 ns. At 8 MHz one clock cycle is 125 ns; at 16 MHz it is 1/16 x 1000 = 62.5 ns; at 25 MHz it is 40 ns. The CPU requires at least 2 clock cycles between memory operations. At 25 MHz, therefore, this gives a total
of 40 x 2 = 80 ns. It is possible to install a combination of quicker and slower memory chips in a computer system. In general, there is no harm in installing quicker memory chips than the computer calls for, for example a 70 ns chip when the computer needs an 80 ns chip. If you install slower chips, they may not work at all or, more likely, they will work sporadically, leaving the PC vulnerable to parity check errors at unexpected times.

The cycle time of a memory chip measures how quickly back-to-back accesses can be made to the chip. It is generally about 2-3 times the access time of the chip. SRAM chips have a cycle time equal to their access time and can therefore operate faster. They have speed ratings of 25 or 35 ns, while the fastest DRAM chips are rated at 60 or 70 ns.

RAM TECHNOLOGIES
To make memory faster, the memory chip itself must be made faster. By reducing internal delays and taking advantage of the latest technology, memory speeds can be improved significantly.

To keep the number of connections low, the address lines of most memory chips are multiplexed (i.e. the same set of lines serves for sending both the row and the column address to the chip). To distinguish whether the signal on the address lines means a row or a column, chips use two signals: the Row Address Strobe (RAS), which indicates that the address is a row, and the Column Address Strobe (CAS), which indicates that it is a column. Note that the names RAS bar and CAS bar indicate that the signals are inverted: they indicate the on state when the line is off (low).

The memory controller first tells the memory chip the row in which to look for the memory cell, then the column. In other words, the address lines accompanied by the RAS bar signal select a row (memory bank). A new set of signals on the address lines accompanied by the CAS bar signal then selects the desired storage cell.
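The multiplexed addressing described above can be sketched as a small model. It is only an illustration: the chip geometry (10 row bits and 10 column bits), the class `DramChip` and the helper names are hypothetical, and timing, refresh and the electrical sense of the active-low strobes are left out.

```python
ROW_BITS = 10  # hypothetical chip geometry: 1024 rows
COL_BITS = 10  # by 1024 columns, sharing 10 address lines


def split_address(addr: int) -> tuple[int, int]:
    """Split a cell address into the row and column sent over the same lines."""
    row = addr >> COL_BITS
    col = addr & ((1 << COL_BITS) - 1)
    return row, col


class DramChip:
    """Toy DRAM chip with multiplexed address lines."""

    def __init__(self) -> None:
        self.cells: dict[tuple[int, int], int] = {}
        self.latched_row: int | None = None

    def ras(self, address_lines: int) -> None:
        """RAS bar asserted: the value on the address lines is a row."""
        self.latched_row = address_lines

    def cas_read(self, address_lines: int) -> int:
        """CAS bar asserted: the value on the address lines is a column."""
        return self.cells.get((self.latched_row, address_lines), 0)

    def cas_write(self, address_lines: int, bit: int) -> None:
        """CAS bar asserted during a write cycle."""
        self.cells[(self.latched_row, address_lines)] = bit


def controller_write(chip: DramChip, addr: int, bit: int) -> None:
    """Memory controller: send the row first, then the column."""
    row, col = split_address(addr)
    chip.ras(row)
    chip.cas_write(col, bit)


def controller_read(chip: DramChip, addr: int) -> int:
    row, col = split_address(addr)
    chip.ras(row)
    return chip.cas_read(col)


chip = DramChip()
controller_write(chip, 0x12345, 1)
print(split_address(0x12345), controller_read(chip, 0x12345))
```

With this geometry, twenty address bits travel over ten physical lines in two steps, which is the saving in connections that multiplexing buys.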
Changing all the circuits in the chip from row to column addressing therefore takes a substantial number of nanoseconds. Coupled with the need for refreshing, this limits the performance of the memory chip.

STATIC COLUMN RAM
By redesigning the circuits in the chip, these chips allow reading from within a single memory column without wait states. They operate by putting an address on the chip's address lines and then sending the CAS bar signal. Once the column is registered (found), a new address can be sent to indicate a row; it is validated by activating RAS bar while CAS bar is held active to indicate that the column remains constant.

Note: while a strobe line is in its on state, the chip may be doing something else, say carrying data, but it is not being addressed. Addressing takes place when the line is off; hence the names RAS bar and CAS bar.

PAGE MODE RAM
This allows repeated access without wait states to bits within a single page (row) of memory. In page mode RAM the memory controller first sends out a row address and then activates the RAS bar signal. While holding RAS bar active, it sends out a new address and the CAS bar signal to indicate a specific cell. If the RAS bar signal is kept active, the
controller can then send out one or more additional addresses, each followed by a pulse of CAS bar, to indicate additional cells within the same row. A memory chip that permits this sort of operation is called page mode RAM. The advantage of this design is that a computer can rapidly access multiple cells in a single memory page. The access time within a page can be trimmed to 25-30 ns, fast enough to eliminate wait states. However, if the computer has to change pages (rows), both the row and the column address must be changed, which slows access to about 70-80 ns.

EXTENDED DATA OUT MEMORY (EDO RAM)
EDO RAM is a variation of fast page mode RAM. A conventional memory chip discharges after each read operation and requires a recharge time before it can be read again. EDO RAM, by contrast, keeps its data valid until it receives an additional signal. It modifies the allowed timing of the CAS bar signal: the data line remains valid for a short period after the CAS line switches off. As a result, the system need not wait for a separate read cycle but can read or write data as fast as the chip will allow address access. It does not have to wait for the data to appear before starting the next access but can read it immediately (normally most chips require a wait period of ten ns between issuing column addresses). The EDO RAM design eliminates this wait, allowing the memory to deliver data to the system faster.

Standard page mode chips turn off the data when the CAS line switches off. For EDO to work, however, the PC has to indicate when it has finished reading the data. In the EDO design the memory controller signals this with the Output Enable signal.

By removing wait states, EDO RAM boosts performance. In theory the boost could be as high as 50-60%; in reality the best EDO implementations boost performance by 10-20%.
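The benefit of page mode access can be estimated with a short sketch. It uses the figures quoted under PAGE MODE RAM (about 30 ns for an access within the open page and about 80 ns when the row must change); the page size of 1024 cells and the function name `access_time_ns` are assumptions made only for illustration.

```python
PAGE_HIT_NS = 30     # access within the currently open row (text: 25-30 ns)
PAGE_MISS_NS = 80    # access that must change the row (text: 70-80 ns)
CELLS_PER_ROW = 1024  # hypothetical page size


def access_time_ns(addresses: list[int]) -> int:
    """Total time for a sequence of accesses on a page mode chip.

    The controller keeps the last row open. An access to the same row
    only needs a new column address and CAS pulse; any other access
    pays for a full row and column cycle.
    """
    open_row = None
    total = 0
    for addr in addresses:
        row = addr // CELLS_PER_ROW
        if row == open_row:
            total += PAGE_HIT_NS
        else:
            total += PAGE_MISS_NS
            open_row = row
    return total


sequential = [0, 1, 2, 3]        # all in one page
scattered = [0, 1024, 0, 1024]   # alternates between two pages

print(access_time_ns(sequential), access_time_ns(scattered))
```

Four accesses within one page cost 80 + 3 x 30 = 170 ns in this model, while four accesses that alternate between pages cost 4 x 80 = 320 ns, which is why page mode helps most with sequential access.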
BURST EDO RAM
Burst EDO RAM (BEDO) performs read/write operations in bursts of four cycles. This technology is also known as pipeline nibble mode DRAM because it uses a data pipeline to retrieve and send out the data in a burst. It works like ordinary EDO or page mode DRAM in that it sends out data when the CAS line goes active. However, instead of sending a single byte of data (depending on the width of the chip), a 2-bit counter pulses the chip internally four times, each pulse dealing out one byte of data.

The BEDO chip differs from a page mode or EDO chip in its internal design: the silicon of these chips holds a fuse. It is the fuse that determines whether the chip is a BEDO chip; if the fuse is blown after manufacturing, the chip becomes an EDO chip irreversibly.

BEDO operates with zero wait states on buses at about twice the speed of similarly rated page mode chips. Currently BEDO technology operates at a 66 MHz bus speed with zero wait states using chips rated at 52 ns, compared to page mode chips, which operate at zero wait states only as fast as 33 MHz.

SYNCHRONOUS DRAM (SDRAM)
Normal memory chip addressing requires alternating cycles. Because of this multiplexed operation, ordinary memory chips cannot deliver data on every clock cycle of their host microprocessor.
By redesigning the basic chip interface, however, memory chips can make data available every clock cycle. Because the resulting chips can and should operate in step with their computer host, they are termed synchronous DRAM. Note that this alteration of the interface removes system bottlenecks but does not by itself make the chip perform better. In order to keep up with their quicker interface, SDRAM chips therefore also use a pipelined design: they are built with multiple independently operating stages, so that a chip can start to access a second address before it finishes processing the first. This pipelining extends only across column addresses within a given page.

SDRAM chips are rated at high speeds (that is, a low number of nanoseconds). The interface together with pipelining helps synchronous DRAM chips achieve ratings as fast as 10 ns, fast enough to serve 100 MHz memory buses. However, due to other problems, the speed of typical chips is limited to 66 MHz.

ENHANCED DRAM (EDRAM)
EDRAM makes ordinary dynamic random access memory perform much faster by adding a small block of static memory to the chip. This cache operates at high speed (about 15 ns), so it can supply the data requested by the processor without adding wait states while the rest of the RAM in the chip is refreshed.
Advantages of EDRAM
Linking the SRAM cache with the DRAM on the same small slab of silicon allows a wide bus to be used to connect the two. The chip design results in a potential cache fill rate of about 60 GB per second, compared to the 110 MB per second achieved by standard cached page mode DRAM. As a result, filling the on-chip cache requires only 35 ns, compared to 250 ns for cached, non-interleaved page mode memory.

EDRAM uses different control structures from conventional DRAM chips. These control structures allow the DRAM to be pre-charged at the same time as the system makes burst mode reads from the on-chip cache. This pre-charging helps prepare the DRAM for cache misses, minimizing memory access time compared to conventional DRAM, which must perform both the pre-charge and the ordinary access when a cache miss occurs.

Note: if the data is in the EDRAM cache it is a hit; if it is outside the cache it is a miss.
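The hit and miss behaviour of the EDRAM cache can be sketched as follows. This is a toy model under stated assumptions, not the actual chip design: the cache is assumed to hold exactly one DRAM row, a miss is assumed to cost the 35 ns cache fill plus the 15 ns cache read quoted in the text, the row size of 256 cells is invented, and the class name `Edram` is hypothetical. Writes and refresh are not modelled.

```python
SRAM_HIT_NS = 15    # on-chip static cache access (text: about 15 ns)
CACHE_FILL_NS = 35  # time to fill the on-chip cache (text: 35 ns)
ROW_SIZE = 256      # hypothetical number of cells per DRAM row


class Edram:
    """Toy EDRAM: a DRAM array fronted by a one-row SRAM cache."""

    def __init__(self, size: int) -> None:
        self.dram = [0] * size          # size should be a multiple of ROW_SIZE
        self.cached_row: int | None = None
        self.cache: list[int] = []

    def read(self, addr: int) -> tuple[int, int, bool]:
        """Return (value, cost in ns, hit flag) for one read."""
        row, offset = divmod(addr, ROW_SIZE)
        hit = row == self.cached_row
        cost = SRAM_HIT_NS
        if not hit:
            # Miss: copy the whole row into the cache over the wide internal bus.
            start = row * ROW_SIZE
            self.cache = self.dram[start:start + ROW_SIZE]
            self.cached_row = row
            cost += CACHE_FILL_NS
        return self.cache[offset], cost, hit


memory = Edram(1024)
memory.dram[300] = 7

first = memory.read(300)   # row not cached yet: miss
second = memory.read(301)  # same row: hit
third = memory.read(5)     # different row: miss again

print(first, second, third)
```

In this model only the first access to a row pays the fill cost; later reads from the same row are served at cache speed.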
