About the book Computer Organization & Architecture: Themes and Variations
Title : Computer Organization & Architecture: Themes and Variations
Edition : 4
Title translated into Persian : سازمان و معماری کامپیوتر: مضامین و تغییرات
Series :
Authors : Alan Clements
Publisher : Nelson Education
Year of publication : 2013
Number of pages : 938
ISBN : 9781285415420
Language : English
Format : pdf
File size : 831 MB
Table of contents :
Front Cover
Table of Contents
Preface
Paths through Computer Architecture
About the Author
Part I - The Beginning
1 - Computer Systems Architecture
Where We're At
1.1 - What is Computer Systems Architecture?
What is a Computer?
1.2 - Architecture and Organization
1.2.1 - Computer Systems and Technology
1.2.2 - The Role of Computer Architecture in Computer Science
1.3 - Development of Computers
1.3.1 - Mechanical Computers
1.3.2 - Electromechanical Computers
1.3.3 - Early Electronic Computers
1.3.4 - Minicomputers and the PC Revolution
1.3.5 - Moore's Law and the March of Progress
1.3.6 - The March of Memory Technology
1.3.7 - Ubiquitous Computing
1.3.8 - Multimedia Computers
1.4 - The Stored Program Computer
1.4.1 - The Problem
1.4.2 - The Solution
1.4.3 - Constructing an Algorithm
1.4.4 - What Does a Computer Need to Solve a Problem?
1.4.5 - The Memory
1.5 - The Stored Program Concept
Two Address Instructions
One Address Instructions
Computer Categories
1.6 - Overview of the Computer System
1.6.1 - The Memory Hierarchy
1.6.2 - The Bus
1.7 - Modern Computing
Summary
Problems
2 - Computer Arithmetic and Digital Logic
2.1 - What is Data?
2.1.1 - The Bit and Byte
2.1.2 - Bit Patterns
Representing Information
2.2 - Numbers
2.2.1 - Positional Notation
2.3 - Binary Arithmetic
2.4 - Signed Integers
2.4.1 - Sign and Magnitude Representation
2.4.2 - Two's Complement Arithmetic
Calculating Two's Complement Values
Properties of Two's Complement Numbers
Arithmetic Overflow
2.5 - Introduction to Multiplication and Division
2.5.1 - Shifting Operations
2.5.2 - Unsigned Binary Multiplication
2.5.3 - High-speed Multiplication
Booth's Algorithm
2.5.4 - Division
Restoring Division
Non-Restoring Division
2.6 - Floating-Point Numbers
Normalization of Floating-Point Numbers
Biased Exponents
2.6.1 - IEEE Floating-Point Numbers
IEEE Floating-Point Format
Characteristics of IEEE Floating-Point Numbers
2.7 - Floating-Point Arithmetic
Rounding and Truncation Errors
2.8 - Floating-Point Arithmetic and the Programmer
2.8.1 - Error Propagation in Floating-Point Arithmetic
2.8.2 - Generating Mathematical Functions
Using Functions to Generate New Functions
2.9 - Computer Logic
2.9.1 - Digital Systems and Gates
2.9.2 - Gates
Fundamental Gates
The AND Gate
The OR Gate
The Inverter
Derived Gates-the NOR (Not OR), NAND (Not AND), and Exclusive OR
2.9.3 - Basic Circuits
The Half Adder and Full Adder
The Decoder
The Multiplexer
The Voting Circuit
The Prioritizer
2.10 - Sequential Circuits
2.10.1 - Latches
Clocked RS Flip-flops
D Flip-flop
The JK Flip-Flop
2.10.2 - Registers
Shift Register
Left-Shift Register
2.10.3 - Asynchronous Counters
Using a Counter to Create a Sequencer
2.10.4 - Sequential Circuits
2.11 - Buses and Tristate Gates
Registers, Buses, and Functional Units
Summary
Problems
Part II - Instruction Set Architectures
3 - Architecture and Organization
3.1 - Introduction to the Stored Program Machine
3.1.1 - Extending the Processor: Dealing with Constants
3.1.2 - Extending the Processor: Flow Control
Status Information
Example of a Branch Instruction
3.2 - The Components of an ISA
3.2.1 - Registers
General-Purpose Versus Special-Purpose Registers
3.2.2 - Addressing Modes-an Overview
Memory and Register Addressing
3.2.3 - Instruction Formats
3.2.4 - Op-codes and Instructions
Two Address Machines
One Address Machines
Zero Address Machines
One-and-a-Half Address Machines
3.3 - ARM Instruction Set Architecture
3.3.1 - ARM's Register Set
3.3.2 - ARM's Instruction Set
3.4 - ARM Assembly Language
3.4.1 - Structure of an ARM Program
3.4.2 - The Assembler - Practical Considerations
3.4.3 - Pseudoinstructions
3.5 - ARM Data-processing Instructions
3.5.1 - Arithmetic Instructions
Addition and Subtraction
Negation
Comparison
Multiplication
Division
3.5.2 - Bitwise Logical Operations
3.5.3 - Shift Operations
Arithmetic Shift
Rotate
Implementing a Shift Operation on the ARM
3.5.4 - Instruction Encoding-An Insight Into the ARM's Architecture
3.6 - ARM's Flow Control Instructions
3.6.1 - Unconditional Branch
3.6.2 - Conditional Branch
3.6.3 - Compare and Test Instructions
3.6.4 - Branching and Loop Constructs
The FOR Loop
The WHILE Loop
The UNTIL loop
Combination Loop
3.6.5 - Conditional Execution
3.7 - ARM Addressing Modes
3.7.1 - Literal Addressing
ARM's Way
3.7.2 - Register Indirect Addressing
3.7.3 - Register Indirect Addressing with an Offset
3.7.4 - ARM's Autoindexing Pre-indexed Addressing Mode
3.7.5 - ARM's Autoindexing Post-Indexing Mode
3.7.6 - Program Counter Relative (PC-Relative) Addressing
3.7.7 - ARM's Load and Store Encoding
3.8 - Subroutine Call and Return
3.8.1 - ARM Support for Subroutines
3.8.2 - Conditional Subroutine Calls
3.9 - Intermission: Examples of ARM Code
3.9.1 - Extracting the Absolute Value
3.9.2 - Byte Manipulation and Concatenation
3.9.3 - Byte Reversal
3.9.4 - Multiplication by 2^n - 1 or 2^n + 1
3.9.5 - The Use of Multiple Conditions
3.9.6 - With Just One Instruction ...
3.9.7 - Implementing Multiple Selection
3.9.8 - Simple Bit-Level Logical Operations
3.9.9 - Hexadecimal Character Conversion
3.9.10 - Character Output in Hexadecimal
3.9.11 - To Print a Banner
3.10 - Subroutines and the Stack
3.10.1 - Subroutine Call and Return
3.10.2 - Nested Subroutines
3.10.3 - Leaf Routines
3.11 - Data Size and Arrangement
3.11.1 - Data Organization and Endianism
3.11.2 - Data Organization and the ARM
3.11.3 - Block Move Instructions
Block Moves and Stack Operations
Applications of Block Move Instructions
3.12 - Consolidation-Putting Things Together
Four-Function Calculator Program
Summary
Problems
4 - Instruction Set Architectures-Breadth and Depth
Historical Background
4.1 - The Stack and Data Storage
4.1.1 - Storage and the Stack
The Stack Frame and Local Variables
Example of an ARM Processor Stack Frame
4.1.2 - Passing Parameters via the Stack
Pointers and C
Functions and Parameters
Pass-by-Reference
Using Recursion
4.2 - Privileged Modes and Exceptions
4.3 - MIPS: Another RISC
MIPS Instruction Format
Conditional Branches
4.3.1 - MIPS Data Processing Instructions
Flow Control
MIPS Example
Other Loads and Stores
MIPS and the ARM Processor
4.4 - Data Processing and Data Movement
4.4.1 - Indivisible Exchange Instructions
4.4.2 - Double-Precision Shifting
4.4.3 - Pack and Unpack Instructions
4.4.4 - Bounds Testing
4.4.5 - Bit Field Data
4.4.6 - Mechanizing the Loop
4.5 - Memory Indirect Addressing
Using Memory Indirect Addressing to Implement a switch Construct
Using Memory Indirect Addressing to Access Records
4.6 - Compressed Code, RISC, Thumb, and MIPS16
4.6.1 - Thumb ISA
Design Decisions
4.6.2 - MIPS16
4.7 - Variable-Length Instructions
Decoding Variable-Length Instructions
Summary
Problems
5 - Computer Architecture and Multimedia
5.1 - Applications of High-Performance Computing
Computer Graphics
5.1.1 - Operations On Images
Noise Filtering
Contrast Enhancement
Edge Enhancement
Lossy Compression
JPEG
MPEG
MP3
Digital Signal Processing
DSP Architectures
The SHARC Family of Digital Signal Processors
5.2 - Multimedia Influences-Reinventing the CISC
Architectural Progress
5.3 - Introduction to SIMD Processing
Packed Operations
Saturating Arithmetic
Packed Shifting
Packed Multiplication
Parallel Comparison
Packing and Unpacking
Coexisting with Floating-Point
5.3.1 - Applications of SIMD Technology
Chroma Keying
Fade In and Out
Clipping
5.4 - Streaming Extensions and the Development of SIMD Technology
5.4.1 - Floating-point Software Extensions
5.4.2 - Intel's Third Layer of Multimedia Extensions
5.4.3 - Intel's SSE3 and SSE4 Instructions
5.4.4 - ARM Family Multimedia Instructions
Summary
Problems
Part III - Organization and Efficiency
6 - Performance-Meaning and Metrics
6.1 - Progress and Computer Technology
Moore's Law
Semiconductor Progress
Memory Progress
6.2 - The Performance of a Computer
6.3 - Computer Metrics
6.3.1 - Terminology
Efficiency
Throughput
Latency
Relative Performance
Time and Rate
6.3.2 - Clock Rate
The Clock and the Consumer
6.3.3 - MIPS
Instruction Cycles and MIPS
6.3.4 - MFLOPS
6.4 - Amdahl's Law
Examples of the Use of Amdahl's Law
6.5 - Benchmarks
LINPACK and LAPACK
Oracle Applications Standard Benchmark
PC Benchmarks
Comparison of High-Performance Processors
PCMARK7 - A Commercial Benchmark for PCs
6.6 - SPEC
SPEC Methodology
The SPEC CPU2006 Benchmarks
SPEC and Power
6.7 - Averaging Metrics
Geometric Mean
Harmonic Mean
Weighted Means
Summary
Problems
7 - Processor Control
7.1 - The Generic Digital Processor
7.1.1 - The Microprogram
Modifying the Processor Organization
7.1.2 - Generating the Microoperations
7.2 - RISC Organization
7.2.1 - The Register-to-register Data Path
Load and Store operations
Jump and Branch Operations
7.2.2 - Controlling the Single-cycle Flow-through Computer
Execution Time
7.3 - Introduction to Pipelining
7.3.1 - Speedup Ratio
7.3.2 - Implementing Pipelining
From PC to Operands
Implementing Branch and Literal Operations
7.3.3 - Hazards
Delayed Branch
Data Hazards
7.4 - Branches and the Branch Penalty
7.4.1 - Branch Direction
7.4.2 - The Effect of a Branch on the Pipeline
7.4.3 - The Cost of Branches
7.4.4 - The Delayed Branch
7.5 - Branch Prediction
Static and Dynamic Branch Prediction
7.6 - Dynamic Branch Prediction
7.6.1 - Branch Target Buffer
7.6.2 - Two-Level Branch Prediction
Combining Instruction Addresses and Branch History
Summary
Problems
8 - Beyond RISC: Superscalar, VLIW, and Itanium
Overview of Chapter 8
8.1 - Superscalar Architecture
In-Order and Out-of-Order Execution
8.1.1 - Instruction Level Parallelism (ILP)
Data Dependencies and Register Renaming
8.1.2 - Superscalar Instruction Issue
Control Dependencies
Examples of Superscalar Processors
The Alpha
The Pentium
8.1.3 - VLIW Processors
Interrupts and Superscalar Processing
8.2 - Binary Translation
The IA-32 code
8.2.1 - The Transmeta Crusoe
8.3 - EPIC Architecture
8.3.1 - Itanium Overview
IA64 Assembler Conventions
8.3.2 - The Itanium Register Set
The Not a Thing Bit
Predicate and Branch Registers
Other Itanium Registers
8.3.3 - IA64 Instruction Format
8.3.4 - IA64 Instructions and Addressing Modes
Addressing Modes
8.3.5 - Instructions, Bundles, and Breaks
IA64 Bundles, STOPs, and Assembly Language Notation
8.3.6 - Itanium Organization
The McKinley-The Itanium 2
The Itanium 9300 Tukwila Processor
The Itanium Poulson Processor
Is the IA64 a VLIW Processor?
8.3.7 - Predication
Compare Instructions in Detail
Preventing False Data Dependency in Predicated Computing
Branch Syntax
8.3.8 - Memory Access and Speculation
Control Speculation
The Advanced Load
8.3.9 - The IA64 and Software Pipelining
Registers and Function Calls
Summary
Problems
Part IV - The System
9 - Cache Memory and Virtual Memory
Memory Hierarchy
9.1 - Introduction to Cache Memory
9.1.1 - Structure of Cache Memory
Principle of Locality of Reference
9.2 - Performance of Cache Memory
9.3 - Cache Organization
9.3.1 - Fully Associative Mapped Cache
Associative Memory
9.3.2 - Direct-Mapped Cache
9.3.3 - Set-Associative Cache
9.3.4 - Pseudo-Associative, Victim, Annex, and Trace Caches
9.4 - Considerations in Cache Design
9.4.1 - Physical versus Logical Cache
9.4.2 - Cache Electronics
9.4.3 - Cache Coherency
9.4.4 - Line Size
9.4.5 - Fetch Policy
9.4.6 - Multi-Level Cache Memory
9.4.7 - Instruction and Data Caches
9.4.8 - Writing to Cache
9.5 - Virtual Memory and Memory Management
9.5.1 - Memory Management
9.5.2 - Virtual Memory
Memory Management and Multitasking
Address Translation
Two-Level Tables
Summary
Problems
10 - Main Memory
10.1 - Introduction
10.1.1 - Principles and Parameters of Memory Systems
Random Access and Sequential Access Memory
Volatile and Nonvolatile Memory
Read/Write and Read-Only Memory
Static and Dynamic Memory
Memory Parameters
10.1.2 - Memory Hierarchy
10.2 - Primary Memory
10.2.1 - Static RAM
The Static RAM Memory System
The Write Cycle
Byte/Word Control
Address Decoding
10.2.2 - Interleaved Memory
10.3 - DRAM
10.3.1 - DRAM Timing
Write-Cycle Timing
10.3.2 - Developments in DRAM Technology
SDRAM
DDR DRAM
DDR2 and DDR3 DRAM
DDR4
10.4 - The Read-Only Memory Family
10.4.1 - The EPROM Family
The EEPROM
Flash Memory
Multi-Level Flash Technology
NAND and NOR Flash
Wear Leveling in Flash Memories
10.5 - New and Emerging Nonvolatile Technologies
10.5.1 - Ferroelectric Hysteresis
10.5.2 - MRAM-Magnetoresistive Random Access Memory
10.5.3 - Ovonic Memory
Summary
Problems
11 - Secondary Storage
11.1 - Magnetic Disk Drives
11.2 - Magnetism and Data Storage
11.2.1 - The Read/Write Head
The Recording Process
11.2.2 - Limits to Magnetic Recording Density
11.2.3 - Principles of Data Recording on Disk
Platter Technology
The GMR Head-A Giant Step in Read-Head Technology
Pixie Dust
The Optically Assisted Head
11.3 - Data Organization on Disk
11.3.1 - Tracks and Sectors
Formatting a Disk
Interleaving
11.3.2 - Disk Parameters and Performance
Accessing Sectors
The Internal Disk Cache
Transfer Rate
11.3.3 - SMART Technology
Effect of Temperature on Disk Reliability
11.4 - Secure Memory and RAID Systems
RAID Level 1
RAID Level 2 and Level 3
RAID Level 4 and Level 5
Failure of RAID 5-An Example
RAID Level 6
11.5 - Solid-State Disk Drives
Special Features of SSDs
11.6 - Magnetic Tape
11.7 - Optical Storage Technology
11.7.1 - Digital Audio
11.7.2 - Reading Data from a CD
Disk Speed
The Optical Read-Head
Focusing and Tracking
Buffer Underrun
11.7.3 - Low-Level Data Encoding
11.7.4 - Recordable Disks
Re-Writable CDs
Magneto-Optical Storage
11.7.5 - The DVD
Recordable DVDs
11.7.6 - Blu-ray
Summary
Problems
12 - Input/Output
12.1 - Fundamental Principles of I/O
Memory-Mapped Peripherals
12.1.1 - Peripheral Register Addressing Mechanisms
12.1.2 - Peripheral Access and Bus Width
Preserving Order in I/O Operations
Side Effects
12.2 - Data Transfer
12.2.1 - Open-Loop Data Transfers
12.2.2 - Closed-Loop Data Transfers
12.2.3 - Buffering Data
The FIFO
12.3 - I/O Strategy
12.3.1 - Programmed I/O
12.3.2 - Interrupt-driven I/O
Interrupt Processing
Nonmaskable Interrupts
Prioritized Interrupts
Nested Interrupts
Vectored Interrupts
Interrupt Timing
12.3.3 - Direct Memory Access
12.4 - Performance of I/O Systems
12.5 - The Bus
12.5.1 - Bus Structures and Topologies
12.5.2 - The Structure of a Bus
The Data Bus
Bus Speed
The Address Bus
The Control Bus
12.6 - Arbitrating for the Bus
12.6.1 - Localized Arbitration and the VMEbus
Releasing the Bus
The Arbitration Process
VMEbus Arbitration Algorithms
12.6.2 - Distributed Arbitration
NuBus Arbitration
12.7 - The PCI and PCIe Buses
12.7.1 - The PCI Bus
Data Transactions on the PCI Bus
12.7.2 - The PCI Express Bus
PCIe Data Link Layer
12.7.3 - CardBus, the PC Card, and ExpressCard
CardBus Cards
ExpressCard Cards
12.8 - The SCSI and SAS Interfaces
SCSI Signals
SCSI Bus Transactions
SCSI Messages and Commands
12.9 - Serial Interface Buses
12.9.1 - The Ethernet
12.9.2 - FireWire 1394 Serial Bus
Serial Bus Addressing
The Physical Layer
Arbitration
Initialization
The Link Layer
12.9.3 - USB
USB - The First Two Generations
Electrical Characteristics
Physical Layer Data Transmission
Logical Layer
USB 3.0
Summary
Problems
Part V - Processor-Level Parallelism
13 - Processor-Level Parallelism
Dimensions of Parallel Processing
A Brief History of Parallel Computing
13.1 - Why Parallel Processing?
13.1.1 - Power-The Final Frontier
13.2 - Performance Revisited
Performance Measurement
13.3 - Flynn's Taxonomy and Multiprocessor Topologies
13.4 - Multiprocessor Topologies
13.5 - Memory in Multiprocessor Systems
13.5.1 - NUMA Architectures
13.5.2 - Cache Coherency in Multiprocessor Systems
The MESI Protocol
False Sharing
13.6 - Multithreading
13.7 - Multi-core Processors
Homogeneous and Heterogeneous Processors
13.7.1 - Homogeneous Multiprocessors
Intel Nehalem Multi-Core Processor
AMD Multi-Core Processors
ARM Cortex A9 Multi Core
IBM Power7
The GPU
13.7.2 - Heterogeneous Multiprocessors
The Cell Architecture
13.7.3 - Networks on a Chip
13.8 - Parallel Programming
13.8.1 - Parallel Processing and Programming
OpenMP
13.8.2 - Message Passing Interface
13.8.3 - Partitioned Global Address Space
13.8.4 - Synchronization
The Spinlock
Bibliography
Index
Summary
Problems