An instruction
set, or instruction set architecture (ISA), is the part of the computer architecture related to programming,
including the native data types, instructions, registers, addressing
modes, memory architecture, interrupt and
exception handling, and external I/O.
An ISA includes a specification of the set of opcodes (machine
language), and the native commands implemented by a particular processor.
Instruction set
architecture is distinguished from the microarchitecture,
which is the set of processor design techniques used to implement the
instruction set. Computers with different microarchitectures can share a common
instruction set. For example, the Intel Pentium and the AMD Athlon implement
nearly identical versions of the x86 instruction set, but have radically different internal
designs.
This concept
can be extended to unique ISAs like TIMI (Technology-Independent Machine
Interface) present in the IBM System/38 and IBM AS/400. TIMI
is an ISA that is implemented by low-level software translating TIMI code into
"native" machine code, and functionally resembles what is now
referred to as a virtual machine. It was designed to increase the
longevity of the platform and applications written for it, allowing the entire
platform to be moved to very different hardware without having to modify any
software except that which translates TIMI into native machine code, and the
code that implements services used by the resulting native code. This allowed
IBM to move the AS/400
platform from an older CISC architecture to the newer POWER
architecture without having to rewrite or recompile any parts of the OS or
software associated with it other than the aforementioned low-level code. Some
virtual machines that support bytecode as their ISA, such as those for Smalltalk,
the Java virtual machine, and Microsoft's Common Language Runtime, implement it
by translating the bytecode for commonly-used code paths
into native machine code, and executing less-frequently-used code paths by
interpretation; Transmeta implemented the x86 instruction set atop VLIW processors in the
same fashion.
Machine language
Machine
language is built up from discrete statements or instructions. Depending on
the processing architecture, a given instruction may specify:
- Particular
registers for arithmetic, addressing, or
control functions
- Particular
memory locations or offsets
- Particular
addressing modes used to interpret the
operands
More complex
operations are built up by combining these simple instructions, which (in a von Neumann architecture) are executed
sequentially, or as otherwise directed by control
flow instructions.
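To make this execution model concrete, here is a minimal sketch of an interpreter for a made-up three-register machine; the instruction names (li, add, addi, blt) are illustrative and do not correspond to any real ISA. A small loop built from these simple instructions sums the numbers 1 through 5.

```python
# Illustrative sketch (not a real ISA): a tiny interpreter showing how simple
# instructions, executed sequentially with one branch, build a loop that sums 1..5.
def run(program):
    regs = {"r0": 0, "r1": 0, "r2": 0}   # register file
    pc = 0                                # program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "li":                    # load immediate: li rd, value
            regs[args[0]] = args[1]
        elif op == "add":                 # add rd, rs, rt
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "addi":                # add immediate: addi rd, rs, value
            regs[args[0]] = regs[args[1]] + args[2]
        elif op == "blt":                 # branch to target if rs < rt
            if regs[args[0]] < regs[args[1]]:
                pc = args[2]
                continue
        pc += 1                           # default: fall through sequentially
    return regs

prog = [
    ("li", "r0", 0),          # running sum
    ("li", "r1", 1),          # loop counter
    ("li", "r2", 6),          # loop bound
    ("add", "r0", "r0", "r1"),
    ("addi", "r1", "r1", 1),
    ("blt", "r1", "r2", 3),   # loop back to the add while r1 < r2
]
print(run(prog)["r0"])        # -> 15
```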
Instruction types
Some operations
available in most instruction sets include:
- Data
handling and Memory operations
- set a register (a temporary
"scratchpad" location in the CPU itself) to a fixed constant
value
- move data
from a memory location to a register, or vice versa. This is done to
obtain the data to perform a computation on it later, or to store the
result of a computation.
- read and write
data from hardware devices
- Arithmetic and Logic
- add, subtract,
multiply, or divide the values of two registers, placing the
result in a register
- perform bitwise operations, taking the conjunction and disjunction of corresponding bits in
a pair of registers, or the negation of each bit in a register
- compare two
values in registers (for example, to see if one is less, or if they are
equal)
- Control
flow
- branch to another location in the
program and execute instructions there
- conditionally branch to
another location if a certain condition holds
- indirectly branch to another location, but save
the location of the next instruction as a point to return to (a call)
Complex instructions
Some computers
include "complex" instructions in their instruction set. A single
"complex" instruction does something that may take many instructions
on other computers. Such instructions are typified by instructions that take
multiple steps, control multiple functional units, or otherwise appear on a larger
scale than the bulk of simple instructions implemented by the given processor.
Some examples of "complex" instructions include:
- saving
many registers on the stack at once
- moving
large blocks of memory
- complex
and/or floating-point arithmetic (sine, cosine, square
root, etc.)
- performing
an atomic test-and-set instruction
- instructions
that combine an ALU operation with an operand from memory rather than a register
A complex
instruction type that has become particularly popular recently is the SIMD
(Single Instruction, Multiple Data) operation or vector instruction, an
operation that performs the same arithmetic operation on multiple pieces of
data at the same time. SIMD instructions can manipulate
large vectors and matrices in minimal time. SIMD instructions allow easy parallelization
of algorithms commonly involved in sound, image, and video processing. Various
SIMD implementations have been brought to market under trade names such as MMX, 3DNow! and AltiVec.
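As a rough illustration of the SIMD idea, the sketch below contrasts an element-at-a-time loop with a single whole-array operation. NumPy's array arithmetic is used here only as an analogy for a vector instruction, not as a claim about how any particular SIMD extension is exposed.

```python
# One "instruction" (array operation) applies the same arithmetic to many
# data elements at once; NumPy stands in for a vector instruction here.
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])

# Scalar form: one element per operation, as a plain loop would do it.
scalar = [a[i] + b[i] for i in range(len(a))]

# "Vector" form: conceptually a single SIMD add over all lanes.
vector = a + b

print(scalar)   # [11.0, 22.0, 33.0, 44.0]
print(vector)   # [11. 22. 33. 44.]
```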
Parts of an instruction
One instruction
may have several fields, which identify the logical operation to be done, and
may also include source and destination addresses and constant values. An example is
the MIPS "Add" instruction, which allows selection of source and
destination registers and inclusion of a small constant.
On traditional
architectures, an instruction includes an opcode specifying
the operation to be performed, such as "add contents of memory to
register", and zero or more operand specifiers, which may specify registers, memory locations, or literal data.
The operand specifiers may have addressing
modes determining their meaning or may be in fixed fields.
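As a sketch of how such fields are laid out, the following assumes the standard MIPS32 R-type format (opcode, rs, rt, rd, shamt, funct) and encodes and decodes a register-to-register add; the helper names are illustrative.

```python
# Sketch of splitting an instruction word into fields, assuming the standard
# MIPS32 R-type layout: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6).
# For "add rd, rs, rt" the opcode is 0 and the function code is 0x20.
def encode_add(rd, rs, rt):
    return (0 << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (0 << 6) | 0x20

def decode(word):
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs":     (word >> 21) & 0x1F,
        "rt":     (word >> 16) & 0x1F,
        "rd":     (word >> 11) & 0x1F,
        "shamt":  (word >> 6)  & 0x1F,
        "funct":  word & 0x3F,
    }

word = encode_add(rd=8, rs=9, rt=10)   # add $t0, $t1, $t2
print(hex(word), decode(word))
```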
In very long instruction word (VLIW)
architectures, which include many microcode
architectures, multiple simultaneous opcodes and operands are specified in a
single instruction.
Some exotic
instruction sets do not have an opcode field (such as Transport Triggered Architectures
(TTA) or the Forth virtual machine), only operand(s).
Other unusual "0-operand" instruction sets lack any operand
specifier fields, such as some stack
machines including NOSC [1].
Instruction length
The size or
length of an instruction varies widely, from as little as four bits in some microcontrollers
to many hundreds of bits in some VLIW systems. Processors used in personal
computers, mainframes, and supercomputers
have instruction sizes between 8 and 64 bits. Within an instruction set,
different instructions may have different lengths. In some architectures,
notably most Reduced Instruction Set Computers
(RISC), instructions are a fixed length, typically corresponding with that
architecture's word size. In other architectures, instructions
have variable length, typically integral multiples of a byte or a halfword.
Representation
The
instructions constituting a program are rarely specified using their internal,
numeric form; they may be specified by programmers using an assembly
language or, more commonly, may be generated by compilers.
Design
The design of
instruction sets is a complex issue. There have been two broad stages in the history of the
microprocessor. The first was the CISC (Complex Instruction Set Computer), which
had many different instructions. In the 1970s, however, researchers at IBM and elsewhere
found that many instructions in the set could be eliminated. The
result was the RISC (Reduced Instruction Set Computer), an architecture which
uses a smaller set of instructions. A simpler instruction set may offer the
potential for higher speeds, reduced processor size, and reduced power
consumption. However, a more complex set may optimize common operations,
improve memory/cache
efficiency, or simplify programming.
Some
instruction set designers reserve one or more opcodes for some kind of software interrupt. For example, the MOS Technology 6502 uses 00H, the Zilog Z80
uses the eight codes C7, CF, D7, DF, E7, EF, F7, FFH,[1]
while the Motorola 68000 uses codes in the range A000–AFFFH.
Fast virtual
machines are much easier to implement if an instruction set meets the Popek and Goldberg
virtualization requirements.
The NOP slide
used in Immunity Aware Programming is much
easier to implement if the "unprogrammed" state of the memory is
interpreted as a NOP.
On systems with
multiple processors, non-blocking synchronization
algorithms are much easier to implement if the instruction set includes support
for something like "fetch-and-increment" or "load linked/store
conditional (LL/SC)" or "atomic compare
and swap".
Instruction set implementation
Any given
instruction set can be implemented in a variety of ways. All ways of
implementing an instruction set give the same programming
model, and they all are able to run the same binary executables. The
various ways of implementing an instruction set give different tradeoffs
between cost, performance, power consumption, size, etc.
When designing
the microarchitecture of a processor, engineers use
blocks of "hard-wired" electronic circuitry (often designed
separately) such as adders, multiplexers, counters, registers, ALUs etc. Some
kind of register transfer language is then often
used to describe the decoding and sequencing of each instruction of an ISA
using this physical microarchitecture. There are two basic ways to build a control
unit to implement this description (although many designs use middle ways
or compromises):
- Early
computer designs and some of the simpler RISC computers
"hard-wired" the complete instruction set decoding and
sequencing (just like the rest of the microarchitecture).
- Other
designs employ microcode routines and/or tables to do this—typically
as on chip ROMs and/or PLAs (although separate RAMs have been used
historically).
There are also
some new CPU designs which compile the instruction set to a writable RAM or FLASH
inside the CPU (such as the Rekursiv processor and the Imsys
Cjip),[2]
or an FPGA (reconfigurable computing). The Western
Digital MCP-1600
is an older example, using a dedicated, separate ROM for microcode.
An ISA can also
be emulated in
software by an interpreter. Naturally, due to the
interpretation overhead, this is slower than directly running programs on the
emulated hardware, unless the hardware running the emulator is an order of
magnitude faster. Today, it is common practice for vendors of new ISAs or
microarchitectures to make software emulators available to software developers
before the hardware implementation is ready.
Often the
details of the implementation have a strong influence on the particular
instructions selected for the instruction set. For example, many
implementations of the instruction pipeline only allow a single
memory load or memory store per instruction, leading to a load-store architecture (RISC). For another
example, some early ways of implementing the instruction pipeline led to a delay slot.
The demands of
high-speed digital signal processing have pushed in the opposite
direction—forcing instructions to be implemented in a particular way. For
example, in order to perform digital filters fast enough, the MAC instruction
in a typical digital signal processor (DSP) must be
implemented using a kind of Harvard architecture that can fetch an
instruction and two data words simultaneously, and it requires a single-cycle multiply-accumulate multiplier.
Code density
In early
computers, program memory was expensive, so minimizing the size of a program to
make sure it would fit in the limited memory was often central. Thus the
combined size of all the instructions needed to perform a particular task, the code
density, was an important characteristic of any instruction set. Computers
with high code density also often had (and have still) complex instructions for
procedure entry, parameterized returns, loops etc. (therefore retroactively
named Complex Instruction Set Computers, CISC). However, more typical, or
frequent, "CISC" instructions merely combine a basic ALU operation,
such as "add", with the access of one or more operands in memory
(using addressing modes such as direct, indirect, indexed etc.). Certain
architectures may allow two or three operands (including the result) directly
in memory or may be able to perform functions such as automatic pointer increment
etc. Software-implemented instruction sets may have even more complex and
powerful instructions.
Reduced
instruction-set computers, RISC, were first widely
implemented during a period of rapidly-growing memory subsystems and sacrifice
code density in order to simplify implementation circuitry and thereby try to
increase performance via higher clock frequencies and more registers. RISC instructions
typically perform only a single operation, such as an "add" of
registers or a "load" from a memory location into a register; they
also normally use a fixed instruction width, whereas a typical CISC instruction
set has many instructions shorter than this fixed length. Fixed-width
instructions are less complicated to handle than variable-width instructions
for several reasons (not having to check whether an instruction straddles a
cache line or virtual memory page boundary[3]
for instance), and are therefore somewhat easier to optimize for speed.
However, as RISC computers normally require more and often longer instructions
to implement a given task, they inherently make less optimal use of bus
bandwidth and cache memories.
Minimal instruction set computers
(MISC) are a form of stack machine, where there are few separate
instructions (16-64), so that multiple instructions can fit into a single
machine word. These types of cores often take little silicon to implement, so
they can be easily realized in an FPGA or in a multi-core
form. Code density is similar to RISC; the increased instruction density is
offset by requiring more of the primitive instructions to do a task.[citation needed]
There has been research
into executable compression as a mechanism for
improving code density. The mathematics of Kolmogorov complexity describes the
challenges and limits of this.
Number of operands
Instruction
sets may be categorized by the maximum number of operands explicitly
specified in instructions.
(In the
examples that follow, a, b, and c are (direct or
calculated) addresses referring to memory cells, while reg1 and so on
refer to machine registers.)
- 0-operand
(zero address machines), so called stack
machines: All arithmetic operations take place using the top one or
two positions on the stack; 1-operand push and pop instructions are used
to access memory: push a, push b, add, pop c.
- 1-operand
(one address machines), so called accumulator machines, include most early
computers and many small microcontrollers:
Most instructions specify a single explicit right operand (a register, a
memory location, or a constant) with the implicit accumulator as both destination and
left (or only) operand: load a, add b, store
c. A related class is practical stack
machines which often allow a single explicit operand in arithmetic
instructions: push a, add b, pop c.
- 2-operand
— many CISC and RISC machines fall under this category:
- CISC — load
a,reg1; add reg1,b; store reg1,c
- RISC —
Requiring explicit memory loads, the instructions would be: load a,reg1;
load b,reg2; add reg1,reg2; store reg2,c
- 3-operand,
allowing better reuse of data:[3]
- CISC — It
becomes either a single instruction: add a,b,c, or more
typically: move a,reg1; add reg1,b,c as most
machines are limited to two memory operands.
- RISC —
Due to the large number of bits needed to encode three registers, this
scheme is typically not available in RISC processors using small 16-bit
instructions; arithmetic instructions use registers only, so explicit
2-operand load/store instructions are needed: load a,reg1; load
b,reg2; add reg1+reg2->reg3; store reg3,c;
- more
operands—some CISC machines permit a variety of addressing modes that
allow more than 3 operands (registers or memory accesses), such as the VAX "POLY"
polynomial evaluation instruction.
Each
instruction specifies some number of operands (registers, memory locations, or
immediate values) explicitly. Some instructions give one or both
operands implicitly, such as by being stored on top of the stack or in an implicit register. When some
of the operands are given implicitly, the number of specified operands in an
instruction is smaller than the arity of the operation. When a "destination operand"
explicitly specifies the destination, the number of operand specifiers in an
instruction is larger than the arity of the operation. Some instruction sets
have different numbers of operands for different instructions.
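The 0-operand case can be made concrete with a small sketch: a toy stack machine evaluating c = a + b exactly as in the push a, push b, add, pop c example above. Memory is modelled as a Python dict, and the instruction names are illustrative rather than a real ISA.

```python
# Toy 0-operand (stack) machine evaluating c = a + b.
def run_stack_machine(program, memory):
    stack = []
    for op, *args in program:
        if op == "push":                 # push the value at a memory address
            stack.append(memory[args[0]])
        elif op == "add":                # operands come implicitly from the stack
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "pop":                # store the top of stack to memory
            memory[args[0]] = stack.pop()
    return memory

mem = {"a": 2, "b": 3, "c": 0}
print(run_stack_machine([("push", "a"), ("push", "b"), ("add",), ("pop", "c")], mem))
# {'a': 2, 'b': 3, 'c': 5}
```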
Input/Output Base Address
In the x86
architecture, an input/output base address is a base
address of an I/O port. In other words, this is the first address of a
range of consecutive I/O port addresses that a device uses.
Common I/O Base Address Device Assignments in IBM PC compatible computers
This table
lists the common I/O address ranges for device assignments in IBM PC
compatible computers. The base address is the first address in each range. Each row of
the table represents a device or chip within the computer system. For example,
the status port of an LPT device lies at offset 0x0001 from its base address, so adding the base address of
LPT1 (0x0378) gives 0x0379 as the address of the LPT1 status port.
When there are
two or more identical devices in a computer system, each device would be mapped
to a different base address (e.g. LPT1 and LPT2 for printers).
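A small sketch of this base-plus-offset arithmetic follows. The register offsets used (data at +0, status at +1, control at +2 for the PC parallel port) are the conventional ones; actually accessing such ports requires privileged, architecture-specific I/O instructions, so the code only computes addresses.

```python
# Base-address arithmetic for the PC parallel port registers.
LPT_BASES = {"LPT1": 0x0378, "LPT2": 0x0278}
OFFSETS = {"data": 0x0, "status": 0x1, "control": 0x2}

def port_address(device, register):
    return LPT_BASES[device] + OFFSETS[register]

print(hex(port_address("LPT1", "status")))   # 0x379
print(hex(port_address("LPT2", "data")))     # 0x278
```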
I/O address range | Device
00 – 1f | First DMA controller, 8237A-5
20 – 3f | First Programmable Interrupt Controller, 8259A, Master
40 – 5f | Programmable Interval Timer (System Timer), 8254
60 – 6f |
70 – 7f | Real Time Clock, NMI mask
80 – 9f | DMA Page Register, 74LS612
87 | DMA Channel 0
83 | DMA Channel 1
81 | DMA Channel 2
82 | DMA Channel 3
8b | DMA Channel 5
89 | DMA Channel 6
8a | DMA Channel 7
8f | Refresh
a0 – bf | Second Programmable Interrupt Controller, 8259A, Slave
c0 – df | Second DMA controller, 8237A-5
f0 | Clear 80287 Busy
f1 | Reset 80287
f8 – ff |
f0 – f5 | PCjr Disk Controller
f8 – ff | Reserved for future microprocessor extensions
100 – 10f | POS Programmable Option Select (PS/2)
110 – 1ef | System I/O channel
140 – 15f | Secondary SCSI host adapter
170 – 177 | Secondary Parallel ATA Disk Controller
1f0 – 1f7 | Primary Parallel ATA Hard Disk Controller
200 – 20f |
210 – 217 | Expansion Unit
220 – 233 | Sound Blaster and most other sound cards
278 – 27f | LPT2 parallel port
280 – 29f |
2b0 – 2df | Alternate Enhanced Graphics Adapter (EGA) display control
2e8 – 2ef | COM4 serial port
2e1 | GPIB/IEEE-488 Adapter 0
2e2 – 2e3 | Data acquisition
2f8 – 2ff | COM2 serial port
300 – 31f | Prototype Card
300 – 31f | Novell NE1000 compatible Ethernet network interfaces
300 – 31f | AMD Am7990 Ethernet network interface, IRQ 5
320 – 323 | ST-506 and compatible hard disk drive interface
330 – 331 | MPU-401 UART on most sound cards
340 – 35f | Primary SCSI host adapter
370 – 377 | Secondary floppy disk drive controller
378 – 37f | LPT1 parallel port
380 – 38c | Secondary Binary Synchronous Data Link Control (SDLC) adapter
388 – 389 | AdLib Music Synthesizer Card
3a0 – 3a9 | Primary Binary Synchronous Data Link Control (SDLC) adapter
3b0 – 3bb | Monochrome Display Adapter (MDA) display control
3bc – 3bf | MDA LPT parallel port
3c0 – 3cf | Enhanced Graphics Adapter (EGA) display control
3d0 – 3df | Color Graphics Adapter (CGA)
3e8 – 3ef | COM3 serial port
3f0 – 3f7 | Primary floppy disk drive controller; Primary IDE controller, slave drive (3f6 – 3f7)
3f8 – 3ff | COM1 serial port
cf8 – cfc |
Note: For many
devices listed above the assignments can be changed via jumpers, DIP switches,
or Plug and Play configuration.
IP address
An Internet Protocol address
(IP address) is a numerical label assigned to each device (e.g.,
computer, printer) participating in a computer
network that uses the Internet
Protocol for communication.[1]
An IP address serves two principal functions: host or network interface identification and location addressing.
Its role has been characterized as follows: "A name indicates
what we seek. An address indicates where it is. A route indicates how to get
there."[2]
The designers of the Internet
Protocol defined an IP address as a 32-bit number[1]
and this system, known as Internet Protocol Version 4 (IPv4), is still in use today.
However, due to the enormous growth of the Internet and
the predicted depletion of available addresses, a new addressing system (IPv6), using 128 bits
for the address, was developed in 1995,[3]
standardized as RFC 2460 in
1998,[4]
and has been deployed worldwide since the mid-2000s.
IP addresses are binary
numbers, but they are usually stored in text files and displayed in human-readable
notations, such as 172.16.254.1 (for IPv4), and 2001:db8:0:1234:0:567:8:1 (for IPv6).
The Internet Assigned Numbers Authority
(IANA) manages the IP address space allocations globally and delegates five regional Internet registries (RIRs) to
allocate IP address blocks to local Internet registries (Internet service providers) and other
entities.
IP versions
Two versions of the Internet
Protocol (IP) are in use: IP Version 4 and IP Version 6. (See IP version history for details.) Each version
defines an IP address differently. Because of its prevalence, the generic term IP
address typically still refers to the addresses defined by IPv4.
IP version 4 addresses
Main article: IPv4#Addressing
Decomposition of
an IPv4 address from dot-decimal notation to its binary value.
In IPv4 an address consists of 32
bits which limits the address
space to 4294967296 (2^32) possible unique addresses. IPv4
reserves some addresses for special purposes such as private
networks (~18 million addresses) or multicast
addresses (~270 million addresses).
IPv4 addresses are canonically
represented in dot-decimal notation, which consists of four
decimal numbers, each ranging from 0 to 255, separated by dots, e.g.,
172.16.254.1. Each part represents a group of 8 bits (octet) of the address. In
some cases of technical writing, IPv4 addresses may be presented in various hexadecimal,
octal, or binary representations.
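The sketch below shows the relationship between the dot-decimal text form and the underlying 32-bit value, using only the Python standard library.

```python
# Dot-decimal notation versus the underlying 32-bit value.
import ipaddress

addr = ipaddress.IPv4Address("172.16.254.1")
print(int(addr))             # 2886794753, the 32-bit value
print(f"{int(addr):032b}")   # 10101100000100001111111000000001

# The same conversion done by hand, one octet (8 bits) at a time:
octets = [172, 16, 254, 1]
value = 0
for octet in octets:
    value = (value << 8) | octet
print(value == int(addr))    # True
```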
IPv4 subnetting
In the early stages of
development of the Internet Protocol,[1]
network administrators interpreted an IP address in two parts: network number
portion and host number portion. The highest order octet (most significant
eight bits) in an address was designated as the network number and the
remaining bits were called the rest field or host identifier and
were used for host numbering within a network.
This early method soon proved
inadequate as additional networks developed that were independent of the
existing networks already designated by a network number. In 1981, the Internet
addressing specification was revised with the introduction of classful
network architecture.[2]
Classful network design allowed
for a larger number of individual network assignments and fine-grained subnetwork
design. The first three bits of the most significant octet of an IP address
were defined as the class of the address. Three classes (A, B,
and C) were defined for universal unicast
addressing. Depending on the class derived, the network identification was
based on octet boundary segments of the entire address. Each class used
successively additional octets in the network identifier, thus reducing the
possible number of hosts in the higher order classes (B and C).
The following table gives an overview of this now obsolete system.
Historical classful network architecture:

Class | Leading bits | Range of first octet | Network ID | Host ID | Number of networks | Number of addresses
A | 0 | 0 - 127 | a | b.c.d | 2^7 = 128 | 2^24 = 16777216
B | 10 | 128 - 191 | a.b | c.d | 2^14 = 16384 | 2^16 = 65536
C | 110 | 192 - 223 | a.b.c | d | 2^21 = 2097152 | 2^8 = 256
Classful network design served
its purpose in the startup stage of the Internet, but it lacked scalability
in the face of the rapid expansion of the network in the 1990s. The class
system of the address space was replaced with Classless Inter-Domain Routing
(CIDR) in 1993. CIDR is based on variable-length subnet masking (VLSM) to allow
allocation and routing based on arbitrary-length prefixes.
Today, remnants of classful
network concepts function only in a limited scope as the default configuration
parameters of some network software and hardware components (e.g. netmask), and
in the technical jargon used in network administrators' discussions.
IPv4 private addresses
Early network design, when global
end-to-end connectivity was envisioned for communications with all Internet
hosts, intended that IP addresses be uniquely assigned to a particular computer
or device. However, it was found that this was not always necessary as private
networks developed and public address space needed to be conserved.
Computers not connected to the
Internet, such as factory machines that communicate only with each other via
TCP/IP, need not have globally-unique IP addresses. Three ranges of IPv4
addresses for private networks were reserved in RFC 1918. These addresses are not
routed on the Internet and thus their use need not be coordinated with an IP
address registry.
Today, when needed, such private
networks typically connect to the Internet through network address translation (NAT).
IANA-reserved private IPv4 network ranges:

Block | Start | End | No. of addresses
24-bit Block (/8 prefix, 1 × A) | 10.0.0.0 | 10.255.255.255 | 16777216
20-bit Block (/12 prefix, 16 × B) | 172.16.0.0 | 172.31.255.255 | 1048576
16-bit Block (/16 prefix, 256 × C) | 192.168.0.0 | 192.168.255.255 | 65536
Any user may use any of the
reserved blocks. Typically, a network administrator will divide a block into subnets;
for example, many home routers automatically use a default
address range of 192.168.0.0 - 192.168.0.255 (192.168.0.0/24).
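As a sketch, the following checks whether an address falls in one of the RFC 1918 blocks listed above, using the standard library's ipaddress module.

```python
# Checking membership in the RFC 1918 private address blocks.
import ipaddress

PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(address):
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)

print(is_rfc1918("192.168.0.42"))   # True
print(is_rfc1918("8.8.8.8"))        # False
# The module also knows this directly:
print(ipaddress.ip_address("10.1.2.3").is_private)   # True
```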
IPv4 address exhaustion
Main article: IPv4 address exhaustion
The IP version 4 address space is
rapidly nearing exhaustion of available and assignable address blocks (as of 27
January 2011).[5][6]
IP version 6 addresses
Main article: IPv6
address
Decomposition of
an IPv6 address from hexadecimal representation to its binary value.
The rapid exhaustion of IPv4
address space, despite conservation techniques, prompted the Internet Engineering Task Force
(IETF) to explore new technologies to expand the Internet's addressing
capability. The permanent solution was deemed to be a redesign of the Internet
Protocol itself. This next generation of the Internet Protocol, intended to
replace IPv4 on the Internet, was eventually named Internet Protocol Version 6
(IPv6) in 1995.[3][4]
The address size was increased from 32 to 128 bits or 16 octets.
This, even with a generous assignment of network blocks, is deemed sufficient
for the foreseeable future. Mathematically, the new address space provides the
potential for a maximum of 2^128, or about 3.403×10^38,
unique addresses.
The new design is not intended to
provide a sufficient quantity of addresses on its own, but rather to allow
efficient aggregation of subnet routing prefixes to occur at routing nodes. As
a result, routing table sizes are smaller, and the smallest possible individual
allocation is a subnet for 2^64 hosts, which is the square of the
size of the entire IPv4 Internet. At these levels, actual address utilization
rates will be small on any IPv6 network segment. The new design also provides
the opportunity to separate the addressing infrastructure of a network segment
— that is the local administration of the segment's available space — from the
addressing prefix used to route external traffic for a network. IPv6 has
facilities that automatically change the routing prefix of entire networks,
should the global connectivity or the routing policy change, without requiring
internal redesign or renumbering.
The large number of IPv6
addresses allows large blocks to be assigned for specific purposes and, where
appropriate, to be aggregated for efficient routing. With a large address
space, there is not the need to have complex address conservation methods as
used in Classless Inter-Domain Routing
(CIDR).
Many modern desktop and
enterprise server operating systems include native support for the IPv6
protocol, but it is not yet widely deployed in other devices, such as home
networking routers, voice over IP (VoIP) and multimedia equipment, and
network peripherals.
IPv6 private addresses
Just as IPv4 reserves addresses
for private or internal networks, blocks of addresses are set aside in IPv6 for
private addresses. In IPv6, these are referred to as unique local addresses (ULA). RFC 4193 sets aside the routing
prefix fc00::/7 for this block, which is divided into two /8 blocks with
different implied policies (cf. IPv6). The addresses include a 40-bit pseudorandom number that
minimizes the risk of address collisions if sites merge or packets are
misrouted.
Early designs (RFC 3513) used a different block
for this purpose (fec0::), dubbed site-local addresses. However, the definition
of what constituted sites remained unclear and the poorly defined
addressing policy created ambiguities for routing. The address range
specification was abandoned and must not be used in new systems.
Addresses starting with fe80:,
called link-local addresses, are assigned to interfaces
for communication on the link only. The addresses are usually automatically
generated by the operating system for each network interface. This provides
instant automatic network connectivity for any IPv6 host and means that if
several hosts connect to a common hub or switch, they have an instant
communication path via their link-local IPv6 address. This feature is used
extensively, and invisibly to most users, in the lower layers of IPv6 network
administration (cf. Neighbor Discovery Protocol).
None of the private address
prefixes may be routed in the public Internet.
IP subnetworks
IP networks may be divided into subnetworks
in both IPv4 and IPv6. For this purpose, an IP address is logically recognized
as consisting of two parts: the network prefix and the host
identifier, or interface identifier (IPv6). The subnet mask or the CIDR prefix determines
how the IP address is divided into network and host parts.
The term subnet mask is
only used within IPv4. Both IP versions however use the Classless Inter-Domain Routing
(CIDR) concept and notation. In this, the IP address is followed by a slash and
the number (in decimal) of bits used for the network part, also called the routing
prefix. For example, an IPv4 address and its subnet mask may be 192.0.2.1
and 255.255.255.0, respectively. The CIDR
notation for the same IP address and subnet is 192.0.2.1/24, because the
first 24 bits of the IP address indicate the network and subnet.
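A short sketch of that split, using the same 192.0.2.1/24 example and the standard library's ipaddress module:

```python
# Splitting 192.0.2.1/24 into its network and host parts.
import ipaddress

iface = ipaddress.ip_interface("192.0.2.1/24")
print(iface.network)    # 192.0.2.0/24  (the network prefix)
print(iface.netmask)    # 255.255.255.0 (the same prefix expressed as a mask)
print(int(iface.ip) & ~int(iface.netmask) & 0xFFFFFFFF)   # 1, the host part
```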
IP address assignment
Internet Protocol addresses are
assigned to a host either anew at the time of booting, or permanently by fixed
configuration of its hardware or software. Persistent configuration is also
known as using a static IP address. In contrast, in situations when the
computer's IP address is assigned newly each time, this is known as using a dynamic
IP address.
Methods
Static IP addresses are manually
assigned to a computer by an administrator. The exact procedure varies
according to platform. This contrasts with dynamic IP addresses, which are
assigned either by the computer interface or host software itself, as in Zeroconf, or
assigned by a server using Dynamic Host Configuration Protocol
(DHCP). Even though IP addresses assigned using DHCP may stay the same for long
periods of time, they can generally change. In some cases, a network
administrator may implement dynamically assigned static IP addresses. In this
case, a DHCP server is used, but it is specifically configured to always assign
the same IP address to a particular computer. This allows static IP addresses
to be configured centrally, without having to specifically configure each
computer on the network in a manual procedure.
In the absence or failure of
static or stateful (DHCP) address configurations, an operating system may
assign an IP address to a network interface using state-less auto-configuration
methods, such as Zeroconf.
Uses of dynamic addressing
Dynamic IP addresses are most
frequently assigned on LANs and broadband networks by Dynamic Host Configuration Protocol
(DHCP) servers. They are used because this avoids the administrative burden of
assigning specific static addresses to each device on a network. It also allows
many devices to share limited address space on a network if only some of them
will be online at a particular time. In most current desktop operating systems,
dynamic IP configuration is enabled by default so that a user does not need to
manually enter any settings to connect to a network with a DHCP server. DHCP is
not the only technology used to assign dynamic IP addresses. Dialup and some
broadband networks use dynamic address features of the Point-to-Point Protocol.
Sticky dynamic IP address
A sticky dynamic IP address
is an informal term used by cable and DSL Internet access subscribers to
describe a dynamically assigned IP address that seldom changes. The addresses
are usually assigned with the DHCP protocol. Since the modems are usually
powered-on for extended periods of time, the address leases are usually set to
long periods and simply renewed upon expiration. If a modem is turned off and
powered up again before the next expiration of the address lease, it will most
likely receive the same IP address.
Address autoconfiguration
RFC 3330 defines an address
block, 169.254.0.0/16, for the special use of link-local addressing for IPv4
networks. In IPv6,
every interface, whether using static or dynamic address assignments, also
receives a link-local address automatically in the fe80::/10 subnet.
These addresses are only valid on
the link, such as a local network segment or point-to-point connection, that a
host is connected to. These addresses are not routable and like private
addresses cannot be the source or destination of packets traversing the
Internet.
When the link-local IPv4 address
block was reserved, no standards existed for mechanisms of address
autoconfiguration. Filling the void, Microsoft
created an implementation that is called Automatic Private IP Addressing (APIPA). Due to
Microsoft's market power, APIPA has been deployed on millions of machines and
has, thus, become a de facto standard in the industry. Many years later, the IETF defined a formal
standard for this functionality, RFC
3927, entitled Dynamic Configuration of IPv4 Link-Local Addresses.
Uses of static addressing
Some infrastructure situations
have to use static addressing, such as when finding the Domain Name System (DNS) host that will translate
domain
names to IP addresses. Static addresses are also convenient, but not
absolutely necessary, to locate servers inside an enterprise. An address
obtained from a DNS server comes with a time to
live, or caching time, after which it should be looked up again
to confirm that it has not changed. Even static IP addresses do change as a
result of network administration (RFC
2072).
Public addresses
A "public" IP address
in common parlance is synonymous with a static, routable IP address, and the
term does not include dynamic IP addresses, even if they are routable.[citation needed]
Both IPv4 and IPv6 also define
address ranges that are reserved for private
networks (see above), for link-local addressing, and for other purposes.
Modifications to IP addressing
IP blocking and firewalls
Firewalls perform Internet
Protocol blocking to protect networks from unauthorized access. They are
common on today's
Internet. They control access to networks based on the IP address of a client
computer. Whether using a blacklist or a whitelist,
the IP address that is blocked is the perceived IP address of the client,
meaning that if the client is using a proxy
server or network address translation, blocking
one IP address may block many individual computers.
IP address translation
Multiple client devices can
appear to share IP addresses: either because they are part of a shared
hosting web
server environment or because an IPv4 network address translator (NAT) or proxy
server acts as an intermediary agent on behalf of its customers, in which
case the real originating IP addresses might be hidden from the server
receiving a request.
A common practice is to have a NAT hide a large number of IP addresses in a private
network. Only the "outside" interface(s) of the NAT need to have
Internet-routable addresses.[7]
Most commonly, the NAT device maps
TCP or UDP port numbers on the outside to individual private addresses on the
inside. Just as a telephone number may have site-specific extensions, the port
numbers are site-specific extensions to an IP address.
In small home networks, NAT
functions usually take place in a residential gateway device, typically one
marketed as a "router". In this scenario, the computers connected to
the router would have 'private' IP addresses and the router would have a
'public' address to communicate with the Internet. This type of router allows
several computers to share one public IP address.
Interrupt
In computing, an
interrupt is an asynchronous signal indicating the need
for attention or a synchronous event in software indicating the need for a
change in execution.
A hardware
interrupt causes the processor to save its state of execution
and begin execution of an interrupt
handler.
Software
interrupts are usually implemented as instructions in the instruction
set, which cause a context switch to an interrupt handler similar to a
hardware interrupt.
Interrupts are
a commonly used technique for computer multitasking, especially in real-time computing. Such a system is said to
be interrupt-driven.[1]
An act of interrupting
is referred to as an interrupt request (IRQ).
Overview
Hardware
interrupts were introduced as a way to avoid wasting the processor's valuable
time in polling loops, waiting for external
events. They may be implemented in hardware as a distinct system with control
lines, or they may be integrated into the memory subsystem.
If implemented
in hardware, an interrupt controller circuit such as the IBM PC's Programmable Interrupt Controller
(PIC) may be connected between the interrupting device and the processor's
interrupt pin to multiplex several sources of interrupt onto the one or two CPU
lines typically available. If implemented as part of the memory
controller, interrupts are mapped into the system's memory address
space.
Interrupts can
be categorized into: maskable interrupt, non-maskable interrupt (NMI), inter-processor interrupt (IPI), software
interrupt, and spurious interrupt.
- Maskable
interrupt (IRQ) is a hardware interrupt that may be ignored
by setting a bit in an interrupt mask register's (IMR)
bit-mask.
- Non-maskable interrupt (NMI) is
a hardware interrupt that lacks an associated bit-mask, so that it can
never be ignored. NMIs are often used for timers, especially watchdog
timers.
- Inter-processor interrupt (IPI) is
a special case of interrupt that is generated by one processor to
interrupt another processor in a multiprocessor
system.
- Software
interrupt is an interrupt generated within a processor by
executing an instruction. Software interrupts are often used to implement system
calls because they implement a subroutine call with a CPU ring level change.
- Spurious
interrupt is a hardware interrupt that is unwanted. Such interrupts are
typically generated by system conditions such as electrical interference on an
interrupt line or through incorrectly designed hardware.
Processors
typically have an internal interrupt mask which allows software to
ignore all external hardware interrupts while it is set. This mask may offer
faster access than accessing an interrupt mask register (IMR) in a PIC, or
disabling interrupts in the device itself. In some cases, such as the x86 architecture,
disabling and enabling interrupts on the processor itself act as a memory
barrier; however, it may actually be slower.
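A minimal sketch of how such a bit-mask style interrupt mask register (IMR) is typically manipulated follows; the register width and line numbering are illustrative, not those of any particular controller.

```python
# Each bit of the IMR masks one interrupt line: setting a bit makes the
# corresponding interrupt ignored, clearing it re-enables the line.
imr = 0b00000000          # no interrupts masked

def mask_irq(imr, line):
    return imr | (1 << line)

def unmask_irq(imr, line):
    return imr & ~(1 << line)

def is_masked(imr, line):
    return bool(imr & (1 << line))

imr = mask_irq(imr, 3)    # mask IRQ 3
print(f"{imr:08b}", is_masked(imr, 3))   # 00001000 True
imr = unmask_irq(imr, 3)
print(f"{imr:08b}", is_masked(imr, 3))   # 00000000 False
```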
An interrupt
that leaves the machine in a well-defined state is called a precise
interrupt. Such an interrupt has four properties:
- The
Program Counter (PC) is saved in a known place.
- All
instructions before the one pointed to by the PC have fully executed.
- No
instruction beyond the one pointed to by the PC has been executed (that is,
instructions beyond that one may have started executing, but any
changes they make to registers or memory must be undone before the
interrupt happens).
- The
execution state of the instruction pointed to by the PC is known.
An interrupt
that does not meet these requirements is called an imprecise interrupt.
The phenomenon
where the overall system performance is severely hindered by excessive amounts
of processing time spent handling interrupts is called an interrupt
storm.
Types of Interrupts
Level-triggered
A level-triggered
interrupt is a class of interrupts where the presence of an unserviced
interrupt is indicated by a high level (1), or low level (0), of the interrupt
request line. A device wishing to signal an interrupt drives the line to its
active level, and then holds it at that level until serviced. It ceases
asserting the line when the CPU commands it to or otherwise handles the
condition that caused it to signal the interrupt.
Typically, the
processor samples the interrupt input at predefined times during each bus cycle
such as state T2 for the Z80
microprocessor. If the interrupt isn't active when the processor samples it,
the CPU doesn't see it. One possible use for this type of interrupt is to
minimize spurious signals from a noisy interrupt line: a spurious pulse will
often be so short that it is not noticed.
Multiple
devices may share a level-triggered interrupt line if they are designed to. The
interrupt line must have a pull-down or pull-up resistor so that when not
actively driven it settles to its inactive state. Devices actively assert the
line to indicate an outstanding interrupt, but let the line float (do not
actively drive it) when not signalling an interrupt. The line is then in its
asserted state when any (one or more than one) of the sharing devices is
signalling an outstanding interrupt.
This class of
interrupts is favored by some because of a convenient behavior when the line is
shared. Upon detecting assertion of the interrupt line, the CPU must search
through the devices sharing it until one requiring service is detected. After
servicing this device, the CPU may recheck the interrupt line status to
determine whether any other devices also need service. If the line is now
de-asserted, the CPU avoids checking the remaining devices on the line. Since
some devices interrupt more frequently than others, and other device interrupts
are particularly expensive, a careful ordering of device checks is employed to
increase efficiency.
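The servicing loop described above can be sketched as follows; the Device objects and the line_asserted callback are hypothetical stand-ins for the hardware.

```python
# Conceptual sketch of servicing a shared level-triggered interrupt line:
# while the line is asserted, poll the sharing devices in a chosen priority
# order and service whichever one is requesting attention.
def handle_shared_level_interrupt(devices, line_asserted):
    while line_asserted():                 # line stays asserted while any device waits
        for dev in devices:                # order chosen to favor frequent/expensive devices
            if dev.needs_service():
                dev.service()              # device de-asserts its request when done
                break
        else:
            break                          # asserted but nobody claims it: spurious
```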
There are also
serious problems with sharing level-triggered interrupts. As long as any device
on the line has an outstanding request for service the line remains asserted,
so it is not possible to detect a change in the status of any other device.
Deferring servicing a low-priority device is not an option, because this would
prevent detection of service requests from higher-priority devices. If there is
a device on the line that the CPU does not know how to service, then any
interrupt from that device permanently blocks all interrupts from the other devices.
The original PCI standard mandated shareable
level-triggered interrupts. The rationale for this was the efficiency gain
discussed above. (Newer versions of PCI allow, and PCI Express
requires the use of message-signalled interrupts.)
Edge-triggered
An edge-triggered
interrupt is a class of interrupts that are signalled by a level transition
on the interrupt line, either a falling
edge (1 to 0) or a rising edge (0 to 1). A device wishing to signal an
interrupt drives a pulse onto the line and then releases the line to its
quiescent state. If the pulse is too short to be detected by polled I/O
then special hardware may be required to detect the edge.
Multiple
devices may share an edge-triggered interrupt line if they are designed to. The
interrupt line must have a pull-down or pull-up resistor so that when not
actively driven it settles to one particular state. Devices signal an interrupt
by briefly driving the line to its non-default state, and let the line float
(do not actively drive it) when not signalling an interrupt. This type of
connection is also referred to as open
collector. The line then carries all the pulses generated by all the
devices. (This is analogous to the pull cord on some buses and trolleys that
any passenger can pull to signal the driver that they are requesting a stop.)
However, interrupt pulses from different devices may merge if they occur close
in time. To avoid losing interrupts the CPU must trigger on the trailing edge
of the pulse (e.g. the rising edge if the line is pulled up and driven low).
After detecting an interrupt the CPU must check all the devices for service
requirements.
Edge-triggered
interrupts do not suffer the problems that level-triggered interrupts have with
sharing. Service of a low-priority device can be postponed arbitrarily, and
interrupts will continue to be received from the high-priority devices that are
being serviced. If there is a device that the CPU does not know how to service,
it may cause a spurious interrupt, or even periodic spurious interrupts, but it
does not interfere with the interrupt signalling of the other devices. However,
it is fairly easy for an edge triggered interrupt to be missed - for example if
interrupts have to be masked for a period - and unless there is some type of
hardware latch that records the event it is impossible to recover. Such
problems caused many "lockups" in early computer hardware because the
processor did not know it was expected to do something. More modern hardware
often has one or more interrupt status registers that latch the interrupt
requests; well written edge-driven interrupt software often checks such
registers to ensure events are not missed.
The elderly Industry Standard Architecture (ISA)
bus uses edge-triggered interrupts, but does not mandate that devices be able
to share them. The parallel port also uses edge-triggered interrupts.
Many older devices assume that they have exclusive use of their interrupt line,
making it electrically unsafe to share them. However, ISA motherboards include
pull-up resistors on the IRQ lines, so well-behaved devices share ISA
interrupts just fine.
Hybrid
Some systems
use a hybrid of level-triggered and edge-triggered signalling. The hardware not
only looks for an edge, but it also verifies that the interrupt signal stays
active for a certain period of time.
A common use of
a hybrid interrupt is for the NMI (non-maskable interrupt) input. Because NMIs
generally signal major – or even catastrophic – system events, a good
implementation of this signal tries to ensure that the interrupt is valid by
verifying that it remains active for a period of time. This 2-step approach
helps to eliminate false interrupts from affecting the system.
Message-signaled
Main article: Message Signaled Interrupts
A message-signalled
interrupt does not use a physical interrupt line. Instead, a device signals
its request for service by sending a short message over some communications
medium, typically a computer bus. The message might be of a type reserved
for interrupts, or it might be of some pre-existing type such as a memory
write.
Message-signalled
interrupts behave very much like edge-triggered interrupts, in that the
interrupt is a momentary signal rather than a continuous condition.
Interrupt-handling software treats the two in much the same manner. Typically,
multiple pending message-signalled interrupts with the same message (the same
virtual interrupt line) are allowed to merge, just as closely-spaced
edge-triggered interrupts can merge.
Message-signalled
interrupt vectors can be shared, to the extent that the underlying
communication medium can be shared. No additional effort is required.
Because the
identity of the interrupt is indicated by a pattern of data bits, not requiring
a separate physical conductor, many more distinct interrupts can be efficiently
handled. This reduces the need for sharing. Interrupt messages can also be
passed over a serial bus, not requiring any additional lines.
PCI Express,
a serial computer bus, uses message-signalled interrupts
exclusively.
Doorbell
In a push button
analogy applied to computer systems, the term doorbell or doorbell
interrupt is often used to describe a mechanism whereby a software system
can signal or notify a hardware device that there is some work to be done.
Typically, the software system will place data in some well known and mutually
agreed upon memory location(s), and "ring the doorbell" by writing to
a different memory location. This different memory location is often called the
doorbell region, and there may even be multiple doorbells serving different
purposes in this region. It's this act of writing to the doorbell region of
memory that "rings the bell" and notifies the hardware device that
the data is ready and waiting. The hardware device would now know that the data
is valid and can be acted upon. It would typically write the data to a hard
disk drive, or send it over a network,
or encrypt it,
etc.
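A conceptual sketch of the doorbell pattern follows; the shared buffer, region layout, and submit_work helper are all illustrative stand-ins for the mutually agreed memory locations, not any real device interface.

```python
# Toy model of the doorbell pattern: place the work description in an agreed
# region, then write to the "doorbell" location to notify the device.
DATA_REGION = slice(0, 64)     # where the driver places the work description
DOORBELL = 64                  # the location whose write "rings the bell"

shared_memory = bytearray(65)

def submit_work(payload: bytes):
    shared_memory[DATA_REGION] = payload.ljust(64, b"\x00")  # 1. place the data
    shared_memory[DOORBELL] = 1                              # 2. ring the doorbell

# On real hardware, step 2 would be a store to a device register (or to a
# polled/write-through region), which may in turn raise a real interrupt on
# the device's own processor.
submit_work(b"encrypt block 7")
```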
The term doorbell
interrupt is usually a misnomer. It's similar to an interrupt because it causes
some work to be done by the device, however the doorbell region is sometimes
implemented as a polled region, sometimes the doorbell
region writes through to physical device registers,
and sometimes the doorbell region is hardwired directly to physical device
registers. When either writing through or directly to physical device
registers, this may, but not necessarily, cause a real interrupt to occur at
the device's central processor unit (CPU), if it has one.
Doorbell
interrupts can be compared to Message Signaled Interrupts, as they
have some similarities.
Difficulty with sharing interrupt lines
Multiple devices
sharing an interrupt line (of any triggering style) all act as spurious
interrupt sources with respect to each other. With many devices on one line the
workload in servicing interrupts grows in proportion to the square of the
number of devices. It is therefore preferred to spread devices evenly across
the available interrupt lines. Shortage of interrupt lines is a problem in
older system designs where the interrupt lines are distinct physical
conductors. Message-signalled interrupts, where the interrupt line is virtual,
are favoured in new system architectures (such as PCI Express)
and relieve this problem to a considerable extent.
Some devices
with a badly-designed programming interface provide no way to determine whether
they have requested service. They may lock up or otherwise misbehave if
serviced when they do not want it. Such devices cannot tolerate spurious
interrupts, and so also cannot tolerate sharing an interrupt line. ISA cards, due to often cheap design
and construction, are notorious for this problem. Such devices are becoming
much rarer, as hardware logic becomes cheaper and new system architectures
mandate shareable interrupts.
Performance issues
Interrupts
provide low overhead and good latency at low offered load, but degrade
significantly at high interrupt rate unless care is taken to prevent several
pathologies. These are various forms of livelocks, when
the system spends all of its time processing interrupts, to the exclusion of
other required tasks. Under extreme conditions, a large number of interrupts
(like very high network traffic) may completely stall the system. To avoid such
problems, an operating system must schedule network interrupt handling as
carefully as it schedules process execution.[2]
Typical uses
Typical uses of
interrupts include the following: system timers, disk I/O, power-off signals,
and traps. Other interrupts exist to transfer data
bytes using UARTs or
Ethernet;
sense key-presses; control motors; or anything else the equipment must do.
A classic
system timer
generates interrupts periodically from a counter or the power-line. The
interrupt handler counts the interrupts to keep time. The timer interrupt may
also be used by the OS's task scheduler to reschedule the priorities of
running processes. Counters are popular, but some older
computers used the power line frequency instead, because power companies in
most Western countries control the power-line frequency with a very accurate atomic
clock.[citation needed]
A disk
interrupt signals the completion of a data transfer from or to the disk
peripheral. A process waiting to read or write a file starts up again.
A power-off
interrupt predicts or requests a loss of power. It allows the computer
equipment to perform an orderly shut-down.
Interrupts are
also used in typeahead
features for buffering events like keystrokes.
Programmable Interrupt Controller
In computing, a programmable
interrupt controller (PIC) is a device that is used to combine
several sources of interrupt onto one or more CPU lines, while allowing
priority levels to be assigned to its interrupt outputs. When the device has
multiple interrupt outputs to assert, it will assert them in the order of their
relative priority. Common modes of a PIC include hard priorities, rotating
priorities, and cascading priorities. PICs often allow the cascading of their
outputs to inputs between each other.
Common features
PICs typically
have a common set of registers: Interrupt Request Register (IRR), In-Service
Register (ISR), Interrupt Mask Register (IMR). The IRR specifies which
interrupts are pending acknowledgement, and is typically a symbolic register
which cannot be directly accessed. The ISR register specifies which interrupts
have been acknowledged, but are still waiting for an End
Of Interrupt (EOI). The IMR specifies which interrupts are to be ignored
and not acknowledged. A simple register schema such as this allows up to two
distinct interrupt requests to be outstanding at one time, one waiting for
acknowledgement, and one waiting for EOI.
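The register flow just described can be sketched as a toy model; this is illustrative only and does not model any particular chip such as the 8259A.

```python
# A request sets a bit in the IRR; acknowledging it moves the bit to the ISR;
# an End Of Interrupt clears it from the ISR. Bits set in the IMR stay pending.
class SimplePIC:
    def __init__(self):
        self.irr = 0   # pending, not yet acknowledged
        self.isr = 0   # acknowledged, awaiting End Of Interrupt
        self.imr = 0   # masked lines

    def request(self, line):
        self.irr |= (1 << line)

    def acknowledge(self):
        pending = self.irr & ~self.imr
        if not pending:
            return None
        line = (pending & -pending).bit_length() - 1   # lowest line number wins here
        self.irr &= ~(1 << line)
        self.isr |= (1 << line)
        return line

    def end_of_interrupt(self, line):
        self.isr &= ~(1 << line)

pic = SimplePIC()
pic.imr = 0b0100              # mask line 2
pic.request(2); pic.request(5)
line = pic.acknowledge()      # -> 5, since line 2 is masked
pic.end_of_interrupt(line)
print(line, bin(pic.irr), bin(pic.isr))   # 5 0b100 0b0
```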
There are a
number of common priority schemas in PICs including hard priorities, specific
priorities, and rotating priorities.
Interrupts may
be either edge triggered or level
triggered.
There are a
number of common ways of acknowledging an interrupt has completed when an EOI
is issued. These include specifying which interrupt completed, using an implied
interrupt which has completed (usually the highest priority pending in the
ISR), and treating interrupt acknowledgement as the EOI.
Well-known types
One of the best
known PICs, the 8259A, was included in the x86
PC. In modern times, this is not included as a separate chip in an x86 PC.
Rather, its function is included as part of the motherboard's southbridge chipset. In other cases, it has
been replaced by the newer Advanced Programmable
Interrupt Controllers which support more interrupt outputs and more
flexible priority schemas.
Channel I/O
In computer
science, channel I/O is a generic term that refers to a
high-performance input/output (I/O) architecture that is implemented in
various forms on a number of computer architectures, especially on mainframe computers. In the past they were
generally implemented with a custom processor, known alternately as peripheral
processor, I/O processor, I/O controller, or DMA
controller.
Basic principles
Many I/O tasks
can be complex and require logic to be applied to the data to convert formats
and other similar duties. In these situations, the simplest solution is to ask
the CPU to handle the
logic, but because I/O devices are relatively slow, a CPU could waste time (from
the computer's perspective) waiting for the data from the device. This situation is
called 'I/O bound'.
Channel
architecture avoids this problem by using a separate, independent, low-cost
processor. Channel processors are simple, but self-contained, with minimal
logic and sufficient on-board scratchpad memory (working storage) to handle I/O
tasks. They are typically not powerful or flexible enough to be used as a
computer on their own and can be construed as a form of coprocessor.
An attached CPU
sends small channel programs to the controller to handle I/O
tasks, which the channel controller can normally complete without further
intervention from the CPU.
When I/O
transfer is complete or an error is detected, the channel controller
communicates with the CPU using an interrupt.
Since the channel controller has direct access to the main memory, it is also
often referred to as DMA controller (where DMA stands for direct memory access), although that term is
looser in definition and is often applied to non-programmable devices as well.
History
The first use
of channel I/O was with the IBM 709 [1]
vacuum tube mainframe, whose Model 766 Data Synchronizer was the first channel
controller, in 1957. Its transistorized successor, the IBM 7090, [2]
had two or more channels (the 7607) and a channel multiplexor (the 7606) which
could control up to eight channels.
Later, for
larger IBM System/360 computers, and even for early System/370
models, the selector channels and the multiplexor channels still were bulky and
expensive separate processors, such as the IBM 2860
'selector channel' and the IBM 2870
'multiplexer channel'. For the smaller System/360 computers, multiplexor
channels were implemented in the CPU's microcode.
Later, the channels were implemented in onboard processors residing in the same
box as the CPU.
One of the
earliest non-IBM channel systems was hosted in the CDC 6600 supercomputer
in 1965. The CDC utilized 10 logically independent computers called peripheral
processors, or PPs, for this role. The PPs were powerful processors in their own
right, modernized versions of CDC's first 'personal computer', the CDC 160A.
The operating system resided and executed in the primary peripheral processor,
PP0. Since then, channel controllers have been a standard
part of most mainframe designs and a primary advantage mainframes have over
smaller, faster, personal computers and network computing.
Channel
controllers have also been made as small as single-chip designs with multiple
channels on them, used in the NeXT computers for instance. However with the rapid speed
increases in computers today, combined with operating
systems that don't 'block' when waiting for data, channel controllers have
become correspondingly less effective and are not commonly found on small
machines.
Channel
controllers are making a comeback in the form of bus
mastering peripheral devices, such as PCI direct memory access (DMA) devices. The
rationale for these devices is the same as for the original channel
controllers, namely off-loading transfer, interrupts, and context
switching from the main CPU.
Description
The reference
implementation of channel I/O is that of the IBM System/360 family of
mainframes and its successors, but similar implementations have been adopted by
other mainframe vendors, such as Control
Data, Bull
(General Electric/Honeywell)
and Unisys.
Computer
systems that use channel I/O have special hardware components that handle all
input/output operations in their entirety independently of the systems' CPU(s).
The CPU of a system that uses channel I/O typically has only one machine instruction in its repertoire for input
and output; this instruction is used to pass input/output commands to the
specialized I/O hardware in the form of channel
programs. I/O thereafter proceeds without intervention from the CPU until
an event requiring notification of the operating system occurs, at which point
the I/O hardware signals an interrupt to the CPU.
A channel is an
independent hardware component that coordinates all I/O to a set of controllers
or devices. It is not merely a medium of communication, despite the name; it is
a programmable device that handles all details of I/O after being given
a list of I/O operations to carry out (the channel program).
Each channel
may support one or more controllers and/or devices. Channel programs contain
lists of commands to the channel itself and to various controllers and devices
to which it is connected. Once the operating system has prepared a complete
list of I/O commands, it executes a single I/O machine instruction to initiate
the channel program; the channel thereafter assumes control of the I/O
operations until they are completed.
It is possible
to develop very complex channel programs, initiating many different I/O
operations on many different I/O devices simultaneously. This flexibility frees
the CPU from the overhead of starting, monitoring, and managing individual I/O
operations. The specialized channel hardware, in turn, is dedicated to I/O and
can carry it out more efficiently than the CPU (and entirely in parallel with
the CPU). Channel I/O is not unlike the Direct Memory Access (DMA) of microcomputers,
only more complex and advanced. Most mainframe operating systems do not fully
exploit all the features of channel I/O.
On large
mainframe computer systems, CPUs are only one of several powerful hardware
components that work in parallel. Special input/output controllers (the exact
names of which vary from one manufacturer to another) handle I/O exclusively,
and these in turn are connected to hardware channels that also are dedicated to
input and output. There may be several CPUs and several I/O processors. The
overall architecture optimizes input/output performance without degrading pure
CPU performance. Since most real-world applications of mainframe systems are
heavily I/O-intensive business applications, this architecture helps provide
the very high levels of throughput that distinguish mainframes from other types of
computer.
In IBM ESA/390
terminology, a channel is a parallel data connection inside the tree-like or
hierarchically organized I/O subsystem. In System/390 I/O cages, channels
either directly connect to devices which are installed inside the cage
(communication adapter such as ESCON, FICON, Open Systems Adapter) or they run outside of
the cage, below the raised floor as cables of the thickness of a thumb and
directly connect to channel interfaces on bigger devices like tape subsystems, direct access storage devices (DASDs),
terminal concentrators and other ESA/390 systems.
Channel Program
A channel
program is a sequence of I/O instructions executed by the input/output
channel processor in the IBM System/360 and subsequent architectures. The
channel program consists of one or more channel command words. The operating
system signals the I/O channel processor to begin executing the channel program
with a SSCH (start sub-channel) instruction. The processor is then free to
proceed with non-I/O instructions until interrupted. When the operations are
complete, the channel posts an interrupt. In earlier models of the IBM
mainframe line, the channel processor was an identifiable component, but in
modern mainframes, the channels are implemented in microcode running on a
multi-core processor-on-a-chip called the System Assistance Processor (SAP).
Hence the earlier SIO (start I/O) and SIOF (start I/O fast release)
assembler instructions are replaced by the SSCH (start
sub-channel) instruction.
Channel I/O
provides considerable economies in input/output. For example, on IBM's Linux/390,
the formatting of an entire track of a DASD requires only one channel program
(and thus only one I/O instruction). The program is executed by the dedicated
I/O processor, while the application processor (the CPU) is free for
other work.
Channel command words
A channel
command word (CCW) is an instruction for a specialized I/O
channel processor. It is used to initiate an I/O operation on a
channel-attached device, such as “read” or “seek”. On system architectures that
implement channel I/O, typically all devices are connected by channels, and so all
I/O requires the use of CCWs.
CCWs are
organized into channel programs by the operating system, an I/O
subroutine, a utility program, or by standalone software (such as test and
diagnostic programs).
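As a rough illustration, the sketch below lays out the classic System/360 "format 0" CCW as a C structure and builds a two-command channel program. The field widths follow the commonly documented format (an 8-bit command code, a 24-bit data address, flag bits, and a 16-bit count), but the specific command codes and flag values used here are illustrative placeholders rather than definitive IBM definitions.

#include <stdint.h>

/* Illustrative layout of a System/360 "format 0" channel command word.
 * The real CCW is a 64-bit doubleword; this struct mirrors its fields. */
typedef struct {
    uint8_t  command;     /* operation, e.g. read, write, seek, control        */
    uint32_t data_addr;   /* 24-bit main-storage address of the data area      */
    uint8_t  flags;       /* chaining and control flags (CD, CC, SLI, ...)     */
    uint16_t count;       /* number of bytes to transfer                       */
} ccw_t;

/* Flag bits (values illustrative). */
#define CCW_FLAG_CD  0x80  /* chain data: next CCW continues this operation    */
#define CCW_FLAG_CC  0x40  /* chain command: next CCW starts a new operation   */
#define CCW_FLAG_SLI 0x20  /* suppress incorrect-length indication             */

/* A two-CCW channel program: position the device, then read one record.
 * The command code values 0x07 and 0x02 are placeholders for a seek and
 * a read; real command codes are device-dependent. */
static ccw_t example_program[2] = {
    { /*command=*/0x07, /*data_addr=*/0x001000, CCW_FLAG_CC, /*count=*/6   },
    { /*command=*/0x02, /*data_addr=*/0x002000, 0,           /*count=*/512 },
};

The operating system chains CCWs like these into a list and starts the whole list with a single I/O instruction (SSCH on current machines), as described in the Channel Program section above.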
Booting with channel I/O
Even bootstrapping
of the system, or Initial program load (IPL) in IBM
nomenclature, is carried out by channels: to load the system, a very small,
simple channel program is loaded into memory and initiated, and this program
causes the first portion of the system loading software to be loaded. The
software is then executed once the I/O is completed, and an interrupt is
signaled to the CPU.
Conventional PCI
Conventional PCI (PCI Local Bus)
Year created | July 1993
Superseded by | PCI Express (2004)
Width in bits | 32 or 64
Capacity | 133 MB/s (32-bit at 33 MHz)
Hotplugging interface | Optional
A typical
32-bit, 5 V-only PCI card, in this case a SCSI adapter from Adaptec
Conventional
PCI (PCI is
an initialism formed from Peripheral Component Interconnect,[1] part of
the PCI Local Bus standard and often shortened to PCI) is a computer
bus for attaching hardware devices in a computer. These
devices can take either the form of an integrated circuit fitted onto the motherboard
itself, called a planar device in the PCI specification, or an expansion
card that fits into a slot. The PCI Local Bus is common in modern PCs,
where it has displaced ISA and VESA
Local Bus as the standard expansion bus, and it also appears in many other
computer types. Despite the availability of faster interfaces such as PCI-X and PCI Express,
conventional PCI remains a very common interface.
The PCI
specification covers the physical size of the bus (including the size and
spacing of the circuit board edge electrical contacts), electrical
characteristics, bus timing, and protocols. The specification can be purchased
from the PCI
Special Interest Group (PCI-SIG).
Typical PCI
cards used in PCs include: network cards, sound cards,
modems, extra
ports such as USB or serial,
TV
tuner cards and disk controllers. Historically video cards
were typically PCI devices, but growing bandwidth requirements soon outgrew the
capabilities of PCI. PCI video cards remain available for supporting extra
monitors and upgrading PCs that do not have any AGP or PCI Express slots.[2]
Many devices
traditionally provided on expansion cards are now commonly integrated onto the
motherboard itself, meaning that modern PCs often have no cards fitted.
However, PCI is still used for certain specialized cards, although many tasks
traditionally performed by expansion cards may now be performed equally well by
USB devices.
History
Work on PCI
began at Intel's Architecture Development Lab circa 1990.
A team of Intel
engineers (composed primarily of ADL engineers) defined the architecture and
developed a proof of concept chipset and platform (Saturn) partnering with
teams in the company's desktop PC systems and core logic product organizations.
The original PCI architecture team included, among others, Dave Carson, Norm
Rasmussen, Brad Hosler, Ed Solari, Bruce Young, Gary Solomon, Ali Oztaskin, Tom
Sakoda, Rich Haslam, Jeff Rabe, and Steve Fischer.
PCI (Peripheral
Component Interconnect) was immediately put to use in servers, replacing MCA and EISA as the server
expansion bus of choice. In mainstream PCs, PCI was slower to replace VESA
Local Bus (VLB), and did not gain significant market penetration until late
1994 in second-generation Pentium PCs. By 1996 VLB was all but
extinct, and manufacturers had adopted PCI even for 486
computers.[3]
EISA continued to be used alongside PCI through 2000. Apple
Computer adopted PCI for professional Power
Macintosh computers (replacing NuBus) in mid-1995, and the consumer Performa product line (replacing LC PDS) in mid-1996.
Later revisions
of PCI added new features and performance improvements, including a 66 MHz 3.3 V standard and
133 MHz PCI-X,
and the adaptation of PCI signaling to other form factors. Both PCI-X 1.0b
and PCI-X 2.0 are backward compatible with some PCI standards.
The PCI-SIG
introduced the serial PCI Express in 2004. At the same time they renamed PCI
as Conventional PCI. Since then, motherboard manufacturers have included
progressively fewer Conventional PCI slots in favor of the new standard.
PCI History[4]
Spec | Year | Change Summary[5]
PCI 1.0 | 1992 | Original issue
PCI 2.0 | 1993 | Incorporated connector and add-in card specification
PCI 2.1 | 1995 | Incorporated clarifications and added 66 MHz chapter
PCI 2.2 | 1998 | Incorporated ECNs, and improved readability
PCI 2.3 | 2002 | Incorporated ECNs, errata, and deleted 5 volt only keyed add-in cards
PCI 3.0 | 2002 | Removed support for the 5.0 volt keyed system board connector
Auto Configuration
PCI provides
separate memory and I/O port address
spaces for the x86
processor family, 64
and 32 bits,
respectively. Addresses in these address spaces are assigned by software. A
third address space, called the PCI Configuration Space, which uses a fixed
addressing scheme, allows software to determine the amount of memory and I/O
address space needed by each device. Each device can request up to six areas of
memory space or I/O port space via its configuration space registers.
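The way software determines how much space each of those areas needs is by probing the corresponding base address register: write all ones, read back what the device actually latched, and the lowest writable bit reveals the size. The following C sketch shows that probe; cfg_read32 and cfg_write32 are assumed helper functions for configuration space access (not part of any standard API), and error handling is omitted.

#include <stdint.h>

/* Assumed helpers: read/write a 32-bit register in one device's PCI
 * configuration space (bus/device/function selection omitted for brevity). */
uint32_t cfg_read32(unsigned offset);
void     cfg_write32(unsigned offset, uint32_t value);

/* Probe one of the (up to six) base address registers at the given
 * configuration space offset and return the size of the region the
 * device is requesting, in bytes.  Returns 0 if the BAR is unimplemented. */
static uint32_t bar_size(unsigned bar_offset)
{
    uint32_t original = cfg_read32(bar_offset);

    cfg_write32(bar_offset, 0xFFFFFFFF);      /* ask the device which bits it decodes */
    uint32_t probe = cfg_read32(bar_offset);
    cfg_write32(bar_offset, original);        /* restore the previous value           */

    if (probe == 0)
        return 0;                             /* BAR not implemented */

    /* Bit 0 distinguishes I/O space (1) from memory space (0);
     * the remaining low bits carry type/prefetch information. */
    uint32_t mask = (probe & 1) ? (probe & ~0x3u) : (probe & ~0xFu);

    return ~mask + 1;                         /* lowest writable bit gives the size */
}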
In a typical
system, the firmware
(or operating system) queries all PCI buses at startup
time (via PCI Configuration Space) to find out what
devices are present and what system resources (memory space, I/O space,
interrupt lines, etc.) each needs. It then allocates the resources and tells
each device what its allocation is.
The PCI
configuration space also contains a small amount of device type information,
which helps an operating system choose device drivers for it, or at least to
have a dialogue with a user about the system configuration.
Devices may
have an on-board ROM containing executable code for x86 or PA-RISC
processors, an Open Firmware driver, or an EFI driver. These are typically
necessary for devices used during system startup, before device drivers are
loaded by the operating system.
In addition
there are PCI Latency Timers that are a mechanism for PCI
Bus-Mastering devices to share the PCI bus fairly. "Fair" in this
case means that devices won't use such a large portion of the available PCI bus
bandwidth that other devices aren't able to get needed work done. Note, this
does not apply to PCI Express.
Each PCI device
that can operate in bus-master mode is required to implement a timer, called
the Latency Timer, that limits the time that device can hold the PCI bus. The
timer starts when the device gains bus ownership, and counts down at the rate
of the PCI clock. When the counter reaches zero, the device is required to
release the bus. If no other devices are waiting for bus ownership, it may
simply grab the bus again and transfer more data.[6]
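A simplified model of that countdown is sketched below in C. It follows the description above; in particular, the real rule also takes the arbiter's GNT# signal into account, which is ignored here.

#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the Latency Timer behaviour described above: the
 * counter is loaded when the device wins the bus and decrements once per
 * PCI clock; when it reaches zero the device must let go of the bus. */
typedef struct {
    uint8_t latency_timer;   /* value programmed by configuration software */
    uint8_t counter;         /* remaining clocks of bus ownership          */
    bool    owns_bus;
} bus_master_t;

static void on_bus_granted(bus_master_t *m)
{
    m->owns_bus = true;
    m->counter  = m->latency_timer;   /* timer starts at bus acquisition */
}

static void on_pci_clock(bus_master_t *m)
{
    if (!m->owns_bus)
        return;
    if (m->counter > 0)
        m->counter--;
    if (m->counter == 0)
        m->owns_bus = false;          /* must release; it may re-arbitrate
                                         if no other device is waiting */
}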
Interrupts
Devices are
required to follow a protocol so that the interrupt
lines can be shared. The PCI bus includes four interrupt lines, all of which
are available to each device. However, they are not wired in parallel as are
the other PCI bus lines. The positions of the interrupt lines rotate between
slots, so what appears to one device as the INTA# line is INTB# to the next and
INTC# to the one after that. Single-function devices use their INTA# for
interrupt signaling, so the device load is spread fairly evenly across the four
available interrupt lines. This alleviates a common problem with sharing
interrupts.
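Expressed as arithmetic, the rotation described above can be sketched as follows. This is illustrative only: the direction of rotation follows the sentence above, some boards rotate the other way, and, as the next paragraph explains, the routing across bridges is implementation-dependent in any case.

/* Illustrative only.  Number the physical interrupt traces by what slot 0
 * calls them: line 0 is the trace slot 0 sees as INTA#, and so on.  Because
 * the trace one slot sees as INTA# appears as INTB# at the next slot, the
 * line a given slot's pin lands on is: */
static int bus_interrupt_line(int slot, int int_pin)   /* int_pin: 0=INTA# ... 3=INTD# */
{
    return ((int_pin - slot) % 4 + 4) % 4;
}

/* Single-function devices all use INTA# (int_pin 0), so four adjacent slots
 * land on lines 0, 3, 2 and 1 respectively -- four different lines, which is
 * the load spreading described above. */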
PCI bridges
(between two PCI buses) map the four interrupt traces on each of their sides in
varying ways. Some bridges use a fixed mapping, and in others it is
configurable. In the general case, software cannot determine which interrupt
line a device's INTA# pin is connected to across a bridge. The mapping of PCI
interrupt lines onto system interrupt lines, through the PCI host bridge, is
similarly implementation-dependent. The result is that it can be impossible to
determine how a PCI device's interrupts will appear to software.
Platform-specific BIOS code is meant to know this, and set a field in each
device's configuration space indicating which IRQ it is connected to, but this
process is not reliable.
PCI interrupt
lines are level-triggered. This was chosen over edge-triggering
in order to gain an advantage when servicing a shared interrupt line, and for
robustness: edge triggered interrupts are easy to miss.
Later revisions
of the PCI specification add support for message-signaled interrupts. In this
system a device signals its need for service by performing a memory write,
rather than by asserting a dedicated line. This alleviates the problem of
scarcity of interrupt lines. Even if interrupt vectors are still shared, it
does not suffer the sharing problems of level-triggered interrupts. It also
resolves the routing problem, because the memory write is not unpredictably
modified between device and host. Finally, because the message signaling is in-band, it resolves some synchronization
problems that can occur with posted writes and out-of-band
interrupt lines.
PCI Express
does not have physical interrupt lines at all. It uses message-signaled
interrupts exclusively.
Conventional hardware specifications
Diagram showing
the different key positions for 32-bit and 64-bit PCI cards
These
specifications represent the most common version of PCI used in normal PCs.
- 33.33 MHz
clock
with synchronous transfers
- peak
transfer rate of 133 MB/s (133 megabytes
per second) for 32-bit bus width (33.33 MHz × 32 bits ÷
8 bits/byte = 133 MB/s)
- 32-bit bus
width
- 32- or
64-bit memory address space (4 gigabytes
or 16 exabytes)
- 32-bit I/O
port space
- 256-byte (per device)
configuration space
- 5-volt
signaling
- reflected-wave switching
The PCI
specification also provides options for 3.3 V signaling, 64-bit bus width,
and 66 MHz clocking, but these are not commonly encountered outside of
PCI-X support on server motherboards.
The PCI bus
arbiter performs bus arbitration among multiple masters on the PCI bus. Any
number of bus masters can reside on the PCI bus and request its use. One pair
of request and grant signals is dedicated to each bus master.
Card keying
A PCI-X Gigabit
Ethernet expansion card. Note both 5 V and 3.3 V support notches
are present.
Typical PCI
cards present either one or two key notches, depending on their signaling
voltage. Cards requiring 3.3 volts have a notch 56.21 mm from the
front of the card (where the external connectors are) while those requiring
5 volts have a notch 104.47 mm from the front of the card. So-called
"universal cards" have both key notches and can accept both signaling
voltages.
Connector pinout
The PCI
connector is defined as having 62 contacts on each side of the edge
connector, but two or four of them are replaced by key notches, so a card
has 60 or 58 contacts on each side. Pin 1 is closest to the backplate. B and A
sides are as follows, looking down into the motherboard connector.
32-bit PCI connector pinout
Pin | Side B | Side A | Comments
1 | −12V | TRST# | JTAG port pins (optional)
2 | TCK | +12V |
3 | Ground | TMS |
4 | TDO | TDI |
5 | +5V | +5V |
6 | +5V | INTA# | Interrupt lines (open-drain)
7 | INTB# | INTC# |
8 | INTD# | +5V |
9 | PRSNT1# | Reserved | Pulled low to indicate 7.5 or 25 W power required
10 | Reserved | IOPWR | +5V or +3.3V
11 | PRSNT2# | Reserved | Pulled low to indicate 7.5 or 15 W power required
12 | Ground | Ground | Key notch for 3.3V-capable cards
13 | Ground | Ground |
14 | Reserved | 3.3Vaux | Standby power (optional)
15 | Ground | RST# | Bus reset
16 | CLK | IOPWR | 33/66 MHz clock
17 | Ground | GNT# | Bus grant from motherboard to card
18 | REQ# | Ground | Bus request from card to motherboard
19 | IOPWR | PME# | Power management event (optional)
20 | AD[31] | AD[30] | Address/data bus (upper half)
21 | AD[29] | +3.3V |
22 | Ground | AD[28] |
23 | AD[27] | AD[26] |
24 | AD[25] | Ground |
25 | +3.3V | AD[24] |
26 | C/BE[3]# | IDSEL |
27 | AD[23] | +3.3V |
28 | Ground | AD[22] |
29 | AD[21] | AD[20] |
30 | AD[19] | Ground |
31 | +3.3V | AD[18] |
32 | AD[17] | AD[16] |
33 | C/BE[2]# | +3.3V |
34 | Ground | FRAME# | Bus transfer in progress
35 | IRDY# | Ground | Initiator ready
36 | +3.3V | TRDY# | Target ready
37 | DEVSEL# | Ground | Target selected
38 | Ground | STOP# | Target requests halt
39 | LOCK# | +3.3V | Locked transaction
40 | PERR# | SMBCLK or SDONE | Parity error; SMBus clock or Snoop done (obsolete)
41 | +3.3V | SMBDAT or SBO# | SMBus data or Snoop backoff (obsolete)
42 | SERR# | Ground | System error
43 | +3.3V | PAR | Even parity over AD[31:00] and C/BE[3:0]#
44 | C/BE[1]# | AD[15] | Address/data bus (lower half)
45 | AD[14] | +3.3V |
46 | Ground | AD[13] |
47 | AD[12] | AD[11] |
48 | AD[10] | Ground |
49 | M66EN or Ground | AD[09] |
50 | Ground | Ground | Key notch for 5V-capable cards
51 | Ground | Ground |
52 | AD[08] | C/BE[0]# | Address/data bus (lower half)
53 | AD[07] | +3.3V |
54 | +3.3V | AD[06] |
55 | AD[05] | AD[04] |
56 | AD[03] | Ground |
57 | Ground | AD[02] |
58 | AD[01] | AD[00] |
59 | IOPWR | IOPWR |
60 | ACK64# | REQ64# | For 64-bit extension; no connect for 32-bit devices.
61 | +5V | +5V |
62 | +5V | +5V |
64-bit PCI
extends this by an additional 32 contacts on each side which provide AD[63:32],
C/BE[7:4]#, the PAR64 parity signal, and a number of power and ground pins.
Legend
Ground pin | Zero volt reference
Power pin | Supplies power to the PCI card
Output pin | Driven by the PCI card, received by the motherboard
Initiator output | Driven by the master/initiator, received by the target
I/O signal | May be driven by initiator or target, depending on operation
Target output | Driven by the target, received by the initiator/master
Input | Driven by the motherboard, received by the PCI card
Open drain | May be pulled low and/or sensed by multiple cards
Reserved | Not presently used, do not connect
Most lines are
connected to each slot in parallel. The exceptions are:
- Each slot
has its own REQ# output to, and GNT# input from the motherboard arbiter.
- Each slot
has its own IDSEL line, usually connected to a specific AD line.
- TDO is
daisy-chained to the following slot's TDI. Cards without JTAG support must connect TDI to TDO
so as not to break the chain.
- PRSNT1#
and PRSNT2# for each slot have their own pull-up resistors on the
motherboard. The motherboard may (but does not have to) sense these pins
to determine the presence of PCI cards and their power requirements.
- REQ64# and
ACK64# are individually pulled up on 32-bit only slots.
- The
interrupt lines INTA# through INTD# are connected to all slots in
different orders. (INTA# on one slot is INTB# on the next and INTC# on the
one after that.)
Notes:
- IOPWR is
+3.3V or +5V, depending on the backplane. The slots also have a ridge in
one of two places which prevents insertion of cards that do not have the
corresponding key notch, indicating support for that voltage standard.
Universal cards have both key notches and use IOPWR to determine their I/O
signal levels.
- The PCI
SIG strongly encourages 3.3 V PCI signaling, requiring support for it
since standard revision 2.3, but most PC motherboards use the 5 V
variant. Thus, while many currently available PCI cards support both, and
have two key notches to indicate that, there are still a large number of
5 V-only cards on the market.
- The M66EN
pin is an additional ground on 5V PCI buses found in most PC motherboards.
Cards and motherboards that do not support 66 MHz operation also
ground this pin. If all participants support 66 MHz operation, a
pull-up resistor on the motherboard raises this signal high and 66 MHz
operation is enabled.
- At least
one of PRSNT1# and PRSNT2# must be grounded by the card. The combination
chosen indicates the total power requirements of the card (25 W,
15 W, or 7.5 W); a decoding sketch follows these notes.
- SBO# and
SDONE are signals from a cache controller to the current target. They are
not initiator outputs, but are colored that way because they are target
inputs.
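Combining the PRSNT1#/PRSNT2# note above with the comments in the pinout table gives a simple decoding, sketched here in C. The mapping (only PRSNT1# grounded indicates 25 W, only PRSNT2# indicates 15 W, both indicate 7.5 W, neither indicates an empty slot) is inferred from those comments; it is an illustration, not a quotation of the specification.

#include <stdbool.h>

/* Decode the PRSNT1#/PRSNT2# combination of a slot.  The pins are
 * active-low: "grounded" means the card is pulling the pin low.
 * Returns the card's maximum power draw in tenths of a watt,
 * or 0 if no card is present. */
static int slot_power_tenths(bool prsnt1_grounded, bool prsnt2_grounded)
{
    if (prsnt1_grounded && prsnt2_grounded) return 75;   /* 7.5 W      */
    if (prsnt1_grounded)                    return 250;  /* 25 W       */
    if (prsnt2_grounded)                    return 150;  /* 15 W       */
    return 0;                                            /* empty slot */
}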
Physical card dimensions
Full-size card
The original
"full-size" PCI card is specified as a height of 107 mm (4.2 inches) and a depth of
312 mm (12.283 inches). The height includes the edge card connector.
However, most modern PCI cards are half-length or smaller (see below) and many
modern PCs cannot fit a full-size card.
Card backplate
In addition to
these dimensions the physical size and location of a card's backplate are also
standardized. The backplate is the part that fastens to the card cage to
stabilize the card and also contains external connectors, so it usually
attaches in a window so it is accessible from outside the computer case. The
backplate is fixed to the cage by a 6-32 screw.
The card itself
can be a smaller size, but the backplate must still be full-size and properly
located so that the card fits in any standard PCI slot.
Half-length extension card (de-facto standard)
This is in fact
the practical standard now – the majority of modern PCI cards fit inside
this length.
- Width:
0.6 inches (15.24 mm)
- Depth: 6.9 inches
(175.26 mm)
- Height:
4.2 inches (106.68 mm)
Low-profile (half-height) card
The PCI
organization has defined a standard for "low-profile" cards, which
basically fit in the following ranges:
- Height:
1.42 inches (36.07 mm) to 2.536 inches (64.41 mm)
- Depth:
4.721 inches (119.91 mm) to 6.6 inches (167.64 mm)
The bracket is
also reduced in height, to a standard 3.118 inches (79.2 mm). The
smaller bracket will not fit a standard PC case, but will fit in a 2U rack-mount
case. Many manufacturers supply both types of bracket (brackets are typically
screwed to the card so changing them is not difficult).
These cards may
be known by other names such as "slim".
Mini PCI
Mini PCI Wi-Fi card Type IIIB
Mini PCI was added to PCI version 2.2 for use
in laptops; it
uses a 32-bit, 33 MHz bus with powered connections (3.3 V only;
5 V is limited to 100 mA) and support for bus
mastering and DMA. The standard size for Mini PCI cards is
approximately 1/4 of their full-sized counterparts. As there is limited
external access to the card compared to desktop PCI cards, there are
limitations on the functions they may perform.
PCI-to-MiniPCI
converter Type III
MiniPCI and
MiniPCI Express cards in comparison
Many Mini PCI
devices were developed such as Wi-Fi, Fast Ethernet, Bluetooth, modems (often Winmodems), sound cards,
cryptographic accelerators, SCSI, IDE–ATA,
SATA controllers and
combination cards. Mini PCI cards can be used with regular PCI-equipped
hardware, using Mini PCI-to-PCI converters. Mini PCI has been superseded
by PCI Express Mini Card.
Technical details of Mini PCI
Mini PCI cards
have a 2 W maximum power consumption, which also limits the functionality
that can be implemented in this form factor. They also are required to support
the CLKRUN# PCI signal used to start and stop the PCI clock for power
management purposes.
There are three
card form factors: Type I, Type II, and Type III. Types I and II use a
100-pin stacking connector, while Type III uses a 124-pin edge connector;
that is, the Type III connector is on the edge of the card, like that of a
SO-DIMM. The
additional 24 pins provide the extra signals required to route I/O
back through the system connector (audio, AC-Link, LAN, phone-line
interface). Type II cards have RJ11 and RJ45 mounted connectors. These cards
must be located at the edge of the computer or docking station so that the RJ11
and RJ45 ports can be mounted for external access.
Type | Card on outer edge of host system | Connector | Size | Comments
IA | No | 100-Pin Stacking | 7.5 × 70 × 45 mm | Large Z dimension (7.5 mm)
IB | No | 100-Pin Stacking | 5.5 × 70 × 45 mm | Smaller Z dimension (5.5 mm)
IIA | Yes | 100-Pin Stacking | 17.44 × 70 × 45 mm | Large Z dimension (17.44 mm)
IIB | Yes | 100-Pin Stacking | 5.5 × 78 × 45 mm | Smaller Z dimension (5.5 mm)
IIIA | No | 124-Pin Card Edge | 2.4 × 59.6 × 50.95 mm | Larger Y dimension (50.95 mm)
IIIB | No | 124-Pin Card Edge | 2.4 × 59.6 × 44.6 mm | Smaller Y dimension (44.6 mm)
Mini PCI should
not be confused with 144-pin Micro PCI.[7]
Other physical variations
Typically
consumer systems specify "N × PCI slots" without specifying actual
dimensions of the space available. In some small-form-factor systems, this may
not be sufficient to allow even "half-length" PCI cards to fit.
Despite this limitation, these systems are still useful because many modern PCI
cards are considerably smaller than half-length.
PCI bus transactions
PCI bus traffic
is made of a series of PCI bus transactions. Each transaction is made up of an address
phase followed by one or more data phases. The direction of the data
phases may be from initiator to target (write transaction) or vice-versa (read
transaction), but all of the data phases must be in the same direction. Either
party may pause or halt the data phases at any point. (One common example is a
low-performance PCI device that does not support burst transactions, and always
halts a transaction after the first data phase.)
Any PCI device
may initiate a transaction. First, it must request permission from a PCI bus
arbiter on the motherboard. The arbiter grants permission to one of the
requesting devices. The initiator begins the address phase by broadcasting a
32-bit address plus a 4-bit command code, then waits for a target to respond.
All other devices examine this address and one of them responds a few cycles
later.
64-bit
addressing is done using a two-stage address phase. The initiator broadcasts
the low 32 address bits, accompanied by a special "dual address
cycle" command code. Devices which do not support 64-bit addressing can
simply not respond to that command code. The next cycle, the initiator
transmits the high 32 address bits, plus the real command code. The transaction
operates identically from that point on. To ensure compatibility with 32-bit
PCI devices, it is forbidden to use a dual address cycle if not necessary, i.e.
if the high-order address bits are all zero.
While the PCI
bus transfers 32 bits per data phase, the initiator transmits a 4-bit byte mask
indicating which 8-bit bytes are to be considered significant. In particular, a
masked write must affect only the desired bytes in the target PCI device.
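One way to picture the byte mask is as a per-byte merge into the target word. The sketch below assumes the mask has already been decoded into active-high form, with bit n covering byte n of a little-endian 32-bit word; it illustrates the rule, and is not bus-interface code.

#include <stdint.h>

/* Merge a 32-bit write into an existing word, honouring a 4-bit byte mask
 * in which bit n set means "byte n is significant".  (On the wire the
 * C/BE[3:0]# lines are active-low; the mask here is shown in its decoded,
 * active-high form for clarity.) */
static uint32_t masked_write(uint32_t old_value, uint32_t new_value, unsigned byte_mask)
{
    uint32_t result = old_value;

    for (int byte = 0; byte < 4; byte++) {
        if (byte_mask & (1u << byte)) {
            uint32_t lane = 0xFFu << (8 * byte);
            result = (result & ~lane) | (new_value & lane);
        }
    }
    return result;
}

/* Example: writing 0xAABBCCDD with mask 0b0011 changes only the two
 * low-order bytes, so 0x11223344 becomes 0x1122CCDD. */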
PCI address spaces
PCI has three
address spaces: memory, I/O address, and configuration.
Memory
addresses are 32 bits (optionally 64 bits) in size, support caching and can be
burst transactions.
I/O addresses
are for compatibility with the Intel x86
architecture's I/O port address space. Although the PCI bus specification
allows burst transactions in any address space, most devices only support it
for memory addresses and not I/O.
Finally, PCI configuration space provides access to
256 bytes of special configuration registers per PCI device. Each PCI slot gets
its own configuration space address range. The registers are used to configure
each device's memory and I/O address ranges, that is, the ranges it should
respond to from transaction initiators. When a computer is first turned on,
all PCI devices respond only to their configuration space accesses. The
computer's BIOS scans for devices and assigns memory and I/O address ranges to
them.
If an address
is not claimed by any device, the transaction initiator's address phase will
time out causing the initiator to abort the operation. In case of reads, it is
customary to supply all-ones for the read data value (0xFFFFFFFF) in this case.
PCI devices therefore generally attempt to avoid using the all-ones value in
important status registers, so that such an error can be easily detected by
software.
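Software commonly relies on exactly that convention: after reading an important status register, an all-ones value is taken to mean that nothing claimed the access. A minimal sketch, using a hypothetical read helper:

#include <stdint.h>

/* Hypothetical helper: performs a 32-bit read of a device status register;
 * a master abort (nothing claimed the address) yields 0xFFFFFFFF, as
 * described above. */
uint32_t read_status_register(void);

/* Treat an all-ones read as "the access was not claimed", relying on
 * devices avoiding that value in important status registers. */
static int device_answered(void)
{
    return read_status_register() != 0xFFFFFFFFu;
}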
PCI command codes
There are 16
possible 4-bit command codes, and 12 of them are assigned. With the exception
of the unique dual address cycle, the least significant bit of the command code
indicates whether the following data phases are a read (data sent from target
to initiator) or a write (data sent from an initiator to target). PCI targets
must examine the command code as well as the address and not respond to address
phases which specify an unsupported command code.
The commands
that refer to cache lines depend on the PCI configuration space cache line size
register being set up properly; they may not be used until that has been done.
0000: Interrupt
Acknowledge
This is a special form of read cycle
implicitly addressed to the interrupt controller, which returns an interrupt
vector. The 32-bit address field is ignored. One possible implementation is to
generate an interrupt acknowledge cycle on an ISA bus using a PCI/ISA bus
bridge. This command is for IBM
PC compatibility; if there is no Intel 8259
style interrupt controller on the PCI bus, this cycle need never be used.
0001: Special
Cycle
This cycle is a special broadcast write
of system events that PCI cards may be interested in. The address field of a
special cycle is ignored, but it is followed by a data phase containing a
payload message. The currently defined messages announce that the processor is
stopping for some reason (e.g. to save power). No device ever responds to this
cycle; it is always terminated with a master abort after leaving the data on
the bus for at least 4 cycles.
0010: I/O Read
This performs a read from I/O space.
All 32 bits of the read address are provided, so that a device can (for
compatibility reasons) implement less than 4 bytes worth of I/O registers. If
the byte enables request data not within the address range supported by the PCI
device (e.g. a 4-byte read from a device which only supports 2 bytes of I/O
address space), it must be terminated with a target abort. Multiple data cycles
are permitted, using linear (simple incrementing) burst ordering.
The PCI standard discourages the
use of I/O space in new devices, preferring that as much as possible be done
through main memory mapping.
0011: I/O Write
This performs a write to I/O space.
010x:
Reserved
A PCI device must not respond to an
address cycle with these command codes.
0110: Memory
Read
This performs a read cycle from memory
space. Because the smallest memory space a PCI device is permitted to implement
is 16 bytes, the two least significant bits of the address are not needed;
equivalent information will arrive in the form of byte select signals. They
instead specify the order in which burst data must be returned. If a device
does not support the requested order, it must provide the first word and then
disconnect.
If a memory space is marked as
"prefetchable", then the target device must ignore the byte select
signals on a memory read and always return 32 valid bits.
0111: Memory
Write
This operates similarly to a memory
read. The byte select signals are more important in a write, as unselected
bytes must not be written to memory.
Generally, PCI writes are faster than
PCI reads, because a device can buffer the incoming write data and release the
bus faster. For a read, it must delay the data phase until the data has been
fetched.
100x:
Reserved
A PCI device must not respond to an
address cycle with these command codes.
1010:
Configuration Read
This is similar to an I/O read, but
reads from PCI configuration space. A device must respond only if the low 11
bits of the address specify a function and register that it implements, and if
the special IDSEL signal is asserted. It must ignore the high 21 bits. Burst
reads (using linear incrementing) are permitted in PCI configuration space.
Unlike I/O space, standard PCI configuration
registers are defined so that reads never disturb the state of the device. It
is possible for a device to have configuration space registers beyond the
standard 64 bytes which have read side effects, but this is rare.[8]
Configuration space accesses often have
a few cycles of delay in order to allow the IDSEL lines to stabilize, which
makes them slower than other forms of access. Also, a configuration space
access requires a multi-step operation rather than a single machine
instruction. Thus, it is best to avoid them during routine operation of a PCI
device.
1011:
Configuration Write
This operates analogously to a
configuration read.
1100: Memory
Read Multiple
This command is identical to a generic
memory read, but includes the hint that a long read burst will continue beyond
the end of the current cache line, and the target should internally prefetch a large
amount of data. A target is always permitted to consider this a synonym for a
generic memory read.
1101: Dual
Address Cycle
When accessing a memory address that
requires more than 32 bits to represent, the address phase begins with this
command and the low 32 bits of the address, followed by a second cycle with the
actual command and the high 32 bits of the address. PCI targets that do not
support 64-bit addressing can simply treat this as another reserved command
code and not respond to it. This command code can only be used with a non-zero
high-order address word; it is forbidden to use this cycle if not necessary.
1110: Memory
Read Line
This command is identical to a generic
memory read, but includes the hint that the read will continue to the end of
the cache line. A target is always permitted to consider this a synonym for a
generic memory read.
1111: Memory
Write and Invalidate
This command is identical to a generic
memory write, but comes with the guarantee that one or more whole cache lines
will be written, with all byte selects enabled. This is an optimization for
write-back caches snooping the bus. Normally, a write-back cache holding dirty
data must interrupt the write operation long enough to write its own dirty data
first. If the write is performed using this command, the data to be written
back is guaranteed to be irrelevant, and can simply be invalidated in the
write-back cache.
This optimization only affects the
snooping cache, and makes no difference to the target, which may treat this as
a synonym for the memory write command.
PCI bus signals
PCI bus
transactions are controlled by five main control signals, two driven by the
initiator of a transaction (FRAME# and IRDY#), and three driven by the target
(DEVSEL#, TRDY#, and STOP#). There are two additional arbitration signals (REQ#
and GNT#) which are used to obtain permission to initiate a transaction. All
are active-low,
meaning that the active or asserted state is a low voltage. Pull-up
resistors on the motherboard ensure they will remain high (inactive or deasserted)
if not driven by any device, but the PCI bus does not depend on the resistors
to change the signal level; all devices drive the signals high for one
cycle before ceasing to drive the signals.
Signal timing
All PCI bus
signals are sampled on the rising edge of the clock. Signals nominally change
on the falling edge of the clock, giving each PCI device approximately one half
a clock cycle to decide how to respond to the signals it observed on the rising
edge, and one half a clock cycle to transmit its response to the other device.
The PCI bus
requires that every time the device driving a PCI bus signal changes, one turnaround
cycle must elapse between the time the one device stops driving the signal
and the other device starts. Without this, there might be a period when both
devices were driving the signal, which would interfere with bus operation.
The combination
of this turnaround cycle and the requirement to drive a control line high for
one cycle before ceasing to drive it means that each of the main control lines
must be high for a minimum of two cycles when changing owners. The PCI bus
protocol is designed so this is rarely a limitation; only in a few special
cases (notably fast
back-to-back transactions) is it necessary to insert additional delay to
meet this requirement.
Arbitration
Any device on a
PCI bus that is capable of acting as a bus master
may initiate a transaction with any other device. To ensure that only one
transaction is initiated at a time, each master must first wait for a bus grant
signal, GNT#, from an arbiter located on the motherboard. Each device has a
separate request line REQ# that requests the bus, but the arbiter may
"park" the bus grant signal at any device if there are no current
requests.
The arbiter may
remove GNT# at any time. A device which loses GNT# may complete its current
transaction, but may not start one (by asserting FRAME#) unless it observes
GNT# asserted the cycle before it begins.
The arbiter may
also provide GNT# at any time, including during another master's transaction.
During a transaction, either FRAME# or IRDY# or both are asserted; when both
are deasserted, the bus is idle. A device may initiate a transaction at any
time that GNT# is asserted and the bus is idle.
Address phase
A PCI bus
transaction begins with an address phase. The initiator, seeing that it
has GNT# and the bus is idle, drives the target address onto the AD[31:0]
lines, the associated command (e.g. memory read, or I/O write) on the
C/BE[3:0]# lines, and pulls FRAME# low.
Each other
device examines the address and command and decides whether to respond as the
target by asserting DEVSEL#. A device must respond by asserting DEVSEL# within
3 cycles. Devices which promise to respond within 1 or 2 cycles are said to
have "fast DEVSEL" or "medium DEVSEL", respectively.
(Actually, the time to respond is 2.5 cycles, since PCI devices must transmit
all signals half a cycle early so that they can be received three cycles
later.)
Note that a
device must latch the address on the first cycle; the
initiator is required to remove the address and command from the bus on the
following cycle, even before receiving a DEVSEL# response. The additional time
is available only for interpreting the address and command after it is
captured.
On the fifth
cycle of the address phase (or earlier if all other devices have medium DEVSEL
or faster), a catch-all "subtractive decoding" is allowed for some
address ranges. This is commonly used by an ISA bus bridge
for addresses within its range (24 bits for memory and 16 bits for I/O).
On the sixth
cycle, if there has been no response, the initiator may abort the transaction
by deasserting FRAME#. This is known as master abort termination and it
is customary for PCI bus bridges to return all-ones data (0xFFFFFFFF) in this
case. PCI devices therefore are generally designed to avoid using the all-ones
value in important status registers, so that such an error can be easily
detected by software.
Address phase timing
[Timing diagram: the address phase over clocks 0-5. GNT# is asserted before clock 0 (and is irrelevant after the cycle has started); FRAME# is asserted at clock 1; AD[31:0] carries the address for one cycle only; C/BE[3:0]# carries the command, then the first data phase byte enables; DEVSEL# may be asserted with fast, medium, slow, or subtractive timing.]
On the rising
edge of clock 0, the initiator observes FRAME# and IRDY# both high, and GNT#
low, so it drives the address, command, and asserts FRAME# in time for the
rising edge of clock 1. Targets latch the address and begin decoding it. They
may respond with DEVSEL# in time for clock 2 (fast DEVSEL), 3 (medium) or 4
(slow). Subtractive decode devices, seeing no other response by clock 4, may
respond on clock 5. If the master does not see a response by clock 5, it will
terminate the transaction and remove FRAME# on clock 6.
TRDY# and STOP#
are deasserted (high) during the address phase. The initiator may assert IRDY#
as soon as it is ready to transfer data, which could theoretically be as soon
as clock 2.
Dual-cycle address
To allow 64-bit
addressing, a master will present the address over two consecutive cycles.
First, it sends the low-order address bits with a special "dual-cycle
address" command on the C/BE[3:0]#. On the following cycle, it sends the
high-order address bits and the actual command. Dual-address cycles are
forbidden if the high-order address bits are zero, so devices which do not
support 64-bit addressing can simply not respond to dual cycle commands.
[Timing diagram: a dual address cycle over clocks 0-6. AD[31:0] carries the low, then the high address bits on consecutive cycles; C/BE[3:0]# carries the dual-address-cycle command, then the actual command; DEVSEL# may follow with fast, medium, or slow timing.]
Configuration access
Addresses for
PCI configuration space access are decoded specially. For these, the low-order
address lines specify the offset of the desired PCI configuration register, and
the high-order address lines are ignored. Instead, an additional address
signal, the IDSEL input, must be high before a device may assert DEVSEL#. Each
slot connects a different high-order address line to the IDSEL pin, and is
selected using one-hot
encoding on the upper address lines.
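To make "one-hot encoding on the upper address lines" concrete, here is a sketch of how a host bridge might form the AD[31:0] value for such a configuration access. The low-order layout (function number in AD[10:8], register offset in AD[7:2], AD[1:0] = 00) and the use of AD[31:16] for the slot selects reflect common practice and are assumptions of this example rather than statements from the text above.

#include <stdint.h>

/* Illustrative Type 0 configuration address, as it might appear on the
 * AD[31:0] lines during the address phase.  Exactly one upper line is
 * driven high: the one physically wired to the target slot's IDSEL input. */
static uint32_t type0_config_address(unsigned slot, unsigned function, unsigned reg_offset)
{
    uint32_t ad = 0;

    ad |= 1u << (16 + slot);          /* one-hot device select for this slot */
    ad |= (function & 0x7u) << 8;     /* function number within the device   */
    ad |= (reg_offset & 0xFCu);       /* register offset, 32-bit aligned     */
    /* AD[1:0] remain 00, marking this as a Type 0 configuration access. */

    return ad;
}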
Data phases
After the
address phase (specifically, beginning with the cycle that DEVSEL# goes low)
comes a burst of one or more data phases. In all cases, the initiator
drives active-low byte select signals on the C/BE[3:0]# lines, but the data on
the AD[31:0] may be driven by the initiator (in case of writes) or target (in
case of reads).
During data
phases, the C/BE[3:0]# lines are interpreted as active-low byte enables.
In case of a write, the asserted signals indicate which of the four bytes on
the AD bus are to be written to the addressed location. In the case of a read,
they indicate which bytes the initiator is interested in. For reads, it is
always legal to ignore the byte enable signals and simply return all 32 bits; cacheable
memory resources are required to always return 32 valid bits. The byte enables
are mainly useful for I/O space accesses where reads have side effects.
A data phase
with all four C/BE# lines deasserted is explicitly permitted by the PCI
standard, and must have no effect on the target (other than to advance the
address in the burst access in progress).
The data phase
continues until both parties are ready to complete the transfer and continue to
the next data phase. The initiator asserts IRDY# (initiator ready) when
it no longer needs to wait, while the target asserts TRDY# (target ready).
Whichever side is providing the data must drive it on the AD bus before
asserting its ready signal.
Once one of the
participants asserts its ready signal, it may not become un-ready or otherwise
alter its control signals until the end of the data phase. The data recipient
must latch the AD bus each cycle until it sees both IRDY# and TRDY# asserted,
which marks the end of the current data phase and indicates that the just-latched
data is the word to be transferred.
To maintain
full burst speed, the data sender then has half a clock cycle after seeing both
IRDY# and TRDY# asserted to drive the next word onto the AD bus.
[Timing diagram: data phases over clocks 0-9, showing AD[31:0] for a write and for a read, C/BE[3:0]# (which must always be valid), IRDY#, TRDY#, DEVSEL#, and FRAME#. Data is transferred on each rising clock edge where IRDY# and TRDY# are both asserted.]
This continues
the address cycle illustrated above, assuming a single address cycle with
medium DEVSEL, so the target responds in time for clock 3. However, at that
time, neither side is ready to transfer data. For clock 4, the initiator is
ready, but the target is not. On clock 5, both are ready, and a data transfer
takes place. For clock 6, the target is
ready to transfer, but the initiator is not. On clock 7, the initiator becomes
ready, and data is transferred. For clocks 8 and 9, both sides remain ready to
transfer data, and data is transferred at the maximum possible rate (32 bits
per clock cycle).
In case of a
read, clock 2 is reserved for turning around the AD bus, so the target is not
permitted to drive data on the bus even if it is capable of fast DEVSEL.
Fast DEVSEL# on reads
A target that
supports fast DEVSEL could in theory begin responding to a read the cycle after
the address is presented. This cycle is, however, reserved for AD bus
turnaround. Thus, a target may not drive the AD bus (and thus may not assert
TRDY#) on the second cycle of a transaction. Note that most targets will not be
this fast and will not need any special logic to enforce this condition.
Ending transactions
Either side may
request that a burst end after the current data phase. Simple PCI devices that
do not support multi-word bursts will always request this immediately. Even
devices that do support bursts will have some limit on the maximum length they
can support, such as the end of their addressable memory.
Initiator burst termination
The initiator
can mark any data phase as the final one in a transaction by deasserting FRAME#
at the same time as it asserts IRDY#. The cycle after the target asserts TRDY#,
the final data transfer is complete, both sides deassert their respective RDY#
signals, and the bus is idle again. The master may not deassert FRAME# before
asserting IRDY#, nor may it deassert FRAME# while waiting, with IRDY# asserted,
for the target to assert TRDY#.
The only minor
exception is a master abort termination, when no target responds with
DEVSEL#. Obviously, it is pointless to wait for TRDY# in such a case. However,
even in this case, the master must assert IRDY# for at least one cycle after
deasserting FRAME#. (Commonly, a master will assert IRDY# before receiving DEVSEL#,
so it must simply hold IRDY# asserted for one cycle longer.) This is to ensure
that bus turnaround timing rules are obeyed on the FRAME# line.
Target burst termination
The target
requests the initiator end a burst by asserting STOP#. The initiator will then
end the transaction by deasserting FRAME# at the next legal opportunity; if it
wishes to transfer more data, it will continue in a separate transaction. There
are several ways for the target to do this:
Disconnect with
data
If the target asserts STOP# and TRDY#
at the same time, this indicates that the target wishes this to be the last
data phase. For example, a target that does not support burst transfers will
always do this to force single-word PCI transactions. This is the most
efficient way for a target to end a burst.
Disconnect
without data
If the target asserts STOP# without
asserting TRDY#, this indicates that the target wishes to stop without
transferring data. STOP# is considered equivalent to TRDY# for the purpose of
ending a data phase, but no data is transferred.
Retry
A Disconnect without data before
transferring any data is a retry, and unlike other PCI transactions, PCI
initiators are required to pause slightly before continuing the operation. See
the PCI specification for details.
Target abort
Normally, a target holds DEVSEL#
asserted through the last data phase. However, if a target deasserts DEVSEL#
before disconnecting without data (asserting STOP#), this indicates a target
abort, which is a fatal error condition. The initiator may not retry, and
typically treats it as a bus error. Note that a target may not deassert DEVSEL#
while waiting with TRDY# or STOP# low; it must do this at the beginning of a
data phase.
There will
always be at least one more cycle after a target-initiated disconnection, to
allow the master to deassert FRAME#. There are two sub-cases, which take the
same amount of time, but one requires an additional data phase:
Disconnect-A
If the initiator observes STOP# before
asserting its own IRDY#, then it can end the burst by deasserting FRAME# at the
end of the current data phase.
Disconnect-B
If the initiator has already asserted
IRDY# (without deasserting FRAME#) by the time it observes the target's STOP#,
it is already committed to an additional data phase. The target must wait
through an additional data phase, holding STOP# asserted without TRDY#, before
the transaction can end.
If the
initiator ends the burst at the same time as the target requests disconnection,
there is no additional bus cycle.
Burst addressing
For memory
space accesses, the words in a burst may be accessed in several orders. The
unnecessary low-order address bits AD[1:0] are used to convey the initiator's
requested order. A target which does not support a particular order must
terminate the burst after the first word. Some of these orders depend on the
cache line size, which is configurable on all PCI devices.
PCI burst ordering
A[1] | A[0] | Burst order (with 16-byte cache line)
0 | 0 | Linear incrementing (0x0C, 0x10, 0x14, 0x18, 0x1C, ...)
0 | 1 | Cacheline toggle (0x0C, 0x08, 0x04, 0x00, 0x1C, 0x18, ...)
1 | 0 | Cacheline wrap (0x0C, 0x00, 0x04, 0x08, 0x1C, 0x10, ...)
1 | 1 | Reserved (disconnect after first transfer)
If the starting
offset within the cache line is zero, all of these modes reduce to the same
order.
Cache line
toggle and cache line wrap modes are two forms of critical-word-first cache
line fetching. Toggle mode XORs the supplied address with an incrementing counter.
This is the native order for Intel 486 and Pentium processors. It has the
advantage that it is not necessary to know the cache line size to implement it.
PCI version 2.1
obsoleted toggle mode and added the cache line wrap mode,[1]
where fetching proceeds linearly, wrapping around at the end of each cache
line. When one cache line is completely fetched, fetching jumps to the starting
offset in the next cache line.
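The three defined orders can be generated from the starting address alone. The sketch below reproduces the example sequences in the table above for a 16-byte cache line; it is a direct transcription of the rules just described, not code from the specification.

#include <stdint.h>

/* Address of the i-th word of a burst that starts at 'start', for each of
 * the orders in the table above, with a 16-byte (4-word) cache line. */

#define LINE_BYTES 16u

static uint32_t burst_linear(uint32_t start, unsigned i)
{
    return start + 4u * i;                         /* simple incrementing */
}

static uint32_t burst_toggle(uint32_t start, unsigned i)
{
    return start ^ (4u * i);                       /* XOR with an incrementing counter */
}

static uint32_t burst_wrap(uint32_t start, unsigned i)
{
    uint32_t line   = start & ~(LINE_BYTES - 1);   /* base of the current cache line  */
    uint32_t offset = start &  (LINE_BYTES - 1);   /* starting offset within the line */

    /* Advance linearly, wrapping within the line; once a whole line has
     * been fetched, move to the same starting offset in the next line. */
    uint32_t lines_done = (4u * i) / LINE_BYTES;
    uint32_t in_line    = (offset + 4u * i) % LINE_BYTES;

    return line + lines_done * LINE_BYTES + in_line;
}

Starting at 0x0C, burst_toggle yields 0x0C, 0x08, 0x04, 0x00, 0x1C, 0x18, ... and burst_wrap yields 0x0C, 0x00, 0x04, 0x08, 0x1C, 0x10, ..., matching the example sequences in the table.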
Note that most
PCI devices only support a limited range of typical cache line sizes; if the
cache line size is programmed to an unexpected value, they force single-word
access.
PCI also
supports burst access to I/O and configuration space, but only linear mode is
supported. (This is rarely used, and may be buggy in some devices; they may not
support it, but not properly force single-word access either.)
Transaction examples
This is the
highest-possible speed four-word write burst, terminated by the master:
[Timing diagram: a four-word write burst at full speed over clocks 0-7. The address and four data words appear on AD[31:0] on consecutive clocks; IRDY#, TRDY#, and DEVSEL# are asserted together throughout the data phases, and FRAME# is deasserted before the final data phase.]
On clock edge
1, the initiator starts a transaction by driving an address and command and
asserting FRAME#. The other signals are idle, pulled high by the
motherboard's pull-up resistors; that might be their turnaround cycle. On
cycle 2, the target asserts both DEVSEL# and TRDY#. As the initiator is also
ready, a data transfer occurs. This repeats for three more cycles, but before
the last one (clock edge 5), the master deasserts FRAME#, indicating that this
is the end. On clock edge 6, the AD bus and FRAME# are undriven (turnaround
cycle) and the other control lines are driven high for 1 cycle. On clock edge
7, another initiator can start a different transaction. This is also the
turnaround cycle for the other control lines.
The equivalent
read burst takes one more cycle, because the target must wait 1 cycle for the
AD bus to turn around before it may assert TRDY#:
[Timing diagram: the equivalent four-word read burst over clocks 0-8. An extra cycle after the address phase lets the AD bus turn around before the target drives the read data; the control signals otherwise behave as in the write burst.]
A high-speed
burst terminated by the target will have an extra cycle at the end:
(Timing diagram: CLK cycles 0 to 8, showing the same signals plus STOP#, which the target asserts to end the burst.)
On clock edge
6, the target indicates that it wants to stop (with data), but the initiator is
already holding IRDY# low, so there is a fifth data phase (clock edge 7),
during which no data is transferred.
Parity
The PCI bus
detects parity errors, but does not attempt to correct them by retrying
operations; it is purely a failure indication. Because of this, there is no need to detect a parity error before the corresponding transfer has completed, and the PCI bus actually detects it a few cycles later. During a data phase, whichever device
is driving the AD[31:0] lines computes even parity over them and the C/BE[3:0]#
lines, and sends that out the PAR line one cycle later. All access rules and
turnaround cycles for the AD bus apply to the PAR line, just one cycle later.
The device listening on the AD bus checks the received parity and asserts the
PERR# (parity error) line one cycle after that. This generally generates a
processor interrupt, and the processor can search the PCI bus for the device
which detected the error.
The PERR# line
is only used during data phases, once a target has been selected. If a parity
error is detected during an address phase (or the data phase of a Special
Cycle), the devices which observe it assert the SERR# (System error) line.
Even when some
bytes are masked by the C/BE# lines and not in use, they must still have some
defined value, and this value must be used to compute the parity.
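As a rough sketch of the rule just described (illustrative only, not a reference implementation), PAR can be computed as the XOR of all 36 driven bits, so that AD[31:0], C/BE[3:0]# and PAR together carry an even number of ones:

#include <stdint.h>
#include <stdio.h>

/* Compute PAR for one address or data phase: even parity over the 32 AD
 * lines and the 4 C/BE# lines, so that AD, C/BE# and PAR together contain
 * an even number of ones.  Byte lanes masked by C/BE# still carry defined
 * values and are included, as required. */
static unsigned compute_par(uint32_t ad, uint8_t cbe)
{
    uint32_t x = ad ^ (uint32_t)(cbe & 0x0Fu);  /* parity is XOR-linear, so  */
    x ^= x >> 16;                               /* folding the combined bits */
    x ^= x >> 8;                                /* down to one bit yields    */
    x ^= x >> 4;                                /* the parity of all 36      */
    x ^= x >> 2;                                /* signals                   */
    x ^= x >> 1;
    return x & 1u;
}

int main(void)
{
    printf("PAR = %u\n", compute_par(0xDEADBEEF, 0x3));
    return 0;
}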
Fast back-to-back transactions
Due to the need
for a turnaround cycle between different devices driving PCI bus signals, in
general it is necessary to have an idle cycle between PCI bus transactions.
However, in some circumstances it is permitted to skip this idle cycle, going
directly from the final cycle of one transfer (IRDY# asserted, FRAME#
deasserted) to the first cycle of the next (FRAME# asserted, IRDY# deasserted).
An initiator
may only perform back-to-back transactions when:
- they are
by the same initiator (or there would be no time to turn around the C/BE#
and FRAME# lines),
- the first
transaction was a write (so there is no need to turn around the AD bus),
and
- the
initiator still has permission (from its GNT# input) to use the PCI bus.
Additional timing constraints may come from the need to turn around the target control lines, particularly DEVSEL#. The target deasserts DEVSEL#, driving it high, in
the cycle following the final data phase, which in the case of back-to-back
transactions is the first cycle of the address phase. The second cycle of the
address phase is then reserved for DEVSEL# turnaround, so if the target is
different from the previous one, it must not assert DEVSEL# until the third
cycle (medium DEVSEL speed).
One case where
this problem cannot arise is if the initiator knows somehow (presumably because
the addresses share sufficient high-order bits) that the second transfer is
addressed to the same target as the previous one. In that case, it may perform
back-to-back transactions. All PCI targets must support this.
It is also possible for the target to keep track of the requirements. If it never does fast DEVSEL,
they are met trivially. If it does, it must wait until medium DEVSEL time
unless:
- the
current transaction was preceded by an idle cycle (is not back-to-back),
or
- the previous
transaction was to the same target, or
- the
current transaction began with a double address cycle.
Targets which
have this capability indicate it by a special bit in a PCI configuration
register, and if all targets on a bus have it, all initiators may use
back-to-back transfers freely.
A subtractive
decoding bus bridge must know to expect this extra delay in the event of
back-to-back cycles in order to advertise back-to-back support.
64-bit PCI
This section explains only basic 64-bit
PCI; the full PCI-X
protocol extension is much more extensive.
The PCI
specification includes optional 64-bit support. This is provided via an
extended connector which provides the 64-bit bus extensions AD[63:32],
C/BE[7:4]#, and PAR64. (It also provides a number of additional power and
ground pins.)
Memory
transactions between 64-bit devices may use all 64 bits to double the data
transfer rate. Non-memory transactions (including configuration and I/O space
accesses) may not use the 64-bit extension. During a 64-bit burst, burst
addressing works just as in a 32-bit transfer, but the address is incremented
twice per data phase. The starting address must be 64-bit aligned; i.e. AD2
must be 0. The data corresponding to the intervening addresses (with AD2 = 1)
is carried on the upper half of the AD bus.
To initiate a
64-bit transaction, the initiator drives the starting address on the AD bus and
asserts REQ64# at the same time as FRAME#. If the selected target can support a
64-bit transfer for this transaction, it replies by asserting ACK64# at the
same time as DEVSEL#. Note that a target may decide on a per-transaction basis
whether to allow a 64-bit transfer.
If REQ64# is
asserted during the address phase, the initiator also drives the high 32 bits
of the address and a copy of the bus command on the high half of the bus. If
the address requires 64 bits, a dual address cycle is still required, but the
high half of the bus carries the upper half of the address and the final
command code during both address phase cycles; this allows a 64-bit target to
see the entire address and begin responding earlier.
If the
initiator sees DEVSEL# asserted without ACK64#, it performs 32-bit data phases.
The data which would have been transferred on the upper half of the bus during
the first data phase is instead transferred during the second data phase.
Typically, the initiator drives all 64 bits of data before seeing DEVSEL#. If
ACK64# is missing, it may cease driving the upper half of the data bus.
The REQ64# and
ACK64# lines are held asserted for the entire transaction save the last data
phase, and deasserted at the same time as FRAME# and DEVSEL#, respectively.
The PAR64 line
operates just like the PAR line, but provides even parity over AD[63:32] and
C/BE[7:4]#. It is only valid for address phases if REQ64# is asserted. PAR64 is
only valid for data phases if both REQ64# and ACK64# are asserted.
Cache snooping (obsolete)
PCI originally
included optional support for write-back cache
coherence. This required support by cacheable memory targets, which would
listen to two pins from the cache on the bus, SDONE (snoop done) and SBO#
(snoop backoff).
Because this
was rarely implemented in practice, it was deleted from revision 2.2 of the PCI
specification, and the pins re-used for SMBus access in
revision 2.3.
The cache would
watch all memory accesses, without asserting DEVSEL#. If it noticed an access
that might be cached, it would drive SDONE low (snoop not done). A
coherence-supporting target would avoid completing a data phase (asserting
TRDY#) until it observed SDONE high.
In the case of
a write to data that was clean in the cache, the cache would only have to
invalidate its copy, and would assert SDONE as soon as this was established.
However, if the cache contained dirty data, the cache would have to write it
back before the access could proceed, so it would assert SBO# when raising
SDONE. This would signal the active target to assert STOP# rather than TRDY#,
causing the initiator to disconnect and retry the operation later. In the
meantime, the cache would arbitrate for the bus and write its data back to
memory.
Targets
supporting cache coherency are also required to terminate bursts before they
cross cache lines.
Development tools
When developing
and/or troubleshooting the PCI bus, examination of hardware signals can be very
important. Logic analyzers and bus analyzers are tools which collect, analyze, and decode signals for users to view in useful ways.
In IBM
PC Compatible computers, the basic input/output system (BIOS), also
known as the System BIOS (pronounced /ˈbaɪ.oʊs/), is a de
facto standard defining a firmware interface.[1]
Phoenix
AwardBIOS CMOS (non-volatile memory) Setup utility on a
standard PC
The BIOS
software is built into the PC, and is the first code run by a PC
when powered on ('boot firmware'). The primary function of the BIOS is to load
and start an operating system. When the PC starts up, the first
job for the BIOS is to initialize and identify system devices such as the video display card, keyboard and mouse, hard disk, CD/DVD
drive and other hardware. The BIOS then locates software held on a peripheral
device (designated as a 'boot device'), such as a hard disk or a CD, and loads
and executes that software, giving it control of the PC.[2] This process
is known as booting, or booting up, which is short for bootstrapping.
BIOS software
is stored on a non-volatile ROM
chip built into the system on the mother board.
The BIOS software is specifically designed to work with the particular type of
system in question, including having a knowledge of the workings of various
devices that make up the complementary chipset of the system. In modern
computer systems, the BIOS chip's contents can be rewritten allowing BIOS software
to be upgraded.
A BIOS will
also have a user interface (or UI for short). Typically this is
a menu system accessed by pressing a certain key on the keyboard when the PC
starts. In the BIOS UI, a user can:
- configure
hardware
- set the
system clock
- enable or
disable system components
- select
which devices are eligible to be a potential boot device
- set
various password prompts, such as a password for securing access to the
BIOS UI functions itself and preventing malicious users from booting the
system from unauthorized peripheral devices.
The BIOS
provides a small library of basic input/output functions used to operate and
control the peripherals such as the keyboard, text display functions and so
forth, and these software library functions are callable by external software.
In the IBM PC and AT, certain peripheral cards such as hard-drive controllers
and video display adapters carried their own BIOS extension ROM,
which provided additional functionality. Operating
systems and executive software, designed to supersede this basic firmware
functionality, will provide replacement software interfaces to applications.
The role of the BIOS has changed over time; today BIOS is a legacy system, superseded by the more complex Extensible Firmware Interface (EFI), but BIOS remains in widespread use, and EFI booting is supported only by Microsoft OS products that support GPT,[3] and by Linux kernels 2.6.1 and later.[4]
BIOS is
primarily associated with the 16-bit, 32-bit, and the beginning of the 64-bit
architecture eras, while EFI is used for some newer 32-bit and 64-bit
architectures. Today BIOS is primarily used for booting a system and for video
initialization (in X.org); but otherwise is not used during the ordinary
running of a system, while in early systems (particularly in the 16-bit era),
BIOS was used for hardware access – operating systems (notably MS-DOS) would call
the BIOS rather than directly accessing the hardware. In the 32-bit era and
later, operating systems instead generally directly accessed the hardware using
their own device drivers. However, the distinction between BIOS
and EFI is rarely made in terminology by the average computer user, making BIOS
a catch-all term for both systems.
[edit] Terminology
The term first
appeared in the CP/M
operating system, describing the part of CP/M loaded during boot time that
interfaced directly with the hardware
(CP/M machines usually had only a simple boot loader
in their ROM). Most versions of DOS have a file called
"IBMBIO.COM"
or "IO.SYS"
that is analogous to the CP/M BIOS.
Among other
classes of computers, the generic terms boot
monitor, boot loader or boot ROM were
commonly used. Some Sun and PowerPC-based computers use Open
Firmware for this purpose. There are a few alternatives for Legacy BIOS in
the x86 world: Extensible Firmware Interface, Open
Firmware (used on the OLPC XO-1) and coreboot.
[edit] IBM PC-compatible
BIOS chips
In principle,
the BIOS in ROM was customized to the particular manufacturer's hardware,
allowing low-level services (such as reading a keystroke or writing a sector of
data to diskette) to be provided in a standardized way to the operating system.
For example, an IBM PC might have had either a monochrome or a color display
adapter, using different display memory addresses and hardware - but the BIOS
service to print a character on the screen in text mode would be the same.
PhoenixBIOS D686 (showing its boot block, DMI block and main block). This BIOS chip is housed in a PLCC package, which is, in turn, plugged into a PLCC socket.
Prior to the
early 1990s, BIOSes were stored in ROM
or PROM chips, which could not be
altered by users. As its complexity and need for updates grew, and
re-programmable parts became more available, BIOS firmware was most commonly
stored on EEPROM
or flash
memory devices. According to Robert Braver, the president of the BIOS
manufacturer Micro Firmware, Flash BIOS chips became common around 1995
because the electrically erasable PROM (EEPROM) chips are cheaper and easier to
program than standard erasable PROM (EPROM) chips. EPROM
chips may be erased by prolonged exposure to ultraviolet light, which reaches the chip die through a window in the package. Chip manufacturers use EPROM programmers (blasters) to program EPROM chips. Electrically erasable (EEPROM) chips have the additional advantage that the BIOS can be reprogrammed in place using higher-than-normal voltages.[5]
BIOS versions are upgraded to take advantage of newer versions of hardware and
to correct bugs in previous revisions of BIOSes.[6]
Beginning with
the IBM AT, PCs supported a hardware clock settable through BIOS. It had a
century bit which allowed for manually changing the century when the year 2000
happened. Most BIOS revisions created in 1995 and nearly all BIOS revisions in
1997 supported the
year 2000 by setting the century bit automatically when the clock rolled
past midnight, December 31, 1999.[7]
The first flash
chips were attached to the ISA bus. Starting in 1997, the BIOS
flash moved to the LPC bus, a functional replacement for ISA, following
a new standard implementation known as "firmware hub" (FWH). In 2006,
the first systems supporting a Serial Peripheral Interface (SPI)
appeared, and the BIOS flash moved again.
The size of the
BIOS, and the capacities of the ROM, EEPROM and other media it may be stored
on, has increased over time as new features have been added to the code; BIOS
versions now exist with sizes up to 16 megabytes. Some modern motherboards include even bigger NAND flash ROM ICs on board, capable of storing a whole compact operating system distribution, such as a small Linux distribution.
For example, some recent ASUS motherboards included SplashTop
Linux embedded into their NAND Flash ROM ICs.
[edit] Flashing the BIOS
In modern PCs
the BIOS is stored in rewritable memory, allowing the contents to be replaced or
'rewritten'. This rewriting of the contents is sometimes termed 'flashing'. This is done by a special program,
usually provided by the system's manufacturer. A file containing such contents
is sometimes termed 'a BIOS image'. A BIOS might be reflashed in order to
upgrade to a newer version to fix bugs or provide improved performance or to
support newer hardware, or a reflashing operation might be needed to fix a
damaged BIOS.
[edit] BIOS chip
vulnerabilities
An American Megatrends BIOS registering the “Intel CPU uCode
Error” while doing POST, most likely a problem with the POST.
EEPROM chips are
advantageous because they can be easily updated by the user; hardware
manufacturers frequently issue BIOS updates to upgrade their products, improve
compatibility and remove bugs. However, this advantage comes with the risk that an improperly executed or aborted BIOS update can render the computer or device unusable. To avoid these situations, more recent BIOSes use a "boot block": a portion of the BIOS which runs first and must be updated separately. This code verifies that the rest of the BIOS is intact (using hash
checksums or
other methods) before transferring control to it. If the boot block detects any
corruption in the main BIOS, it will typically warn the user that a recovery
process must be initiated by booting from removable
media (floppy, CD or USB memory) so the user can try flashing the BIOS
again. Some motherboards have a backup BIOS (sometimes
referred to as DualBIOS boards) to recover from BIOS corruptions.
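As a hedged illustration of the boot-block idea (real firmware uses vendor-specific image layouts and integrity checks), the decision between jumping to the main BIOS and starting recovery might look roughly like this:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative only: a trivial additive checksum over the main BIOS image,
 * compared against a value stored alongside it.  Real boot blocks use
 * vendor-specific layouts and stronger hashes. */
static uint32_t checksum(const uint8_t *img, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += img[i];
    return sum;
}

/* Returns 1 if the main BIOS may be jumped to, 0 if the boot block should
 * instead start recovery (e.g. boot from removable media and reflash). */
static int main_bios_ok(const uint8_t *img, size_t len, uint32_t stored)
{
    return checksum(img, len) == stored;
}

int main(void)
{
    uint8_t image[4] = { 1, 2, 3, 4 };
    printf("main BIOS intact: %d\n", main_bios_ok(image, sizeof image, 10));
    return 0;
}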
[edit] Overclocking
Some BIOS chips
allow overclocking,
an action in which the CPU
is adjusted to a higher clock rate than its factory preset. Overclocking may,
however, seriously compromise system reliability in insufficiently cooled
computers and generally shorten component lifespan. Overclocking, incorrectly
performed, may also cause component temperatures to rise so quickly that they destroy
themselves.
[edit] Virus attacks
There are at
least three known BIOS attack viruses, two of which were for demonstration
purposes.
[edit] CIH
The first was a
virus which was able to erase Flash ROM BIOS content, rendering computer
systems unstable. CIH, also known as "Chernobyl
Virus", appeared for the first time in mid-1998 and became active in
April 1999. It affected systems' BIOSes, and often the affected machines could not be fixed on their own since they were no longer able to boot at all. To repair this, the flash ROM IC had to be removed from the motherboard and reprogrammed elsewhere.
Damage from CIH was possible since the virus was specifically targeted at the
then widespread Intel i430TX motherboard chipset, and the most common operating
systems of the time were based on the Windows 9x
family allowing direct hardware access to all programs.
Modern systems
are not vulnerable to CIH because of a variety of chipsets being used which are
incompatible with the Intel i430TX chipset, and also other Flash ROM IC types.
There is also extra protection from accidental BIOS rewrites in the form of
boot blocks which are protected from accidental overwrite or dual and quad BIOS
equipped systems which may, in the event of a crash, use a backup BIOS. Also,
all modern operating systems like Linux, Mac OS X, Windows NT-based Windows OS like Windows
2000, Windows
XP and newer, do not allow user mode programs to have direct hardware
access. As a result, as of 2008, CIH has become essentially harmless, at worst
causing annoyance by infecting executable files and triggering alerts from
antivirus software. Other BIOS viruses remain possible, however[8]; since
Windows users without Windows Vista/7's UAC run all applications with
administrative privileges, a modern CIH-like virus could in principle still
gain access to hardware.
[edit] Black Hat 2006
The second one
was a technique presented by John Heasman, principal security consultant for UK
based Next-Generation Security Software at the Black Hat Security Conference
(2006), where he showed how to elevate privileges and read physical memory,
using malicious procedures that replaced normal ACPI functions stored in flash
memory.
[edit] Persistent BIOS
Infection
The third one,
known as "Persistent BIOS infection", was a method presented at the CanSecWest Security Conference (Vancouver, 2009) and the SyScan Security Conference
(Singapore, 2009) where researchers Anibal Sacco [9] and Alfredo
Ortega, from Core Security Technologies, demonstrated insertion of malicious
code into the decompression routines in the BIOS, allowing for nearly full
control of the PC at every start-up, even before the operating system is
booted.
The
proof-of-concept does not exploit a flaw in the BIOS implementation, but only
involves the normal BIOS flashing procedures. Thus, it requires physical access
to the machine or for the user on the operating system to be root. Despite
this, however, researchers underline the profound implications of their
discovery: “We can patch a driver to drop a fully working rootkit. We even have
a little code that can remove or disable antivirus.”[10]
[edit] Firmware on adapter
cards
A computer
system can contain several BIOS firmware chips. The motherboard BIOS typically
contains code to access hardware components absolutely necessary for
bootstrapping the system, such as the keyboard (either PS/2 or on a USB human interface device), and storage (floppy
drives, if available, and IDE or SATA hard disk controllers). In addition,
plug-in adapter cards such as SCSI, RAID,
Network interface cards, and video boards
often include their own BIOS (e.g. Video BIOS),
complementing or replacing the system BIOS code for the given component. (This
code is generally referred to as an option ROM.)
Even devices built into the motherboard can behave in this way; their option
ROMs can be stored as separate code on the main BIOS flash chip, and upgraded
either in tandem with, or separately to, the main BIOS.
An add-in card
usually only requires an option ROM if it:
- Needs to
be used before the operating system can be loaded (usually this means it
is required in the bootstrapping process), and
- Is too
sophisticated or specific a device to be handled by the main BIOS
Older PC
operating
systems, such as MS-DOS (including all DOS-based versions of Microsoft
Windows), and early-stage bootloaders, may continue to use the BIOS for input
and output. However, the restrictions of the BIOS environment means that modern
OSes will almost always use their own device
drivers to directly control the hardware. Generally, these device drivers only
use BIOS and option ROM calls for very specific (non-performance-critical)
tasks, such as preliminary device initialization.
In order to
discover memory-mapped ISA option ROMs during the boot
process, PC BIOS implementations scan real memory from 0xC0000 to 0xF0000 on 2 KiB boundaries, looking
for a ROM signature: 0xAA55 (0x55 followed by 0xAA, since the x86 architecture is little-endian).
In a valid expansion ROM, this signature is immediately followed by a single
byte indicating the number of 512-byte blocks it occupies in real memory. The option ROM's entry point follows at the next byte (offset 3), to which the BIOS immediately transfers control. At this point, the expansion
ROM code takes over, using BIOS services to register interrupt
vectors for use by post-boot applications, provide a user configuration
interface, or display diagnostic information.
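The scan itself is simple; the following illustrative sketch performs it over an in-memory copy of the C0000h-EFFFFh region (reading the real region would require OS-specific access to physical memory), using the signature and size-byte layout described above:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define REGION_BASE 0xC0000u   /* first address scanned by the BIOS       */
#define REGION_END  0xF0000u   /* scan stops before this address          */
#define STEP        0x800u     /* option ROMs start on 2 KiB boundaries   */

/* Scan a buffer holding a copy of the C0000h-EFFFFh region for option ROM
 * headers: the 0x55 0xAA signature followed by the size in 512-byte blocks.
 * Prints the physical address and size of each ROM found. */
static void scan_option_roms(const uint8_t *region, size_t len)
{
    for (size_t off = 0; off + 3 < len; off += STEP) {
        if (region[off] == 0x55 && region[off + 1] == 0xAA) {
            unsigned blocks = region[off + 2];
            printf("Option ROM at 0x%05X, %u x 512 bytes, entry at 0x%05X\n",
                   REGION_BASE + (unsigned)off,
                   blocks,
                   REGION_BASE + (unsigned)off + 3);
        }
    }
}

int main(void)
{
    /* Synthetic example: an empty region with one fake 8 KiB ROM at C8000h. */
    static uint8_t region[REGION_END - REGION_BASE];
    region[0xC8000 - REGION_BASE]     = 0x55;
    region[0xC8000 - REGION_BASE + 1] = 0xAA;
    region[0xC8000 - REGION_BASE + 2] = 16;   /* 16 x 512 = 8 KiB */
    scan_option_roms(region, sizeof region);
    return 0;
}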
There are many
methods and utilities for examining the contents of various motherboard BIOS
and expansion ROMs, such as Microsoft DEBUG or the UNIX dd.
[edit] BIOS boot specification
If the
expansion ROM wishes to change the way the system boots (such as from a network
device or a SCSI adapter for which the BIOS has no driver code), it can use the
BIOS Boot Specification (BBS) API to register its ability to do
so. Once the expansion ROMs have registered using the BBS APIs, the user can
select among the available boot options from within the BIOS's user interface.
This is why most BBS compliant PC BIOS implementations will not allow the user
to enter the BIOS's user interface until the expansion ROMs have finished
executing and registering themselves with the BBS API.[citation needed]
[edit] Changing role of the
BIOS
Some operating
systems, for example MS-DOS, rely on the BIOS to carry out most input/output tasks
within the PC.[11]
A variety of technical reasons makes it inefficient for some recent operating
systems written for 32-bit
CPUs such as Linux
and Microsoft Windows to invoke the BIOS directly.
For larger, more powerful servers and workstations using PowerPC or SPARC CPUs, several manufacturers developed the platform-independent Open Firmware (IEEE-1275), based on the Forth programming language. It is
included with Sun's SPARC computers, IBM's RS/6000 line,
and other PowerPC CHRP motherboards. Later
x86-based personal computer operating systems, like Windows NT, use their own,
native drivers which also makes it much easier to extend support to new
hardware, while the BIOS still relies on a legacy 16-bit real mode
runtime interface.
There was a
similar transition for the Apple Macintosh, where the system software
originally relied heavily on the ToolBox—a
set of drivers and other useful routines stored in ROM based on Motorola's
680x0 CPUs. These Apple ROMs were replaced by Open Firmware in the PowerPC Macintosh,
then EFI in Intel Macintosh computers.
Later BIOS took
on more complex functions, by way of interfaces such as ACPI; these functions
include power management, hot
swapping, and thermal management. To quote Linus
Torvalds, the task of BIOS is "just load the OS
and get the hell out of there". However BIOS limitations (16-bit processor
mode, only 1 MiB addressable space, PC AT hardware dependencies, etc.) were
seen as clearly unacceptable for the newer computer platforms. Extensible Firmware Interface (EFI)
is a specification which replaces the runtime interface of the legacy BIOS.
Initially written for the Itanium architecture, EFI is now available for x86 and x86-64 platforms;
the specification development is driven by The Unified
EFI Forum, an industry Special Interest Group.
Linux has
supported EFI via the elilo
and GRUB boot
loaders. The Open Source community increased their effort to develop a
replacement for proprietary BIOSes and their future incarnations with an open
sourced counterpart through the coreboot and OpenBIOS/Open Firmware projects. AMD provided product specifications for some
chipsets, and Google
is sponsoring the project. Motherboard manufacturer Tyan offers coreboot next
to the standard BIOS with their Opteron line of motherboards. MSI and Gigabyte Technology have followed suit with the
MSI K9ND MS-9282 and K9SD MS-9185 boards and the Gigabyte M57SLI-S4 model, respectively.
Some BIOSes
contain a "SLIC", a digital signature placed inside the BIOS by the
manufacturer, for example Dell. This SLIC is inserted in the ACPI table and
contains no active code. Computer manufacturers that distribute OEM versions of
Microsoft Windows and Microsoft application software can use the SLIC to
authenticate licensing to the OEM Windows Installation disk and/or system recovery
disc containing Windows software. Systems having a SLIC can be activated
with an OEM Product Key, and they verify an XML formatted OEM certificate
against the SLIC in the BIOS as a means of self-activating. If a user performs
a fresh install of Windows, they will need to have possession of both the OEM
key and the digital certificate for their SLIC in order to bypass activation;
in practice this is extremely unlikely and hence the only real way this can be
achieved is if the user performs a restore using a pre-customised image
provided by the OEM.
Recent Intel processors (P6
and P7) have reprogrammable microcode. The BIOS may contain patches to the processor
code to allow errors in the initial processor code to be fixed, updating the
processor microcode each time the system is powered up. Otherwise, an expensive
processor swap would be required.[12] For
example, the Pentium FDIV bug became an expensive fiasco for
Intel that required a product recall because the original Pentium did not
have patchable microcode.
[edit] The BIOS business
The vast
majority of PC motherboard suppliers license a BIOS "core" and
toolkit from a commercial third-party, known as an "independent BIOS
vendor" or IBV. The motherboard manufacturer then customizes this BIOS to
suit its own hardware. For this reason, updated BIOSes are normally obtained
directly from the motherboard manufacturer.
Major BIOS
vendors include American Megatrends (AMI), Insyde
Software, Phoenix Technologies and Byosoft.
Former vendors include Award Software and Microid
Research which were acquired by Phoenix Technologies in 1998. Phoenix has now
phased out the Award Brand name. General
Software, which was also acquired by Phoenix in 2007, sold BIOS for
Intel processor based embedded systems.
Monitors have evolved
from simple monochrome displays to today’s high-resolution Super VGA. Monitors
vary in price because they also vary in quality. A monitor, no matter what its
size, is only capable of producing so many colors and resolutions.
How a monitor works
How do all those
colors get to the monitor? The back of the monitor’s
screen, which is called a cathode ray tube (CRT), is coated with phosphors. Aimed
at the CRT is an electron gun. As the video card sends a signal
to the electron gun, the gun shoots electrons at the CRT, causing the phosphors
on the CRT to glow.
The gun fires
constantly at the CRT from left to right and from top to bottom. The glow of
three phosphors, red, green and blue, creates 1 pixel. The combinations of lit
pixels create a pattern recognizable to the human eye, and the speed at which
the images on the screen change presents the illusion of movement and a flow of
colors onscreen.
At some point, you may
have run across a monitor’s refresh rate, whether you read the monitor’s manual
or saw it on the monitor’s specifications label. A monitor’s refresh rate refers
to how often the electron gun is capable of redrawing the screen. Another term
often included with the refresh rate is dot pitch. A monitor’s dot pitch
is simply the distance (in millimeters) between two dots of the same color on a
monitor. For example, an average monitor would have a dot pitch of .28
millimeters. An above average monitor may offer dot pitch as tight as .24 mm.
Monochrome video
The first monitor type
for PCs was a monochrome. As its name implies, with monochrome you got one
color (well two if you count the black screen behind the colored text). Early
monochrome monitors could display only text—no graphics— at a simple resolution
of 720 × 350, which was fine for characters. The Hercules Graphics card
followed suit (1982) with the same resolution, but offered the ability to
display graphics. Graphics could be displayed because the card used a library of characters for text mode and a separate, more memory-intensive mode for drawing graphics. To the
user, the switch between these two modes was invisible.
Color video
Most models of
monitors available today are at least VGA. VGA (which stands for Video Graphics Array) uses an analog signal from the video card to the monitor, which controls the flow and depth of colors far better than pre-VGA models allowed. Super VGA-capable monitors can display higher resolutions and richer
colors. The amount of RAM on the video card determines the number of colors the
monitor can display. Some monitors and video cards promise True Color, which
allows for up to 16 million colors.
IBM introduced its
first color monitors (1981) with the Color Graphics Adapter (CGA), which had
the ability to use four colors at a pathetic resolution of only 320 × 200. The
card could be switched to two colors, which would result in a slightly higher
resolution.
In 1985, IBM
introduced the Enhanced Graphics Adapter (EGA) that could display 16 colors at
a resolution of 320 × 200 or 640 × 350.
IBM later introduced
(1987) an adapter capable of even higher resolution with its Video Graphics Array (VGA). The original VGA card had 256K of memory and the ability to display 16 colors at 640 × 480 or 256 colors at 320 × 200. As you can see, using more colors results in a lower resolution. This card is the bare minimum for today’s monitors and video cards. VGA uses analog and
allows users to select from over 260,000 shades of colors.
Video Electronics
Standard Association
As you can tell from
reading the preceding sections, IBM was, for the most part, in control of the
standards for color video adapters and monitors. The Video Electronics Standard
Association (VESA) is a collection of manufacturers that later set out to
improve on IBM’s video technologies. The result was the Super VGA video card.
While it’s not the most creatively named card, it is, well, super (at least in
comparison to its predecessor VGA). SVGA can support:
- 256 colors at a resolution of 800 × 600
- 16 colors at 1,024 × 768
- 65,536 colors at 640 × 480
DMA Channels
Direct memory access
Direct memory
access (DMA)
is a feature of modern computers and microprocessors
that allows certain hardware subsystems within the computer to access system memory
for reading and/or writing independently of the central processing unit. Many hardware
systems use DMA including disk drive controllers, graphics
cards, network cards and sound cards.
DMA is also used for intra-chip data transfer in multi-core
processors, especially in multiprocessor system-on-chips, where its processing element is
equipped with a local memory (often called scratchpad
memory) and DMA is used for transferring data between the local memory and
the main memory. Computers that have DMA channels can transfer data to and from
devices with much less CPU overhead than computers without a DMA
channel. Similarly, a processing element inside a multi-core processor can transfer data to and from its local memory without occupying its processor time, allowing computation and data transfer to proceed concurrently.
Without DMA,
using programmed input/output (PIO) mode for
communication with peripheral devices, or load/store instructions in the case
of multicore chips, the CPU is typically fully occupied for the entire duration
of the read or write operation, and is thus unavailable to perform other work.
With DMA, the CPU would initiate the transfer, do other operations while the
transfer is in progress, and receive an interrupt from the DMA controller once
the operation has been done. This is especially useful in real-time computing applications where not
stalling behind concurrent operations is critical. Another and related
application area is various forms of stream
processing where it is essential to have data processing and transfer in
parallel, in order to achieve sufficient throughput.
[edit] Principle
DMA is an
essential feature of all modern computers, as it allows devices to transfer
data without subjecting the CPU to a heavy overhead. Otherwise, the CPU would
have to copy each piece of data from the source to the destination, making
itself unavailable for other tasks. This situation is aggravated because access
to I/O devices over a peripheral bus is generally slower than normal system
RAM. With DMA, the CPU gets freed from this overhead and can do useful tasks
during data transfer (though the CPU bus would be partly blocked by DMA). In
the same way, a DMA engine in an embedded processor allows its processing
element to issue a data transfer and carries on its own task while the data
transfer is being performed.
A DMA transfer
copies a block of memory from one device to another. While the CPU initiates
the transfer by issuing a DMA command, it does not execute it. For so-called
"third party" DMA, as is normally used with the ISA bus, the transfer is performed
by a DMA controller which is typically part of the motherboard chipset. More
advanced bus designs such as PCI typically use bus
mastering DMA, where the device takes control of the bus and performs the
transfer itself. In an embedded processor or multiprocessor
system-on-chip, it is a DMA engine connected to the on-chip bus that
actually administers the transfer of the data, in coordination with the flow
control mechanisms of the on-chip bus.
A typical usage
of DMA is copying a block of memory from system RAM to a buffer on the device
or vice versa. Such an operation usually does not stall the processor, which as
a result can be scheduled to perform other tasks unless those tasks include a
read from or write to memory. DMA is essential to high performance embedded
systems. It is also essential in providing so-called zero-copy
implementations of peripheral device
drivers as well as functionalities such as network packet
routing, audio playback and streaming
video. Multicore embedded processors (in the form of multiprocessor
system-on-chip) often use one or more DMA engines in combination with scratchpad
memories for both increased efficiency and lower power consumption. In computer
clusters for high-performance computing, DMA among
multiple computing nodes is often used under the name of remote DMA. Two control signals are used to request and acknowledge a DMA transfer in a microprocessor-based system: the HOLD pin is used to request a DMA action, and the HLDA pin is an output that acknowledges the DMA action.
[edit] Cache coherency problem
DMA can lead to
cache
coherency problems. Imagine a CPU equipped with a cache and an external
memory that can be accessed directly by devices using DMA. When the CPU
accesses location X in the memory, the current value will be stored in the
cache. Subsequent operations on X will update the cached copy of X, but not the
external memory version of X. If the cache is not flushed to the memory before
the next time a device tries to access X, the device will receive a stale value
of X.
Similarly, if
the cached copy of X is not invalidated when a device writes a new value to the
memory, then the CPU will operate on a stale value of X.
This issue can
be addressed in one of two ways in system design: Cache-coherent systems
implement a method in hardware whereby external writes are signaled to the
cache controller which then performs a cache invalidation for DMA writes or cache flush
for DMA reads. Non-coherent systems leave this to software, where the OS must
then ensure that the cache lines are flushed before an outgoing DMA transfer is
started and invalidated before a memory range affected by an incoming DMA
transfer is accessed. The OS must make sure that the memory range is not
accessed by any running threads in the meantime. The latter approach introduces
some overhead to the DMA operation, as most hardware requires a loop to
invalidate each cache line individually.
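As an illustration of the software-managed approach on a non-coherent system, a driver might bracket a transfer roughly as follows; cache_flush_range, cache_invalidate_range and start_dma_and_wait are placeholders for whatever platform-specific primitives a real OS provides:

#include <stddef.h>
#include <stdint.h>

/* Placeholders for platform-specific cache maintenance; a real OS provides
 * these as architecture routines or instructions (clean/invalidate by line). */
static void cache_flush_range(const void *addr, size_t len)      { (void)addr; (void)len; }
static void cache_invalidate_range(const void *addr, size_t len) { (void)addr; (void)len; }

/* Placeholder for programming the device's DMA engine and sleeping until
 * its completion interrupt arrives. */
static void start_dma_and_wait(uintptr_t bus_addr, size_t len, int to_device)
{ (void)bus_addr; (void)len; (void)to_device; }

/* Outgoing transfer (memory -> device): flush first so the device reads
 * current data from RAM rather than whatever was last written back. */
void dma_to_device(const void *buf, uintptr_t bus_addr, size_t len)
{
    cache_flush_range(buf, len);
    start_dma_and_wait(bus_addr, len, 1);
}

/* Incoming transfer (device -> memory): invalidate so later CPU reads come
 * from RAM, not stale cached lines; no thread may touch the buffer while
 * the transfer is in flight. */
void dma_from_device(void *buf, uintptr_t bus_addr, size_t len)
{
    cache_invalidate_range(buf, len);   /* avoid dirty write-backs over DMA data */
    start_dma_and_wait(bus_addr, len, 0);
    cache_invalidate_range(buf, len);   /* guard against speculative refills     */
}

int main(void)
{
    uint8_t buffer[256];
    /* In this skeleton the bus address is just a stand-in for the real,
     * translated address a driver would obtain from the OS. */
    dma_to_device(buffer, (uintptr_t)buffer, sizeof buffer);
    dma_from_device(buffer, (uintptr_t)buffer, sizeof buffer);
    return 0;
}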
Hybrids also
exist, where the secondary L2 cache is coherent while the L1 cache (typically
on-CPU) is managed by software.
[edit] DMA engine
In addition to
hardware interaction, DMA can also be used to offload expensive memory
operations, such as large copies or scatter-gather
operations, from the CPU to a dedicated DMA engine. Intel includes such engines
on high-end servers, called I/O Acceleration Technology (IOAT).
[edit] Examples
[edit] ISA
For example, a PC's ISA DMA controller is based on the Intel 8237 multimode DMA controller; modern chipsets either contain or emulate this part. In the original IBM PC, there was
only one DMA controller capable of providing four DMA channels (numbered 0-3).
These DMA channels performed 8-bit transfers and could only address the first
megabyte of RAM. With the IBM PC/AT, a second 8237 DMA controller was added
(channels 5-7; channel 4 is unusable), and the page register was rewired to
address the full 16 MB memory address space of the 80286 CPU. This second
controller performed 16-bit transfers.
Due to their
lagging performance (2.5 Mbit/s[1]),
these devices have been largely obsolete since the advent of the 80386 processor and
its capacity for 32-bit transfers. They are still supported to the extent they
are required to support built-in legacy PC hardware on modern machines. The
only pieces of legacy hardware that use ISA DMA and are still fairly common are
the built-in Floppy disk controllers of many PC mainboards and those
IEEE 1284
parallel ports that support the fast ECP mode.
Each DMA
channel has a 16-bit address register and a 16-bit count register associated
with it. To initiate a data transfer the device driver sets up the DMA
channel's address and count registers together with the direction of the data
transfer, read or write. It then instructs the DMA hardware to begin the
transfer. When the transfer is complete, the device interrupts the CPU.
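The following is a hedged sketch of that driver sequence for the legacy 8237 (channel 2, the floppy channel, using the conventional PC I/O port numbers), assuming an x86 environment with I/O privilege and a buffer that lies below 16 MB and does not cross a 64 KB boundary:

#include <stdint.h>

/* x86 port output; calling this requires I/O privilege (e.g. ioperm/iopl
 * on Linux).  Defined here so the sketch is self-contained. */
static inline void outb(uint8_t value, uint16_t port)
{
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}

/* Conventional I/O ports of the first 8237 controller (channels 0-3). */
#define DMA_MASK_REG  0x0A   /* single-channel mask register      */
#define DMA_MODE_REG  0x0B   /* mode register                     */
#define DMA_CLEAR_FF  0x0C   /* clear the byte-pointer flip-flop  */
#define DMA2_ADDR     0x04   /* channel 2 address register        */
#define DMA2_COUNT    0x05   /* channel 2 count register          */
#define DMA2_PAGE     0x81   /* channel 2 page (A16-A23) register */

/* Program channel 2 for one single-mode transfer of len bytes at physical
 * address phys.  Mode 0x46 = single transfer, device-to-memory, channel 2;
 * 0x4A is the memory-to-device equivalent. */
void isa_dma2_setup(uint32_t phys, uint16_t len, int to_memory)
{
    uint16_t count = len - 1;              /* the 8237 counts N-1 transfers */

    outb(0x06, DMA_MASK_REG);              /* mask (disable) channel 2      */
    outb(0x00, DMA_CLEAR_FF);              /* reset the address flip-flop   */
    outb(to_memory ? 0x46 : 0x4A, DMA_MODE_REG);

    outb(phys & 0xFF, DMA2_ADDR);          /* address low byte, then high   */
    outb((phys >> 8) & 0xFF, DMA2_ADDR);
    outb((phys >> 16) & 0xFF, DMA2_PAGE);  /* 64 KB page                    */

    outb(count & 0xFF, DMA2_COUNT);        /* count low byte, then high     */
    outb((count >> 8) & 0xFF, DMA2_COUNT);

    outb(0x02, DMA_MASK_REG);              /* unmask (enable) channel 2     */
}

On a real machine the floppy or other device driver would call something like this immediately before issuing the corresponding command to the peripheral; the device then interrupts the CPU when the transfer is complete, as described above.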
Scatter-gather
DMA allows the transfer of data to and from multiple memory areas in a single
DMA transaction. It is equivalent to the chaining together of multiple simple
DMA requests. The motivation is to off-load multiple input/output
interrupt and data copy tasks from the CPU.
DRQ stands for DMA
request; DACK for DMA acknowledge. These symbols, seen on hardware schematics of
computer systems with DMA functionality, represent electronic signaling lines
between the CPU and DMA controller. Each DMA channel has one Request and one
Acknowledge line. A properly configured device that uses DMA must be jumpered (or
software-configured) to use both lines of the assigned DMA channel.
Standard ISA DMA assignments:
0  DRAM refresh (obsolete)
1  User hardware
2  Floppy disk controller
3  Hard disk (obsoleted by PIO modes, and replaced by UDMA modes)
4  Cascade from XT DMA controller
5  Hard disk (PS/2 only), user hardware for all others
6  User hardware
7  User hardware
[edit] PCI
As mentioned
above, a PCI architecture has no central
DMA controller, unlike ISA. Instead, any PCI component can request control of
the bus ("become the bus master") and request to read from and write to
system memory. More precisely, a PCI component requests bus ownership from the
PCI bus controller (usually the southbridge in a modern PC design), which
will arbitrate if several devices request bus
ownership simultaneously, since there can only be one bus master at one time.
When the component is granted ownership, it will issue normal read and write
commands on the PCI bus, which will be claimed by the bus controller and
forwarded to the memory controller using a scheme which is specific to every
chipset.
As an example,
on a modern AMD Socket AM2-based
PC, the southbridge will forward the transactions to the northbridge (which is integrated on the CPU
die) using HyperTransport, which will in turn convert them to DDR2 operations and
send them out on the DDR2 memory bus. As can be seen, there are quite a number
of steps involved in a PCI DMA transfer; however, that poses little problem, since
the PCI device or PCI bus itself is an order of magnitude slower than the rest of the components (see list of device bandwidths).
A modern x86
CPU may use more than 4 GB of memory, utilizing PAE, a 36-bit addressing mode, or the
native 64-bit mode of x86-64 CPUs. In such a case, a device using DMA with a 32-bit
address bus is unable to address memory above the 4 GB line. The new Double
Address Cycle (DAC) mechanism, if implemented on both the PCI bus
and the device itself,[2]
enables 64-bit DMA addressing. Otherwise, the operating system would need to
work around the problem by either using costly double
buffers (Windows nomenclature) also known as bounce
buffers (Linux), or it could use an IOMMU to provide
address translation services if one is present.
[edit] IO Accelerator in Xeon
As an example
of DMA engine incorporated in a general-purpose CPU, newer Intel Xeon chipsets include a
DMA engine technology called I/O Acceleration Technology (I/OAT),
meant to improve network performance on high-throughput network interfaces, in
particular gigabit Ethernet and faster.[3]
However, various benchmarks with this approach by Intel's Linux
kernel developer Andrew Grover indicate no more than 10% improvement in CPU
utilization with receiving workloads, and no improvement when transmitting
data.[4]
[edit] AHB
In systems-on-a-chip
and embedded systems, typical system bus infrastructure
is a complex on-chip bus such as AMBA High-performance Bus.
AMBA defines two kinds of AHB components: master and slave. A slave interface
is similar to programmed I/O through which the software (running on embedded
CPU, e.g. ARM) can write/read I/O registers or (less
commonly) local memory blocks inside the device. A master interface can be used
by the device to perform DMA transactions to/from system memory without heavily
loading the CPU.
Therefore high
bandwidth devices such as network controllers that need to transfer huge
amounts of data to/from system memory will have two interface adapters to the
AHB bus: a master and a slave interface. This is because on-chip buses like AHB
do not support tri-stating the bus or alternating the direction
of any line on the bus. Like PCI, no central DMA controller is required since
the DMA is bus-mastering, but an arbiter is required in case multiple masters are present on the system.
Internally, a
multichannel DMA engine is usually present in the device to perform multiple
concurrent scatter-gather operations as programmed by the
software.
[edit] Cell
As an example
usage of DMA in a multiprocessor-system-on-chip,
IBM/Sony/Toshiba's Cell processor incorporates a DMA engine for
each of its 9 processing elements including one Power processor element (PPE)
and eight synergistic processor elements (SPEs). Since the SPE's load/store
instructions can read/write only its own local memory, an SPE entirely depends
on DMAs to transfer data to and from the main memory and local memories of
other SPEs. Thus the DMA acts as a primary means of data transfer among cores
inside this CPU (in contrast to cache-coherent CMP
architectures such as Intel's coming general-purpose GPU, Larrabee).
DMA in Cell is
fully cache
coherent (note however local stores of SPEs operated upon by DMA do not act
as globally coherent cache in the standard sense).
In both read ("get") and write ("put"), a DMA command can
transfer either a single block area of size up to 16KB, or a list of 2 to 2048
such blocks. The DMA command is issued by specifying a pair of a local address
and a remote address: for example when a SPE program issues a put DMA command,
it specifies an address of its own local memory as the source and a virtual
memory address (pointing to either the main memory or the local memory of
another SPE) as the target, together with a block size. According to a recent
experiment, an effective peak performance of DMA in Cell (3 GHz, under uniform
traffic) reaches 200GB per second.[5]
I/O Slots
All motherboards have one or more
system I/O buses, that are used to expand the computer's capabilities. The slots
in the back of the machine are where expansion cards are placed (like your video card, sound
card, network card, etc.). These slots allow you to expand the capabilities of
your machine in many different ways, and the proliferation of both general
purpose and very specific expansion cards is part of the success story of the PC platform.
Most modern PCs have two different types of bus slots. The first is the
standard ISA (Industry Standard Architecture) slot; most PCs have 3 or 4 of
these. These slots have two connected sections and start about a half-inch from
the back of the motherboard, extending to around its middle. This is the oldest
(and slowest) bus type and is used for cards that don't require a lot of speed:
for example, sound cards and modems. Older systems (generally made well before
1990) may have ISA slots with only a single connector piece on each; these are
8-bit ISA slots and will (of course) only support 8-bit ISA cards.
Pentium systems and newer 486-class motherboards also have PCI (Peripheral
Component Interconnect) bus slots, again, usually 3 or 4. They are
distinguished from ISA slots in two ways. First, they are shorter, and second,
they are offset from the back edge of the motherboard by about an inch. PCI is a high-speed
bus used for devices like video cards, hard disk controllers, and
high-speed network cards.
Note: Newer PCI
motherboards have the connectors for the hard disks coming directly from the
motherboard. These connectors are part of the PCI bus, even though the hard
disks aren't connected to a physical PCI slot.
The newest PCs add another, new connector to the motherboard: an Accelerated
Graphics Port slot. AGP is not really a bus, but is a single-device port
used for high-performance graphics. The AGP slot looks similar to a PCI slot,
except that it is offset further from the back edge of the motherboard.
Older 486 systems use VESA Local Bus, or VLB slots instead of PCI to connect
high-speed devices. This is an older bus which began to be abandoned in favor
of PCI at around the time the Pentium was introduced. VLB slots look like ISA
slots, only they add third and fourth sections beyond the first two. This makes
their connectors very long, and for that reason VLB cards are notoriously
difficult to insert into or remove from the motherboard. Care must be exercised
to avoid damage.
Some motherboards incorporate a so-called "shared" ISA and PCI
slot. This name implies a single slot that can take either type of card, but
that isn't possible because the two slot types are physically incompatible. In
order to save space while maximizing the number of expansion slots, some
designers put an ISA slot on the board right next to a PCI slot; you then have
the choice to use either the ISA or the PCI slot, but not both. This design is
possible because ISA cards mount on the left-hand side of a slot position,
while PCI slots mount on the right-hand side.
Video Modes
Text and Graphical Modes
With the exception of the very earliest cards used on old PCs in the early
to mid 80s, all video cards are able to display information in either text or
graphical modes. In a text mode, video information is stored as characters in a
character set; usually on PCs this is the ASCII character set. A typical PC
text screen has 25 rows and 80 columns. The video card has built into it a
definition of what the dot shape is for each character, which it uses to
display the contents of the screen. You cannot access the individual dots that
make up the letter "M" on the screen. This is similar to how fonts
work in a dot matrix printer; when you type the letter "M", the
letter is stored as one or two bytes in the file (the extra byte is often for
attribute information such as color, underlining etc.). When you go to print
the "M" it is translated to a pattern of dots by the printer.
Graphical modes are of course totally different; here the dots on the screen
are manipulated directly, so both text and images are possible. The conversion
of letters, numbers etc. to visible images is done by software. This is the
concept behind fonts; open the same file and display it under a different font
and the appearance is totally different. Graphical modes allow for much more
flexibility in terms of what is displayed on the screen, but at a cost: they
require much more information to be manipulated, and also much more memory to
hold the screen image. The increase is significant: typically a factor of up to
100 times or more! This has led almost directly to the need for increased
hardware power in newer PCs.
Most PCs use both text and graphical modes, and can be switched between them
under software control. While most computing is now done in a graphics mode,
DOS is still text-based. PCs also generally boot up in a text or text-emulated
mode.
The first PCs used monochrome video only; color monitors became popular in
the mid to late 80s. Monochrome displays continued their dominance into the
early 90s in laptop PCs, where color displays were originally very expensive.
Today monochrome displays and the video cards that drive them are obsolete
except for specific industrial applications.
Most video cards today retain the ability to display information in a
monochrome text mode, for compatibility with (much) older software. In practice this mode is
very rarely if ever used. Even when monochrome text is displayed, this is
usually in a color mode.
Pixels and Resolution
The image that is displayed on the
screen is composed of thousands (or millions) of small dots; these are called pixels;
the word is a contraction of the phrase "picture element". A pixel
represents the smallest piece of the screen that can be controlled
individually. Each one can be set to a different color and intensity
(brightness).
The number of pixels that can be
displayed on the screen is referred to as the resolution of the image;
this is normally displayed as a pair of numbers, such as 640x480. The first is
the number of pixels that can be displayed horizontally on the screen, and the
second how many can be displayed vertically. The higher the resolution, the
more pixels that can be displayed and therefore the more that can be shown on
the monitor at once,
however, pixels are smaller at high resolution and detail can be hard to make
out on smaller screens. Resolutions generally fall into predefined standard
sets; only a few different resolutions are used by most PCs.
The aspect ratio of the image
is the ratio of the number of X pixels to the number of Y pixels. The standard
aspect ratio for PCs is 4:3, but some resolutions use a ratio of 5:4. Monitors
are calibrated to this standard so that you can draw a circle and have it
appear to be a circle and not an ellipse. Displaying an image that uses an
aspect ratio of 5:4 will cause the image to appear somewhat distorted. The only
mainstream resolution that currently uses 5:4 is the high-resolution 1280x1024.
There is some confusion regarding
the use of the term "resolution", since it can technically mean
different things. First, the resolution of the image you see is a function of
what the video card
outputs and what the monitor is capable of displaying; to see a high resolution
image such as 1280x1024 requires both a video card capable of producing an
image this large and a monitor capable of displaying it. Second, since each
pixel is displayed on the monitor as a set of three individual dots (red, green
and blue), some people use the term "resolution" to refer to the
resolution of the monitor, and the term "pixel addressability" to
refer to the number of discrete elements the video card produces. In practical
terms most people use resolution to refer to the video image, as I do on this
site.
The table below lists the most
common resolutions used on PCs and the number of pixels each uses:
Resolution   Number of Pixels   Aspect Ratio
320x200      64,000             8:5
640x480      307,200            4:3
800x600      480,000            4:3
1024x768     786,432            4:3
1280x1024    1,310,720          5:4
1600x1200    1,920,000          4:3
Pixel Color and Intensity, Color
Depth and the Color Palette
Each pixel of the screen image is
displayed on a monitor using a combination of three different color signals:
red, green and blue. This is similar (but by no means identical) to how images
are displayed on a television set.
Each pixel's appearance is controlled by the intensity of these three beams of
light. When all are set to the highest level the result is white; when all are
set to zero the pixel is black, etc.
The amount of information that is
stored about a pixel determines its color depth, which controls how
precisely the pixel's color can be specified. This is also sometimes called the
bit depth, because the precision of color depth is specified in bits.
The more bits that are used per pixel, the finer the color detail of the image.
However, increased color depths also require significantly more memory for
storage of the image, and also more data for the video card to
process, which reduces the possible maximum refresh rate.
This table shows the color depths
used in PCs today:
Color Depth   Number of Displayed Colors   Bytes of Storage Per Pixel   Common Name for Color Depth
4-Bit         16                           0.5                          Standard VGA
8-Bit         256                          1.0                          256-Color Mode
16-Bit        65,536                       2.0                          High Color
24-Bit        16,777,216                   3.0                          True Color
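As a rough worked example of the memory cost implied by the table above (illustrative figures only), the frame buffer size is simply the resolution multiplied by the bytes per pixel:

#include <stdio.h>

/* Frame buffer size = horizontal pixels x vertical pixels x bytes per pixel.
 * (Some cards pad 24-bit "true color" pixels out to 4 bytes, as noted
 * further below.) */
static double framebuffer_kib(int width, int height, double bytes_per_pixel)
{
    return width * height * bytes_per_pixel / 1024.0;
}

int main(void)
{
    printf("640x480   at  8-bit: %4.0f KiB\n", framebuffer_kib(640, 480, 1.0));
    printf("1024x768  at 16-bit: %4.0f KiB\n", framebuffer_kib(1024, 768, 2.0));
    printf("1280x1024 at 24-bit: %4.0f KiB\n", framebuffer_kib(1280, 1024, 3.0));
    return 0;
}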
True color is given that name
because three bytes of information are used, one for each of the red, blue and
green signals that make up each pixel. Since a byte has 256 different values
this means that each color can have 256 different intensities, allowing over 16
million different color possibilities. This allows for a very realistic
representation of the color of images, with no compromises necessary and no
restrictions on the number of colors an image can contain. In fact, 16 million
colors is more than the human eye can discern. True color is a necessity for
those doing high-quality photo editing, graphical design, etc.
Note: Some video cards actually have to use 32 bits of memory for
each pixel when operating in true color, due to how they use the video memory. See here for more details
on this.
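Concretely, a true color pixel is just three stored bytes, one per channel; cards that operate with 32 bits per pixel add a fourth, unused byte, typically so that every pixel starts on a 32-bit boundary. The sketch below shows one plausible in-memory layout; the actual byte order varies between cards and modes, so treat it as an assumption for illustration:

#include <stdio.h>

int main(void)
{
    unsigned char r = 200, g = 120, b = 40;   /* one byte (0-255) per channel */

    /* 24-bit true color: exactly three bytes per pixel, nothing wasted. */
    unsigned char pixel24[3];
    pixel24[0] = b;
    pixel24[1] = g;
    pixel24[2] = r;

    /* 32-bit storage of the same pixel: the three color bytes plus one
       unused pad byte so each pixel starts on a 32-bit boundary. */
    unsigned long pixel32 = ((unsigned long)r << 16) |
                            ((unsigned long)g << 8)  |
                             (unsigned long)b;        /* top byte stays 0 */

    printf("24-bit pixel bytes: %02X %02X %02X\n",
           (unsigned)pixel24[2], (unsigned)pixel24[1], (unsigned)pixel24[0]);
    printf("32-bit pixel word : %08lX (high byte is padding)\n", pixel32);
    return 0;
}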
High color uses two bytes of
information to store the intensity values for the three colors. This is done by
breaking the 16 bits into 5 bits for blue, 5 bits for red and 6 bits for green.
This means 32 different intensities for blue, 32 for red, and 64 for green.
This reduced color precision causes a slight loss of visible image quality,
but the difference is small enough that many people cannot tell true color
and high color images apart unless they are looking for it. For this
reason high color is often used instead of true color: it requires 33% (or 50%
in some cases) less video memory, and it is also faster for the same reason.
Note: Some video modes use a slight variation on high color,
where only 15 bits are used. This means 5 bits for each color. The difference
is not noticeable at all.
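Packing the three intensities into 16 bits simply means keeping only the most significant 5 or 6 bits of each channel. The sketch below shows one common way of doing this for both the 16-bit (5-6-5) and 15-bit (5-5-5) variants; the exact bit positions used by real hardware differ, so the layout here is an assumption for illustration:

#include <stdio.h>

/* Keep only the most significant bits of each 8-bit channel and pack
   them into a 16-bit word.  Assumed layout: red in bits 11-15,
   green in bits 5-10, blue in bits 0-4. */
static unsigned short pack565(unsigned char r, unsigned char g, unsigned char b)
{
    return (unsigned short)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* The 15-bit variant: 5 bits for every channel, top bit unused. */
static unsigned short pack555(unsigned char r, unsigned char g, unsigned char b)
{
    return (unsigned short)(((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3));
}

int main(void)
{
    unsigned char r = 200, g = 120, b = 40;

    printf("true color : %02X %02X %02X (24 bits)\n",
           (unsigned)r, (unsigned)g, (unsigned)b);
    printf("high color : %04X (16 bits, 5-6-5)\n", (unsigned)pack565(r, g, b));
    printf("15-bit mode: %04X (5-5-5)\n", (unsigned)pack555(r, g, b));
    return 0;
}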
In 256-color mode the PC has only 8
bits to use; this would mean something like 2 bits for blue and 3 for each of
green and red. Choosing between only 4 or 8 different values for each color
would result in rather hideously blocky color, so a different approach is taken
instead: the use of a palette. A palette is created containing 256
different colors. Each one is defined using the standard 3-byte color
definition that is used in true color: 256 possible intensities for each of
red, blue and green. Then, each pixel is allowed to choose one of the 256
colors in the palette, which can be considered a "color number" of
sorts. So the full range of color can be used in each image, but each image can
only use 256 of the available 16 million different colors. When each pixel is displayed,
the video card looks up the real red, green and blue values in the palette
based on the "color number" the pixel is assigned.
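Put another way, in 256-color mode the frame buffer holds nothing but color numbers, and the card translates each one through a 256-entry table of full 3-byte colors as it draws the screen. A minimal sketch of that lookup (the array names are mine, for illustration):

#include <stdio.h>

/* One palette entry: a full 3-byte true-color definition. */
struct rgb { unsigned char r, g, b; };

int main(void)
{
    /* The palette: 256 slots, each of which may hold any of the
       16.7 million possible colors.  Only a few are filled in here. */
    struct rgb palette[256] = { { 0, 0, 0 } };
    unsigned char framebuffer[4];
    int i;

    palette[1].r = 135; palette[1].g = 206; palette[1].b = 235;  /* sky blue */
    palette[2].r = 255; palette[2].g = 255; palette[2].b = 255;  /* white    */

    /* In 256-color mode the frame buffer stores one byte per pixel:
       not a color, just a "color number" indexing the palette. */
    framebuffer[0] = 1;  framebuffer[1] = 1;
    framebuffer[2] = 2;  framebuffer[3] = 0;

    for (i = 0; i < 4; i++) {
        struct rgb c = palette[framebuffer[i]];
        printf("pixel %d -> color #%u -> R=%u G=%u B=%u\n",
               i, framebuffer[i], c.r, c.g, c.b);
    }
    return 0;
}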
The palette is an excellent compromise: each pixel needs only 8 bits, yet the
creator of the image decides which 256 colors the image should use.
Since virtually no image contains an even distribution of colors,
this allows for more precision in an image by using more colors than would be
possible by assigning each pixel a 2-bit value for blue and 3-bit values each
for green and red. For example, an image of the sky with clouds (like the
Windows 95 standard background) would have many different shades of blue, white
and gray, and virtually no reds, greens, yellows and the like.
256-color is the standard for much
of computing, mainly because the higher-precision color modes require more
resources (especially video memory) and aren't supported by many PCs. Despite
the ability to "hand pick" the 256 colors, this mode produces noticeably
worse image quality than high color; most people can tell the difference
between high color and 256-color mode.
DOS INT 21h function calls, by category (examples):
- 00: Terminate Program; 31: Keep Program; 4A: Set Memory Block Size
- File management (Handles): 3C: Create File with Handle
- Directory management: 39: Create Directory
- Drive management: 0D: Reset Drive
- File Control Blocks (FCBs): 0F: Open File with FCB
- I/O control (IOCTL): 4400: Get Device Data; 440C: IOCTL for Char. Devices; 45: Set Iteration Count; 440D: IOCTL for Block Devices; 40: Set Device Parameters; 440E: Get Logical Drive Map
- Character I/O: 01: Read Keyboard with Echo
- Memory management: 48: Allocate Memory
- Program management: 00: Terminate Program
- Networks: 4409: Is Device Remote
- National language support (NLS): 38: Get/Set Country Information
- System management: 25: Set Interrupt Vector
- File sharing: 440B: Set Sharing Retry Count
DOS INT 21h function calls, by the DOS version that introduced them (examples):
- DOS 1.0: 00: Terminate Program
- DOS 2.0: 1B: Get Default Drive Data
- DOS 3.0: 4408: Check Removable Media; 5800: Get Allocation Strategy; 5A: Create Temporary File
- DOS 3.1: 4409: Is Device Remote
- DOS 3.2: 440D: IOCTL for Block Devices; 40: Set Device Parameters; 440E: Get Logical Drive Map
- DOS 3.3: 440C: IOCTL for Char. Devices; 45: Set Iteration Count; 4A: Select Code Page; 6501: Get Extended Country Information; 67: Set Maximum Handle Count
- DOS 4.0: 440C: IOCTL for Char. Devices; 5F: Set Display Mode; 440D: IOCTL for Block Devices; 46: Set Media ID; 5D0A: Set Extended Error; 6C: Extended Open/Create
- DOS 5.0: 1F: Get Default DPB; 440D: IOCTL for Block Devices; 68: Sense Media Type; 4410: Query IOCTL Handle
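All of these functions are invoked the same way: the program places the function number in AH (plus whatever other registers the particular call requires) and issues INT 21h. Below is a minimal sketch, assuming a 16-bit DOS C compiler such as Borland or Microsoft C that provides int86() and union REGS in <dos.h>, and the small memory model; the directory name is just a placeholder. It calls function 39h, Create Directory, from the list above.

#include <dos.h>
#include <stdio.h>

int main(void)
{
    union REGS r;
    char path[] = "C:\\EXAMPLE";       /* placeholder directory name */

    r.h.ah = 0x39;                     /* AH = 39h: Create Directory */
    r.x.dx = (unsigned)path;           /* DS:DX -> ASCIIZ path; in the small
                                          model DS already covers our data */
    int86(0x21, &r, &r);               /* issue the DOS call */

    if (r.x.cflag)                     /* carry flag set = error, AX = code */
        printf("INT 21h failed, error code %u\n", r.x.ax);
    else
        printf("Directory created.\n");
    return 0;
}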
Date
The system date. Make sure that you enter it in the correct format; normally
this is mm/dd/yy in North America, but may vary elsewhere.
Newer versions of Windows will let you change the
date within the built-in "Date/Time Properties" feature, and the BIOS
date will be updated automatically by the system.
Time
The system time. Most systems require this to be entered using a 24-hour
clock (1:00 pm = 13:00, etc.).
Newer versions of Windows will let you change the time within the built-in
"Date/Time Properties" feature, and the BIOS time will be updated
automatically by the system.
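Both of these values are kept in the battery-backed CMOS real-time clock, which is what the BIOS setup screen and the Windows "Date/Time Properties" dialog are both editing. The sketch below reads the clock directly, assuming a Borland-style DOS compiler that provides inportb() and outportb() in <dos.h>; the register numbers follow the standard RTC layout, and the values are usually BCD-encoded (some clocks can be configured for binary instead):

#include <dos.h>
#include <stdio.h>

/* Read one register of the CMOS real-time clock: port 70h selects the
   register, port 71h returns its contents. */
static unsigned char cmos_read(unsigned char reg)
{
    outportb(0x70, reg);
    return inportb(0x71);
}

/* Most RTCs store values as BCD: 0x59 means 59. */
static unsigned bcd(unsigned char v)
{
    return (unsigned)(v >> 4) * 10 + (v & 0x0F);
}

int main(void)
{
    printf("CMOS time: %02u:%02u:%02u\n",
           bcd(cmos_read(0x04)),        /* hours   */
           bcd(cmos_read(0x02)),        /* minutes */
           bcd(cmos_read(0x00)));       /* seconds */
    printf("CMOS date: %02u/%02u/%02u\n",
           bcd(cmos_read(0x08)),        /* month */
           bcd(cmos_read(0x07)),        /* day   */
           bcd(cmos_read(0x09)));       /* year  */
    return 0;
}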
Daylight Savings
If your BIOS has this setting, enabling it will forward the time by one hour
on the first Sunday in April, and drop it back by one hour on the last Sunday
in October. The default value is usually "Enabled".
This setting is not present on most PCs; however, some operating systems, such as Windows
95, will do this for you automatically if you enable the daylight savings time
option in their control settings.
Note: The date
when daylight savings time "kicks in" can change in some cases; for
example, a few years ago the spring date changed from the last Sunday in April
to the first. If this happens again, your BIOS will change the time on the wrong
date, so you will want to disable this setting unless a flash BIOS upgrade
that compensates is made available to you.
IDE Primary Master
This is where the hard disk parameters are entered for the primary master
IDE/ATA device, the first drive in a modern IDE system. See the hard disk section
for details on what these terms mean and how these devices are set up. The
various settings for the drive are discussed in detail in the IDE Setup /
Autodetection section. The default setting for this on a system with IDE
autodetection is usually "Auto".
Note: Some older
systems only have places for two drives' parameters to be entered; often in
this case they just call them "Drive C" and "Drive D".
IDE Primary Slave
This is where the hard disk parameters are entered for the primary slave IDE
device, the second drive in a modern IDE system. See the hard disk section
for details on what these terms mean and how these devices are set up. The
various settings for the drive are discussed in detail in the IDE Setup / Autodetection
section. The default setting for this on a system with IDE autodetection
is usually "Auto".
IDE Secondary Master
This is where the hard disk parameters are entered for the secondary master
IDE device, normally the third drive in a modern IDE system (though it can be
the second as well, if the primary slave device is not used). See the hard disk section
for details on what these terms mean and how these devices are set up. The
various settings for the drive are discussed in detail in the IDE Setup /
Autodetection section. The default setting for this on a system with IDE
autodetection is usually "Auto".
IDE Secondary Slave
This is where the hard disk parameters are entered for the secondary slave
IDE device, the fourth drive in a modern IDE system. See the hard disk section
for details on what these terms mean and how these devices are set up. The
various settings for the drive are discussed in detail in the IDE Setup /
Autodetection section. The default setting for this on a system with IDE
autodetection is usually "Auto".
Floppy Drive A
The type of the first floppy drive. The choices
normally are:
- 1.44 MB:
A normal 3.5" drive.
- 1.2 MB:
A normal 5.25" drive.
- 2.88 MB:
A high-density 3.5" drive, found on some newer systems.
- 720 KB:
A low-density 3.5" drive.
- 360 KB:
A low-density 5.25" drive.
- None:
No floppy drive present in the "floppy A" position. May read
"not installed" or similar.
This setting usually defaults to a
1.44 MB 3.5" drive, the most common type currently in use.
Floppy Drive B
The type of the second floppy drive.
See common choices under "Floppy Drive A" above.
This setting usually defaults to
"none" or "not installed" since most PCs don't have a
second floppy drive.
Video Display Type
This is the standard type of the display you are using; almost always this
should be set to either "VGA" or "VGA/EGA" for a modern PC,
if you are using any sort of VGA or SVGA card (which covers basically every PC made
in the nineties). This is also usually the default value.
Halt On
Some PCs give you the ability to
tell the BIOS specifically which types of errors will halt the computer during
the power-on self
test section of the boot process.
Using this, you can tell the PC to ignore certain types of errors; common
settings for this parameter are:
- All Errors:
The boot process will halt on all errors. You will be prompted for action
in the event of recoverable errors. This is normally the default setting,
and is also the recommended one.
- No Errors:
The POST will not stop for any type of error. Not recommended except for
very special cases.
- All But Keyboard:
The boot process will stop for any error except a keyboard error. This can
be useful for setting up a machine without a keyboard, for example for a file or
print server.
- All But Diskette/Floppy: All errors will halt the system except diskette
errors. In my opinion, if your floppy drive has recurring and known
problems, it is most likely best just to replace (or disconnect) the drive
rather than relying on this setting.
Warning: Telling the system not to halt for any error types is
generally not wise. You may end up missing a problem with your system that you
will want to know about.