The term x86 denotes a family of backward compatible instruction set architectures based on the Intel 8086 CPU. The 8086 was introduced in 1978 as a fully 16-bit extension of Intel’s 8-bit based 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term x86 derived from the fact that early successors to the 8086 also had names ending in “86”.
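As a rough illustration of how segmentation reaches beyond a plain 16-bit address, the sketch below computes a real-mode 8086 physical address from a segment and an offset. The function name is hypothetical, and the wrap to 20 bits assumes the 8086's 20 address lines; it is a minimal model, not a description of any later protected-mode scheme.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode 8086 address: segment shifted left by 4 bits plus offset.

    The result is wrapped to 20 bits (assumption: the 8086's 20 address lines).
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs can name the same physical byte.
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
```

Because segments overlap every 16 bytes, many segment:offset pairs alias the same location, which is one reason the scheme is often described as awkward to program.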
Many extensions have been added to the x86 instruction set over the years, almost always with full backward compatibility. The architecture has been implemented in processors from Intel, Cyrix, AMD, VIA and many other companies.
The term is not synonymous with IBM PC compatibility, as this implies a multitude of other computer hardware; embedded systems as well as general-purpose computers used x86 chips before the PC-compatible market started, some of them before the IBM PC itself.
Marketed as source compatible, the 8086 was designed to allow assembly language for the 8008, 8080, or 8085 to be automatically converted into equivalent (sub-optimal) 8086 source code, with little or no hand-editing. The programming model and instruction set were (loosely) based on the 8080 in order to make this possible. However, the 8086 design was expanded to support full 16-bit processing, instead of the fairly basic 16-bit capabilities of the 8080/8085.
There have been several attempts, including by Intel itself, to end the market dominance of the “inelegant” x86 architecture inherited directly from the first simple 8-bit microprocessors. Examples of this are the iAPX 432 (alias Intel 8800), the Intel 960, Intel 860 and the Intel/Hewlett-Packard Itanium architecture. However, the continuous refinement of x86 microarchitectures, circuitry and semiconductor manufacturing has made it hard to replace x86 in many segments. AMD’s 64-bit extension of x86 (which Intel eventually responded to with a compatible design) and the scalability of x86 chips such as the eight-core Intel Xeon and 12-core AMD Opteron underline x86 as an example of how continuous refinement of established industry standards can resist the competition from completely new architectures.
The x86 architecture was first used for the Intel 8086 Central Processing Unit (CPU) released during 1978, a fully 16-bit design based on the earlier 8-bit based 8008 and 8080. Although not binary compatible, it was designed to allow assembly language programs written for these processors (as well as the contemporary 8085) to be mechanically translated into equivalent 8086 assembly. This made the new processor a tempting software migration route for many customers. However, the 16-bit external databus of the 8086 implied fairly significant hardware redesign, as well as other complications and expenses. To address this obstacle, Intel introduced the almost identical 8088, basically an 8086 with an 8-bit external databus that permitted simpler printed circuit boards and demanded fewer (1-bit wide) DRAM chips; it was also more easily interfaced to already established (i.e. low-cost) 8-bit system and peripheral chips. Among other, non-technical factors, this contributed to IBM’s decision to design a home computer / personal computer based on the 8088, despite a presence of 16-bit microprocessors from Motorola, Zilog, and National Semiconductor (as well as several established 8-bit processors, which were also considered). The resulting IBM PC subsequently became preferred to Z80-based CP/M systems, Apple IIs, and other popular computers as the de facto standard for personal computers, thus enabling the 8088 and its successors to dominate this large part of the microprocessor market.
Extensions of word size
The instruction set architecture has twice been extended to a larger word size. In 1985, Intel released the 32-bit 80386 (later known as i386), which gradually replaced the earlier 16-bit chips in computers (although typically not in embedded systems) during the following years; this extended programming model was originally referred to as the i386 architecture (like its first implementation), but Intel later dubbed it IA-32 when introducing its (unrelated) IA-64 architecture. Between 1999 and 2003, AMD extended this 32-bit architecture to 64 bits and referred to it as x86-64 in early documents and later as AMD64. Intel soon adopted AMD’s architectural extensions under the name IA-32e, which was later renamed EM64T and finally Intel 64. Among these five names, the original x86-64 is probably the most commonly used, although Microsoft and Sun Microsystems also use the term x64.
Basic properties of the architecture
The x86 architecture is a variable instruction length, primarily “CISC” design with emphasis on backward compatibility. The instruction set is not typical CISC, however, but basically an extended version of the simple eight-bit 8008 and 8080 architectures. Byte-addressing is enabled and words are stored in memory with little-endian byte order. Memory access to unaligned addresses is allowed for all valid word sizes. The largest native size for integer arithmetic and memory addresses (or offsets) is 16, 32 or 64 bits depending on architecture generation (newer processors include direct support for smaller integers as well). Multiple scalar values can be handled simultaneously via the SIMD unit present in later generations, as described below. Immediate addressing offsets and immediate data may be expressed as 8-bit quantities for the frequently occurring cases or contexts where a -128..127 range is enough. Typical instructions are therefore 2 or 3 bytes in length (although some are much longer, and some are single-byte).
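The effect of little-endian storage and of short 8-bit immediates on instruction length can be sketched as follows. The helper names are hypothetical; the encodings used (opcode 0x83 for ADD with a sign-extended 8-bit immediate, 0x81 for ADD with a 32-bit immediate, register-direct ModR/M byte) are the standard 32-bit forms, shown here only for a register destination with a signed 32-bit immediate.

```python
def store_little_endian(value: int, size: int) -> bytes:
    """Return the bytes of an unsigned word as x86 would lay them out in memory."""
    return value.to_bytes(size, "little")


def encode_add_r32_imm(reg: int, imm: int) -> bytes:
    """Encode 'add r32, imm' (32-bit operand size), preferring the short imm8 form.

    reg is the 3-bit register number (0 = EAX ... 7 = EDI).
    imm must fit in a signed 32-bit integer.
    """
    if not 0 <= reg <= 7:
        raise ValueError("register number must fit in three bits")
    modrm = 0xC0 | reg  # mod=11 (register direct), reg field /0 selects ADD
    if -128 <= imm <= 127:
        # Short form: one sign-extended immediate byte, 3 bytes in total.
        return bytes([0x83, modrm]) + imm.to_bytes(1, "little", signed=True)
    # Long form: full 32-bit little-endian immediate, 6 bytes in total.
    return bytes([0x81, modrm]) + imm.to_bytes(4, "little", signed=True)


print(store_little_endian(0x12345678, 4).hex())  # 78563412
print(encode_add_r32_imm(3, -4).hex())           # 83c3fc
print(encode_add_r32_imm(3, 1000).hex())         # 81c3e8030000
```

The short form halves the instruction length for small constants, which is the common case for stack adjustments and loop counters.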
To further conserve encoding space, most registers are expressed in opcodes using three bits, and at most one operand to an instruction can be a memory location (some “CISC” designs, such as the PDP-11, may use two). However, this memory operand may also be the destination (or a combined source and destination), while the other operand, the source, can be either register or immediate. Among other factors, this contributes to a code size that rivals eight-bit machines and enables efficient use of instruction cache memory. The relatively small number of general registers (also inherited from its 8-bit ancestors) has made register-relative addressing (using small immediate offsets) an important method of accessing operands, especially on the stack. Much work has therefore been invested in making such accesses as fast as register accesses, i.e. a one cycle instruction throughput, in most circumstances where the accessed data is available in the top-level cache.
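The three-bit register fields and the single memory operand can be seen in the ModR/M byte that follows many opcodes. The sketch below is a deliberately tiny decoder, with hypothetical function names, that handles only two ADD opcodes (0x01 and 0x03) in 32-bit mode and only the register-direct and plain register-indirect forms; everything else is rejected rather than guessed.

```python
# 32-bit register names indexed by their 3-bit encoding.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_modrm(byte: int) -> tuple:
    """Split a ModR/M byte into its mod (2 bits), reg (3 bits) and r/m (3 bits) fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def describe_add(opcode: int, modrm: int) -> str:
    """Render a two-byte ADD instruction as text (illustrative subset only)."""
    mod, reg, rm = split_modrm(modrm)
    if mod == 0b11:
        rm_text = REG32[rm]                 # r/m names a register
    elif mod == 0b00 and rm not in (4, 5):
        rm_text = "[" + REG32[rm] + "]"     # r/m names memory at [register]
    else:
        raise NotImplementedError("SIB and displacement forms are not modelled")
    if opcode == 0x01:                      # ADD r/m32, r32: r/m is the destination
        return "add " + rm_text + ", " + REG32[reg]
    if opcode == 0x03:                      # ADD r32, r/m32: r/m is the source
        return "add " + REG32[reg] + ", " + rm_text
    raise NotImplementedError("only opcodes 0x01 and 0x03 are modelled")


print(describe_add(0x01, 0xD8))  # add eax, ebx
print(describe_add(0x01, 0x18))  # add [eax], ebx
print(describe_add(0x03, 0x18))  # add ebx, [eax]
```

Only the r/m field can name memory, so the opcode chooses whether that one memory operand acts as destination or source; the reg field always names a register.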
During execution, current x86 processors employ a few extra decoding steps to split most instructions into smaller pieces (micro-operations). These are then handed to a control unit that buffers and schedules them in compliance with x86-semantics so that they can be executed, partly in parallel, by one of several (more or less specialized) execution units. These modern x86 designs are thus superscalar, and also capable of out of order and speculative execution (via register renaming), which means they may execute multiple (partial or complete) x86 instructions simultaneously, and not necessarily in the same order as given in the instruction stream.
When introduced, in the mid-1990s, this method was sometimes referred to as a “RISC core” or as “RISC translation”, partly for marketing reasons, but also because these micro-operations share some properties with certain types of RISC instructions. However, traditional microcode (used since the 1950s) also inherently shares many of the same properties; the new method differs mainly in that the translation to micro-operations now occurs asynchronously. Not having to synchronize the execution units with the decode steps opens up possibilities for more analysis of the (buffered) code stream, and therefore permits detection of operations that can be performed in parallel, simultaneously feeding more than one execution unit.
The latest processors also do the opposite when appropriate; they combine certain x86 sequences (such as a compare followed by a conditional jump) into a more complex micro-op which fits the execution model better and thus can be executed faster or with less machine resources involved.
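The idea of combining a compare with the conditional jump that follows it can be pictured as a simple peephole pass over a decoded instruction stream. The following is a toy model only: the tuple representation, the function name and the "fused_" naming are all assumptions made for illustration, and real processors apply further conditions that are not modelled here.

```python
def fuse_compare_and_branch(instructions):
    """Merge each 'cmp' that is immediately followed by a conditional jump.

    instructions: list of tuples such as ("cmp", "eax", "ebx") or ("jne", "loop").
    Returns a list of (operation_name, operands) pairs, one per resulting micro-op.
    """
    result = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        is_conditional_jump = (
            following is not None
            and following[0].startswith("j")
            and following[0] != "jmp"      # an unconditional jump is not fused here
        )
        if current[0] == "cmp" and is_conditional_jump:
            name = "fused_cmp_" + following[0]
            result.append((name, current[1:] + following[1:]))
            i += 2                         # both instructions were consumed
        else:
            result.append((current[0], current[1:]))
            i += 1
    return result


program = [("mov", "eax", "1"), ("cmp", "eax", "ebx"), ("jne", "loop"), ("ret",)]
for op in fuse_compare_and_branch(program):
    print(op)
```

In this model the four-instruction program yields three operations, so one fewer entry has to be tracked by the scheduler.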
Another way to try to improve performance is to cache the decoded micro-operations, so the processor can directly access the decoded micro-operations from a special cache, instead of decoding them again. Intel followed this approach with the Execution Trace Cache feature in their NetBurst Microarchitecture (for Pentium 4 processors) and later in the Decoded Stream Buffer (for Core-branded processors since Sandy Bridge).
Transmeta used a completely different method in their x86 compatible CPUs. They used just-in-time translation to convert x86 instructions to the CPU’s native VLIW instruction set. Transmeta argued that their approach allows for more power efficient designs since the CPU can forgo the complicated decode step of more traditional x86 implementations.
Starting with the AMD Opteron processor, the x86 architecture extended the 32-bit registers into 64-bit registers in a way similar to how the 16 to 32-bit extension took place. An R-prefix identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP), and eight additional 64-bit general registers (R8-R15) were also introduced in the creation of x86-64. However, these extensions are only usable in 64-bit mode, which is one of the two sub-modes of long mode (the other being compatibility mode). The addressing modes were not dramatically changed from 32-bit mode, except that addressing was extended to 64 bits, virtual addresses are now sign extended to 64 bits (in order to disallow mode bits in virtual addresses), and other selector details were dramatically reduced. In addition, an addressing mode was added to allow memory references relative to RIP (the instruction pointer), to ease the implementation of position-independent code, used in shared libraries in some operating systems.
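Sign-extended virtual addresses and RIP-relative references can both be expressed with one sign-extension helper, as sketched below. The function names are hypothetical, and the default of 48 implemented virtual-address bits is an assumption (it matches the first x86-64 implementations; the text above does not state a width).

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low 'bits' bits of value as a two's-complement signed integer."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value & (sign_bit - 1)) - (value & sign_bit)


def is_canonical(address: int, va_bits: int = 48) -> bool:
    """True if bits 63 down to va_bits-1 are all equal (all zeros or all ones)."""
    top = (address & MASK64) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def canonicalize(address: int, va_bits: int = 48) -> int:
    """Sign-extend the low va_bits bits of an address to a full 64-bit value."""
    return sign_extend(address, va_bits) & MASK64


def rip_relative_target(next_rip: int, disp32: int) -> int:
    """Effective address = address of the next instruction + signed 32-bit displacement."""
    return (next_rip + sign_extend(disp32, 32)) & MASK64


print(is_canonical(0x00007FFFFFFFFFFF))              # True
print(is_canonical(0x0000800000000000))              # False
print(hex(canonicalize(0x0000800000000000)))         # 0xffff800000000000
print(hex(rip_relative_target(0x401007, 0xFFFFFFF9)))  # 0x401000
```

Because the displacement is relative to the instruction pointer, the same machine code computes the right address wherever the module is loaded, which is what makes this mode convenient for position-independent code.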
In April 2003, AMD released the first x86 processor with 64-bit general-purpose registers, the Opteron, capable of addressing much more than 4 GB of virtual memory using the new x86-64 extension (also known as x64). Intel introduced its first x86-64 processor in July 2004.
x86-64 had been preceded by another architecture employing 64-bit memory addressing: Intel introduced Itanium in 2001 for the high-performance computing market. However, Itanium was incompatible with x86 and is less widely used today. x86-64 also introduced the NX bit, which offers some protection against security bugs caused by buffer overruns.
Ref: http://en.wikipedia.org/wiki/Intel_8086 and http://en.wikipedia.org/wiki/X86