PCI Express

PCI Express slots (from top to bottom: x4, x16, x1 and x16), compared to a traditional 32-bit PCI slot (bottom)

PCI Express, or PCIe (formerly known as 3GIO, for 3rd Generation I/O, and not to be mistaken for PCI-X), is an implementation of the PCI computer bus that keeps existing PCI programming concepts but bases them on a completely different and much faster serial physical-layer communications protocol. It is supported primarily by Intel, which started working on the standard as the Arapahoe project after pulling out of the InfiniBand system.

PCI Express is intended to be used as a local interconnect only. As it is based on the existing PCI system, cards and systems can be converted to PCI Express by changing the physical layer only – existing systems could be adapted to PCI Express without any change in software. The higher speeds on PCI Express allow it to replace almost all existing internal buses, including AGP and PCI, and Intel envisions a single PCI Express controller talking to all external devices, as opposed to the northbridge/southbridge solution in current machines.

Hardware protocol summary

The PCIe link is built around a bidirectional, serial (1-bit), point-to-point connection known as a "lane". This is in sharp contrast to the PCI connection, which is a bus-based system where all the devices share the same 32-bit parallel bus, which carries data in only one direction at a time.

At the electrical level, each lane utilizes two unidirectional low voltage differential signaling (LVDS) pairs at 2.5 gigabaud. Transmit and receive are separate diff-pairs, for a total of 4 data wires per lane.

A connection between any two PCIe devices is known as a "link", and is built up from a collection of 1 or more lanes. All devices must minimally support single-lane (x1) links. Devices may optionally support wider links composed of 2, 4, 8, 12, 16, or 32 lanes. This allows for very good compatibility in two ways. A PCIe card will physically fit (and work correctly) in any slot that is at least as large as it is (e.g. an x1 card will work in an x4 or x16 slot), and a slot of a large physical size (e.g. x16) can be wired electrically with fewer lanes (e.g. x1 or x8; however it must still provide the power and ground connections required by the larger physical slot size). In both cases, the PCIe link will negotiate the highest mutually supported number of lanes.
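
The width negotiation described above can be sketched as picking the largest lane count that both ends support. This is a simplified illustration only: the function name and the example width sets are hypothetical, and real link training involves much more than a set intersection.

```python
# Link widths mentioned in the text: x1 is mandatory, the rest are optional.
VALID_WIDTHS = {1, 2, 4, 8, 12, 16, 32}


def negotiate_width(end_a, end_b):
    """Return the widest lane count supported by both ends of a link.

    Hypothetical helper: each argument is a set of supported widths.
    """
    for widths in (end_a, end_b):
        if not set(widths) <= VALID_WIDTHS:
            raise ValueError("unsupported link width in %r" % (widths,))
        if 1 not in widths:
            raise ValueError("every device must support x1")
    # x1 is always common, so the intersection is never empty.
    return max(set(end_a) & set(end_b))


# An x16 card in an x16-sized slot that is wired electrically for x8
# (the width sets are assumed for illustration).
card = {1, 4, 8, 16}
slot = {1, 4, 8}
print(negotiate_width(card, slot))  # 8
```

Because every device must support x1, the negotiation in this sketch always has at least one common width to fall back on.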

Images of PCI-Express x1, x4, x8, and x16

PCIe sends all control messages, including interrupts, over the same links used for data. The serial protocol can never be blocked, so latency comparable to PCI (which has dedicated interrupt lines) can be maintained.

Data transmitted on multi-lane links is interleaved, meaning that each successive byte is sent down the next lane in turn. The PCIe specification refers to this interleaving as "data striping". Although synchronizing (or deskewing) the incoming striped data requires significant hardware complexity, striping can significantly increase the data throughput of the link. (Due to packet protocol rules, striping may not necessarily reduce the latency of small data packets on a link.)
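
As a rough illustration of byte striping, the sketch below deals successive bytes out to successive lanes and then reassembles them. The function names are hypothetical, and the model ignores framing, encoding and lane-to-lane skew entirely.

```python
def stripe(data, lanes):
    """Distribute successive bytes across successive lanes."""
    return [data[i::lanes] for i in range(lanes)]


def destripe(lane_data):
    """Reassemble the original byte order from per-lane byte streams."""
    lanes = len(lane_data)
    total = sum(len(chunk) for chunk in lane_data)
    out = bytearray(total)
    for i, chunk in enumerate(lane_data):
        # Lane i carried bytes i, i + lanes, i + 2*lanes, ...
        out[i::lanes] = chunk
    return bytes(out)


per_lane = stripe(b"ABCDEFGH", 4)
print(per_lane)            # [b'AE', b'BF', b'CG', b'DH']
print(destripe(per_lane))  # b'ABCDEFGH'
```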

As with all high-speed serial transmission protocols, clocking information must be embedded in the signal. At the physical level, PCI Express utilizes the very common 8B/10B encoding scheme to ensure that long strings of ones or zeros are broken up enough that the receiver doesn't lose track of where the bit edges are. This coding scheme replaces 8 uncoded (payload) bits of data with 10 (encoded) bits of transmitted data, consuming 20% of the overall electrical bandwidth.

Some other protocols (such as SONET) use a different form of encoding known as "scrambling" to embed clock information into data streams. The PCI Express specification also defines a scrambling algorithm, but its form of scrambling is not to be confused with the scrambling included in SONET. Rather than embedding clock information, the scrambling in PCI Express is designed to prevent repeating data patterns in the transmitted data stream from causing RF emission peaks.
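
The general idea of scrambling can be sketched with a linear-feedback shift register (LFSR) whose output is XORed with the data. The 16-bit register, the tap mask and the seed below are illustrative assumptions rather than values taken from the text above, and the sketch makes no claim to be bit-exact with the PCI Express specification.

```python
def keystream(n_bytes, seed=0xFFFF, taps=0x0039):
    """Generate pseudo-random bytes from a 16-bit Galois LFSR.

    The seed and tap mask are assumed values chosen for illustration.
    """
    state = seed
    out = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for _ in range(8):
            msb = (state >> 15) & 1
            byte = (byte << 1) | msb
            state = (state << 1) & 0xFFFF
            if msb:
                state ^= taps
        out.append(byte)
    return bytes(out)


def scramble(data, seed=0xFFFF):
    """XOR data with the keystream; applying it twice restores the data."""
    return bytes(d ^ k for d, k in zip(data, keystream(len(data), seed)))


repeating = b"\xAA" * 16
scrambled = scramble(repeating)
print(len(set(scrambled)) > 1)           # the repeating pattern is broken up
print(scramble(scrambled) == repeating)  # descrambling is the same operation
```

Because the scrambler is a plain XOR with a sequence that both ends can regenerate, the receiver recovers the data by running the same operation from the same seed.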

First-generation PCIe is constrained to a single signalling rate of 2.5 gigabits/s. PCI-SIG plans future versions adding signalling rates of 5 and 10 gigabits/s.

First-generation PCIe is often quoted as supporting a data rate of 250 MB/s in each direction, per lane. This figure is calculated by dividing the physical signalling rate (2500 Mbaud) by the encoded size of each byte (10 bits per byte). A 16-lane (x16) PCIe card would then be theoretically capable of 250 × 16 = 4000 MB/s in each direction. While this is correct in terms of data bytes, more meaningful calculations are based on the usable data-payload rate, which depends on the profile of the traffic, which in turn is a function of the high-level (software) application and intermediate protocol levels.
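
The arithmetic behind the quoted figures can be written out as a short calculation. The constant and function names are hypothetical; the numbers come from the paragraph above.

```python
SIGNALLING_RATE_MBAUD = 2500   # first-generation rate per lane, per direction
ENCODED_BITS_PER_BYTE = 10     # 8B/10B: each payload byte is sent as 10 bits


def raw_data_rate_mb_per_s(lanes, rate_mbaud=SIGNALLING_RATE_MBAUD):
    """Raw data rate in MB/s, per direction, before protocol overhead."""
    return rate_mbaud * lanes // ENCODED_BITS_PER_BYTE


for lanes in (1, 4, 16):
    print("x%d: %d MB/s" % (lanes, raw_data_rate_mb_per_s(lanes)))
# x1: 250 MB/s
# x4: 1000 MB/s
# x16: 4000 MB/s
```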

Like other high-speed serial interconnect systems, PCIe has significant protocol and processing overhead. Long continuous unidirectional transfers (such as those typical in high-performance storage controllers) can reach more than 95% of PCIe's raw (channel) data rate. These transfers also benefit the most from an increased number of lanes (x2, x4, etc.). In more typical applications (such as a USB or Ethernet controller), however, the traffic consists of short data packets with frequent enforced acknowledgements. This type of traffic reduces the efficiency of the link, due to overhead from packet parsing and forced interrupts (either in the device's host interface or the PC's CPU). This is by no means a criticism of the PCIe specification; it is merely a reminder that PCIe is no more exempt from the rules of a layered packet protocol than other comparable high-speed serial technologies (such as Serial ATA and Fibre Channel).
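
The effect of packet size on link efficiency can be illustrated with a simple ratio of payload bytes to total bytes sent. The 24-byte per-packet overhead used below is an assumed, illustrative figure, not one given in the text, and the model ignores acknowledgements and interrupt handling.

```python
ASSUMED_OVERHEAD_BYTES = 24  # hypothetical per-packet header/framing cost


def link_efficiency(payload_bytes, overhead_bytes=ASSUMED_OVERHEAD_BYTES):
    """Fraction of transmitted bytes that carry payload."""
    if payload_bytes <= 0:
        raise ValueError("payload must be positive")
    return payload_bytes / (payload_bytes + overhead_bytes)


for payload in (16, 64, 256, 1024):
    print("%4d-byte payload: %.1f%% efficient"
          % (payload, 100 * link_efficiency(payload)))
```

Under this assumption, efficiency rises steadily with payload size, which matches the qualitative point that long transfers use the link far better than short packets.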

Form factors

The following form factors have been specified for PCI Express devices:

  • Regular card
  • Low height card
  • Mini Card: a replacement for the Mini PCI form factor (with x1 PCIe, USB 2.0 and SMBus buses on the connector)
  • ExpressCard: similar to the PCMCIA form factor (with x1 PCIe and USB 2.0; hot-pluggable)
  • AdvancedTCA: a replacement for CompactPCI

Competing protocols

Several communications standards have emerged based on high speed serial architectures. These include but are not limited to HyperTransport, InfiniBand, RapidIO, and StarFabric. There are industry proponents of each, and since significant funds have been invested in their development, each consortium tends to emphasize the advantages of its variant over others.

Essentially, the differences are based on the tradeoffs between flexibility and extensibility on one side and latency and overhead on the other. One example of such a tradeoff is adding complex header information to a transmitted packet to allow for complex routing (PCI Express is not capable of this). This additional overhead reduces the effective bandwidth of the interface and complicates bus discovery and initialization software. Also, making the system hot-pluggable requires that software track network topology changes. Examples of buses suited for this purpose are InfiniBand and StarFabric.

Another example is making the packets shorter to decrease latency (as is required if a bus is to be operated as a memory interface). Smaller packets mean that the packet headers consume a higher percentage of the packet, thus decreasing the effective bandwidth. Examples of bus protocols designed for this purpose are RapidIO and HyperTransport.

PCI Express falls somewhere in the middle, targeted by design as a system interconnect (local bus) rather than a device interconnect or routed network protocol. Additionally, its design goal of software transparency constrains the protocol and raises its latency somewhat.


An nVidia GeForce 6600GT PCI-Express video adapter card

As of 2005, PCI Express appears to be well on its way to becoming the new backplane standard in personal computers. There are several explanations for this, but the principal reason is that it was designed to be completely transparent to software developers (an operating system designed for PCI can boot in a PCI Express system without any code modification). Secondary reasons include its enhanced performance and strong brand recognition.

All new graphics cards from both ATI Technologies and NVIDIA use PCI Express. Most new Gigabit Ethernet chips and some 802.11 wireless chips also use PCI Express. Other hardware, such as RAID controllers and network cards, is also starting to make the switch.

In 2005 Apple updated both the consumer iMac and workstation PowerMac to use PCI Express exclusively, supplanting the AGP and PCI-X connectivity they had before.

ExpressCard is just starting to emerge on laptops. The problem is that many laptops have only one slot, and manufacturers cannot give up the existing legacy CardBus slot in favor of the new ExpressCard slot. Desktops don't have this problem, as they have multiple slots and can support PCI Express and legacy PCI slots concurrently. The PCI Express bus is currently being used by many companies, such as Intel in its new chip.
