Multi-core processor


A dual-core CPU combines two independent processors and their respective caches and cache controllers onto a single silicon chip, or integrated circuit. IBM's POWER4 was the first microprocessor to incorporate two cores on a single die. Dual-core CPUs developed by companies such as Motorola, Intel, and AMD began to appear in consumer products in 2005. This is an initial step in the development of many-core computer architectures.

Some consider a true dual-core processor to be one with two cores on a single die (with the die wrapped in a package). Others, including Intel, also count as dual-core a processor in which each of two dies carries one core and both dies sit in the same package, plugged into a single socket on the motherboard. Some instead call the latter a multi-chip module, double core, or twin core rather than dual-core. In discussions of multi-core, the meaning of 'processor', 'CPU', and 'chip' depends on the context; each may refer to a core, a die, or a package.

Conceptual diagram of a dual-core chip, with CPU-local Level 1 caches, and shared, on-chip Level 2 caches.

Dual-core CPU technology first became practically viable in 2001[1], as 180-nm CMOS process technology became feasible for volume production. At this feature size, multiple copies of the largest microprocessor architectures could be incorporated onto a single production die. (Alternative uses of this newly available "real estate" include widening the bus and internal registers of existing CPU cores, or incorporating more high-speed cache memory on-chip.)

Commercial examples

Architectural class

The dual-core type of processor falls into the architectural class of tightly coupled multiprocessors: processing units with independent instruction streams execute code from a pool of shared memory. Contention for the memory is managed by arbitration and by caches specific to each processing unit. These localized caches make the architecture viable, since modern CPUs are highly optimized to maximize bandwidth to the memory interface; without them, each CPU would run at roughly 50% efficiency. Multiple caches over the same resource must be kept consistent by a cache coherency protocol.
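
The shared-memory model described above can be illustrated at the software level. The following is a minimal sketch in C using POSIX threads (an illustration, not taken from the article): two threads update one shared counter, with a mutex providing software-level arbitration while the hardware cache coherency protocol keeps each core's cached copy of the counter consistent.

    /* Illustrative sketch: two threads sharing one memory location, as in the
       tightly coupled shared-memory model described above.  The mutex provides
       software-level arbitration; the cache coherency protocol keeps each
       core's cached copy of `counter` consistent.  Build with: cc -pthread */
    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;                          /* the shared memory pool */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        for (int i = 0; i < 1000000; i++) {
            pthread_mutex_lock(&lock);                /* arbitrate access */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;                             /* e.g. one thread per core */
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);           /* expected: 2000000 */
        return 0;
    }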

Beyond dual-core processors, there are chips with larger numbers of cores. Examples include network processors, which may have many cores or microengines operating independently on different packet-processing tasks within a networking application.

Development motivation

Technical pressures

As CMOS process technologies continue to shrink, the constraints on how much complexity can be placed on a single die recede. For CPU designs, the choice becomes whether to add more functions to the device (e.g. an Ethernet controller, memory controller, or high-speed CPU cache) or to add complexity that increases CPU throughput. Generally speaking, shrinking the features on an IC also means they can run at lower power and at a higher clock rate.

Various potential architectures contend for the additional "real estate" on the die. One option is to widen the registers and/or the bus interface of an existing processor architecture; widening the bus interface alone leads toward superscalar processor designs, and widening both usually requires new programming models. Other options include adding multiple levels of memory cache and developing system-on-a-chip solutions.

Commercial incentives

Several business motives drive the development of dual-core architectures. Because multiple-CPU SMP designs have long been implemented using discrete CPUs, the issues of implementing the architecture and supporting it in software are well known. Additionally, using a proven processing core design (e.g. Freescale's e700 core) without architectural changes reduces design risk significantly. Finally, the connotations of the terminology "dual-core" (and other multiples) lend themselves to marketing efforts.

Additionally, for general-purpose processors, much of the motivation for dual-core designs comes from the increasing difficulty of improving performance by raising the operating frequency (frequency scaling). To continue delivering regular performance improvements, manufacturers such as Intel and AMD have turned to dual-core designs, accepting higher manufacturing costs in exchange for higher performance in some applications and systems.

While dual-core architectures are being developed, so are alternatives; an especially strong contender in established markets is integrating more peripheral functions into the chip.

Advantages

  • Proximity of the two CPU cores on the same die allows the cache coherency circuitry to operate at a much higher clock rate than is possible if the signals have to travel off-chip, so combining equivalent CPUs on a single die significantly improves the performance of cache snoop operations.
  • Assuming that the die can physically fit into the package, dual-core CPU designs require much less PCB space than multi-chip SMP designs.
  • A dual-core processor uses slightly less power than two coupled single-core processors, principally because less power is needed to drive signals that stay on-chip, and because the smaller silicon process geometry allows the cores to operate at lower voltages.
  • Among the technologies competing for the available silicon die area, the dual-core design can reuse proven CPU core library designs and so carries a lower risk of design error than devising a new, wider core design. Also, adding more cache suffers from diminishing returns.

Disadvantages

  • Dual-core processors require operating system (OS) support to make optimal use of the second computing resource.[2] Making optimal use of multiprocessing in a desktop context also requires application software support.
  • The higher integration of the dual-core chip drives production yields down, and the chips are more difficult to manage thermally than lower-density single-core designs.
  • From an architectural point of view, single-CPU designs may ultimately make better use of the silicon surface area than multiprocessing cores, so a development commitment to this architecture may carry the risk of obsolescence.
  • Scaling efficiency is largely dependent on the application or problem set. For example, applications that process large amounts of data with low compute-overhead algorithms may find this architecture I/O-bound, leaving the device underutilized (see the illustrative calculation after this list).
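
The last point above reflects a more general rule that is commonly quantified with Amdahl's law (not named in the article): if only a fraction p of a workload can be parallelized, n cores give an ideal speedup of 1 / ((1 - p) + p/n). The C sketch below works this out for an assumed example value of p = 0.8; the figure is illustrative, not taken from the article.

    /* Illustrative sketch of Amdahl's law (the 0.8 parallel fraction is an
       assumed example value, not a figure from the article). */
    #include <stdio.h>

    static double amdahl_speedup(double p, int n)
    {
        return 1.0 / ((1.0 - p) + p / n);    /* serial part + parallel part on n cores */
    }

    int main(void)
    {
        double p = 0.8;                      /* assumed: 80% of the work parallelizes */
        for (int n = 1; n <= 4; n *= 2)
            printf("%d core(s): ideal speedup %.2f\n", n, amdahl_speedup(p, n));
        /* prints ~1.00, ~1.67, ~2.50: a dual core yields well under 2x here */
        return 0;
    }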

Licensing

Another issue that has surfaced in recent business development is the controversy over whether dual-core processors should be treated as two separate CPUs for software licensing purposes. Enterprise server software is typically licensed per processor, and some software manufacturers hold that a dual-core processor, while a single CPU, should be treated as two processors, with the customer charged for two licenses, one per core. The trend, however, seems to be to count a dual-core chip as a single processor, a view supported by Microsoft, IBM, Intel, and AMD. Oracle counts AMD and Intel dual-core CPUs as a single processor but applies different counting rules to other processor types. IBM and Microsoft count a multi-chip module as multiple processors. If multi-chip modules counted as one processor, CPU makers would have an incentive to build large, expensive multi-chip modules so their customers could save on software licensing. The industry thus appears to be moving slowly toward counting each die as a processor, regardless of how many cores it contains. Intel has released Paxville, which is actually a multi-chip module but which Intel markets as dual-core; it is not yet clear how licensing will apply to it. This remains an unresolved and thorny issue for software companies and their customers.

Notes

  1. ^ Digital signal processors (DSPs) have utilized dual-core architectures for much longer than high-end general-purpose processors. A typical example of a DSP-specific implementation is a combination of a RISC CPU and a DSP MPU, which allows the design of products that need a general-purpose processor for the user interface and a DSP for real-time data processing; this type of design is suited to e.g. mobile phones.
  2. ^ Two types of operating systems are able to utilize a dual-CPU multiprocessor: partitioned multiprocessing and symmetric multiprocessing (SMP). In a partitioned architecture, each CPU boots into a separate segment of physical memory and operates independently; under an SMP OS, processors work in a shared space, executing threads within the OS independently.

A multi-core processor is a chip with more than one processing unit (core). Typically, each core can execute multiple instructions at the same time and has its own cache.

The limitations of single-processor architecture

  • High clock frequencies impose an upper limit on chip size: at 100 GHz (0.01 ns per clock cycle), a signal travelling at roughly the speed of light (~300 mm/ns) can cross only about 3 mm in one cycle.
  • Long pipelines incur large penalties for branch mispredictions and wrong speculation.
  • Higher energy density increases localized heat output and makes cooling difficult.

Multicore architecture is a solution

A multi-core architecture is essentially an SMP system implemented on a single VLSI circuit. The goal is to allow greater exploitation of thread-level parallelism (TLP), especially for applications that lack sufficient instruction-level parallelism (ILP) to make good use of superscalar processors. Exploiting TLP at the chip level is usually called chip-level multiprocessing (CMP) or chip-level multithreading (CMT).
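
As an illustration of chip-level TLP, the sketch below (an assumed example, not taken from the article) splits an array sum across one POSIX thread per core reported by the operating system; sysconf(_SC_NPROCESSORS_ONLN) is the standard POSIX query for the number of online processors.

    /* Illustrative sketch: exploit thread-level parallelism by dividing a
       data-parallel loop across one worker thread per available core.
       Build with: cc -pthread */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define N 1000000

    static double data[N];

    struct chunk { int begin, end; double partial; };

    static void *sum_chunk(void *arg)
    {
        struct chunk *c = arg;
        c->partial = 0.0;
        for (int i = c->begin; i < c->end; i++)
            c->partial += data[i];           /* independent work: no sharing between threads */
        return NULL;
    }

    int main(void)
    {
        int ncores = (int)sysconf(_SC_NPROCESSORS_ONLN);
        if (ncores < 1) ncores = 1;

        for (int i = 0; i < N; i++)
            data[i] = 1.0;

        pthread_t *tid = malloc(ncores * sizeof *tid);
        struct chunk *ck = malloc(ncores * sizeof *ck);
        int step = N / ncores;

        for (int t = 0; t < ncores; t++) {   /* one thread per core */
            ck[t].begin = t * step;
            ck[t].end   = (t == ncores - 1) ? N : (t + 1) * step;
            pthread_create(&tid[t], NULL, sum_chunk, &ck[t]);
        }

        double total = 0.0;
        for (int t = 0; t < ncores; t++) {
            pthread_join(tid[t], NULL);
            total += ck[t].partial;
        }
        printf("%d core(s), sum = %.0f\n", ncores, total);   /* expected: 1000000 */
        free(tid);
        free(ck);
        return 0;
    }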

The characteristics of a CMP system

  • A "slow but wide" approach: improve the throughput of the whole computer system.
    • Good for transaction processing, database, and scientific computing applications.
    • No benefit for a single application that cannot be parallelized (divided into several tasks or threads).
  • Better data locality than conventional multi-processor architectures.
  • Better communication between processing units.
  • Saves space and energy.
  • Better cost/performance ratio than a single-core processor.

Example multicore processors

The whole microprocessor industry is moving to multi-core designs today, and the latest versions of most RISC architectures already use CMP.

Other microprocessor families are also expected to use CMP in future versions.

  • Intel's Itanium is expected to do so in the middle of 2006, with a release codenamed Montecito; then even more extensively in 2007 with a product codenamed Tukwila.
