Since the mid-noughties, desktop CPUs have offered multiple CPU cores in a single package. This design is known as a multicore processor. While early designs were limited to two or four CPU cores, modern CPUs offer up to 64 physical cores on a single chip. Core counts that high aren’t standard for desktop CPUs and are generally reserved for high-end workstations or servers; typical core counts in modern desktop CPUs are between 4 and 16. But what is it about multicore CPUs that makes them dominant in modern computers?
Historically, a single-core CPU was limited to performing only one task at a time. This comes with a whole range of issues. For example, on a modern computer, there are a huge number of background processes running. If a CPU can only process one thing at a time, these background processes must take processing time away from the foreground process. Additionally, cache misses mean that data needs to be retrieved from comparatively slow RAM. While that data is being fetched, the processor simply sits idle, as it can’t do anything until the data arrives. This holds up the running process as well as any other processes waiting on it to complete.
While modern single-core processors aren’t really a thing thanks to the rise of budget multicore CPUs, a hypothetical modern single-core design would still be able to use other modern tricks to run faster. A pipeline allows every stage of handling an instruction to be in use simultaneously, a significant performance boost over using just one pipeline stage per clock cycle. A wide pipeline lets multiple instructions be handled in each pipeline stage per clock cycle. Out-of-order execution allows instructions to be scheduled in a more time-efficient order. A branch predictor predicts the outcome of a branching instruction and speculatively executes the predicted path.
All of these techniques work well and provide real performance. Adding a second core, however, keeps all of that and, at a stroke, doubles the raw processing hardware available.
Adding a second core sounds like it should double the raw performance. Things are, unfortunately, more complicated than that. Program logic is often single-threaded, meaning that a program only tries to do one thing at any one time. What can happen, however, is that other processes can use the other core at the same time. While there is no inherent performance boost to most individual programs, the provision of an extra processing resource effectively reduces competition for a limited resource, which does provide a performance boost. This boost, which comes simply from reducing competition for CPU time, is most noticeable when jumping from a single-core to a dual-core CPU; there are diminishing returns from increasing the core count further, though more is generally better.
To take proper advantage of multicore systems and actually see a solid performance boost, programs need to be written to use multiple processing threads. Multithreaded logic is notoriously difficult to get right: it is hard to learn, and there are many potential pitfalls. One example pitfall is known as a race condition, where the outcome depends on the relative timing of two threads. In a typical race condition, one thread assumes that another thread it started will complete smoothly, then does something that relies on that assumption. For example, imagine a process starts another process to close one document and open another. If the original process doesn’t properly check that the second process has completed, the result is unpredictable. If there was an issue closing the first document, for example, it might still be open when the original process writes more data to it.
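The textbook race condition is a lost update on a shared counter. The sketch below, a minimal illustration using Python’s standard `threading` module (not code from any real application), shows an unsynchronised read-modify-write alongside a lock-protected version. The locked version always produces the correct total; the unlocked one can silently lose increments if threads interleave between the read and the write.

```python
import threading

ITERATIONS = 100_000
NUM_THREADS = 4

counter = 0
lock = threading.Lock()

def unsafe_increment():
    # Racy: "counter += 1" is a read-modify-write. Two threads can read
    # the same value, each add 1, and one update is silently lost.
    global counter
    for _ in range(ITERATIONS):
        counter += 1

def safe_increment():
    # The lock makes the read-modify-write atomic with respect to the
    # other threads, so no update can ever be lost.
    global counter
    for _ in range(ITERATIONS):
        with lock:
            counter += 1

def run(worker):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

unsafe_total = run(unsafe_increment)  # may come up short of 400,000
safe_total = run(safe_increment)      # always exactly 400,000
```

Note that CPython’s global interpreter lock serialises bytecode execution, so this demonstrates unsafe interleaving rather than true parallelism; in languages whose threads run fully in parallel across cores, the lost updates are far more likely to show up.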
One of the biggest issues multicore processors struggle with is heat. While one CPU core doesn’t output that much heat, many cores packed onto one die concentrate their heat output. In high-core-count CPUs, this concentration of heat can result in a lower boost clock as the CPU manages its temperature. A lower boost clock means lower performance in single-threaded applications. This can often be seen in gaming performance benchmarks. Video games are often highly reliant on a single thread, so single-threaded performance is often critical for gaming. High-core-count CPUs, like the 16-core models, are often built from high-performance silicon bins. Despite this, they can regularly be outperformed in single-threaded benchmarks by “lesser” CPUs with a lower core count. This issue is even more obvious in ultra-high-core-count CPUs like the 64-core AMD Threadripper, where the clock speed is noticeably lower than that of high-end desktop CPUs.
Many applications are able to make proper use of multiple CPU cores. CPU rendering, for example, is a relatively easy task to parallelise; performance improvements can be seen all the way up to 64 cores and beyond, though no single CPU currently offers more than 64 cores. Many other applications simply can’t be multithreaded because they rely on sequential logic. While these don’t see anywhere near the speed-up of a multithreaded program, the fact that other programs, multithreaded or not, can run on the other CPU cores still frees up processor time, allowing for better performance.
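As a toy stand-in for something like a render job, the sketch below (a hypothetical example using Python’s standard `multiprocessing.Pool`, with a deliberately CPU-bound prime-counting task) shows why such work parallelises so easily: the input range is split into chunks that need no data from one another, so each chunk can run on its own core.

```python
from multiprocessing import Pool, cpu_count

def count_primes(bounds):
    """Count primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def parallel_prime_count(limit, workers=None):
    workers = workers or cpu_count()
    # Split [0, limit) into one independent chunk per worker. No chunk
    # depends on any other, so the work scales across CPU cores.
    step = limit // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else limit)
              for i in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(count_primes, chunks))

if __name__ == "__main__":
    print(parallel_prime_count(10_000))  # 1229 primes below 10,000
```

`multiprocessing` sidesteps CPython’s global interpreter lock by using separate processes, one per core, which is the usual route to true multicore speed-ups in Python. The per-chunk splitting mirrors how a renderer hands out tiles or frames.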
In desktop processors, each CPU core within a multicore CPU has generally been identical. This homogeneity makes scheduling work onto the cores simple. Using the same repeating design also helps to keep development costs down. Mobile processors, however, have long used heterogeneous core architectures. In this design, there are two or even three tiers of CPU core. Each tier can run the same processes; however, some cores are designed for power efficiency, while others are tuned for performance. This has proven a recipe for success in battery-powered devices, as many tasks can run on the slower, more power-efficient cores, increasing battery life, while high-priority processes can still run at high speed when needed.
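The operating system exposes the core count and, on Linux, which cores a process is allowed to run on; schedulers on heterogeneous CPUs use the same affinity mechanism to steer background work onto efficiency cores. A small sketch, using only Python’s standard `os` module (the efficiency-core comment is illustrative, as real core numbering varies by CPU):

```python
import os

# How many logical CPUs the OS can schedule this process across.
total = os.cpu_count()

# On Linux, a process can be inspected for (and restricted to, via
# os.sched_setaffinity) a subset of cores. On a heterogeneous CPU,
# a scheduler could use this to pin background work to E-cores.
if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)  # set of core IDs, e.g. {0, 1, 2, 3}
    print(f"{len(allowed)} of {total} logical CPUs available")
```

`os.sched_getaffinity` is Linux-only, hence the `hasattr` guard; `os.cpu_count()` works everywhere.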
Desktop CPU architecture is also moving in the direction of a heterogeneous core design. Intel’s 12th-generation Alder Lake Core CPUs are the first desktop line to do this. In this case, the main driving factor behind the smaller cores isn’t necessarily power efficiency but thermal efficiency, though those are two sides of the same coin. Having a handful of powerful cores provides high performance, while many efficient cores can handle background tasks without disturbing the main cores too much.
A multicore CPU is a CPU that features multiple processing cores in a single package, often, though not exclusively, on the same die. Multicore CPUs don’t offer much of a direct performance boost to many programs; however, with more cores available, single-threaded programs don’t need to compete as much for CPU time. Some programs can take full advantage of multiple cores, making direct use of as many as are available. This provides a large performance boost, though due to thermal and power constraints the boost is not necessarily a straight doubling of performance with a doubling of cores.