If you’re not particularly interested in computers, you’d be forgiven for assuming that the only forms of memory in a computer were the hard drive and the RAM. Those of you with an interest in computers would likely know that this isn’t the case, and that the CPU also has a set of caches that are used to cache data from the RAM so that the CPU can access it faster. All of these are advertised features, primarily because the speed and/or capacity is a decent selling point and generally affects performance levels.
There is actually one other layer of memory though. As much as you might think the L1 cache is as close to the actual processing core as possible, there is another, higher layer in the memory hierarchy: the CPU registers. The reason these aren’t really advertised or mentioned is that they don’t really change from one CPU to the next. Technically they could, but the number and size of registers is fundamental to the architecture. This means that all x86-64 CPUs expose the same set of general purpose registers to software. They’re not marketed because they’re not a competing point.
What does a register do?
A register is a quickly available storage location for the processor. Access to a register is effectively immediate, whereas even the L1 cache has a latency of roughly 4-5 cycles in modern CPUs. This immediacy of access hints at the use case for registers. Registers are used to store the instructions actively being operated on by the CPU. They also store the data values that are to be processed. Some registers are general purpose, while others have a very specific purpose. An example of a special purpose register would be the program counter, which is where the processor tracks its position in its program sequence.
Many registers are considered user accessible. This doesn’t mean that the user of a computer can choose what value to put in them though. It means that the running software can specify data to be loaded into these registers. A smaller number of registers are internal, meaning that software can’t address them at all. The instruction register, which stores the instruction currently being executed, is an example of an internal register.
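To make the roles of these registers a little more concrete, here is a minimal sketch of a toy processor loop in Python. It is not how any real CPU is built. The instruction names (load, add, jump_if_zero) and the three registers r0 to r2 are made up for illustration. The point is only to show a program counter tracking the position in the program, an instruction register holding the instruction being executed, and general purpose registers that the program itself can name.

```python
def run(program):
    """Run a toy program and return the final general purpose registers."""
    regs = {"r0": 0, "r1": 0, "r2": 0}  # registers the program can address
    pc = 0                               # program counter: index of the next instruction
    while pc < len(program):
        ir = program[pc]                 # instruction register: internal, the program cannot name it
        pc += 1                          # by default, move on to the following instruction
        op, *args = ir
        if op == "load":                 # load dst, immediate value
            dst, value = args
            regs[dst] = value
        elif op == "add":                # add dst, src1, src2
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "jump_if_zero":       # jump_if_zero reg, target index
            reg, target = args
            if regs[reg] == 0:
                pc = target              # a jump is just a write to the program counter
        else:
            raise ValueError(f"unknown op: {op}")
    return regs


program = [
    ("load", "r0", 2),
    ("load", "r1", 3),
    ("add", "r2", "r0", "r1"),
]
print(run(program))  # {'r0': 2, 'r1': 3, 'r2': 5}
```

Notice that the program can mention r0, r1 and r2 by name, but it has no way to refer to ir. That is the difference between a user accessible register and an internal one.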
Register renaming
While a CPU architecture may only allow for a single configuration of registers, there’s actually a bit of nuance to that. Modern out-of-order CPUs make use of register renaming. This is a technique where the CPU has more physical registers than the architecture names, and uses the extras to pre-load data or to hold a value belonging to an out-of-order instruction that would otherwise have been overwritten. The CPU keeps track of which physical register currently stands behind each architectural name. When it needs the data in one of the extra registers, it simply remaps the name to that register so that it becomes addressable, and the physical register that previously held the name is no longer addressable.
The process of register renaming can be very helpful for out-of-order execution. For example, suppose one instruction reads a register and a later instruction writes to that same register. If the instructions are executed in that order, this is fine. If, however, the instructions are reordered so that the write is performed first, the read instruction would get the wrong value. To prevent this, the write is directed to an unused physical register, which leaves the original value intact for the read. That new register then takes over the architectural name for any instructions that follow.
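The idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of any particular CPU. The rename function, the p0, p1, ... physical register names and the free list are all hypothetical, and the sketch never returns physical registers to the free list, which a real processor does once the instructions that needed the old value have completed.

```python
def rename(instructions, arch_regs, num_physical):
    """Rewrite architectural register names into physical register names.

    Each instruction is (destination, (sources...)) using architectural names.
    """
    # To begin with, each architectural name maps to its own physical register.
    table = {name: f"p{i}" for i, name in enumerate(arch_regs)}
    # The remaining physical registers are spare.
    free = [f"p{i}" for i in range(len(arch_regs), num_physical)]
    renamed = []
    for dst, srcs in instructions:
        # Reads use whichever physical register currently holds the value.
        phys_srcs = tuple(table[s] for s in srcs)
        # Every write is given a fresh physical register.
        if not free:
            raise RuntimeError("out of physical registers; a real CPU would stall here")
        new_phys = free.pop(0)
        table[dst] = new_phys
        renamed.append((new_phys, phys_srcs))
    return renamed, table


# The first instruction reads r1, the second one overwrites r1.
instructions = [
    ("r2", ("r1",)),  # r2 = something computed from r1
    ("r1", ("r3",)),  # r1 = something computed from r3
]
renamed, table = rename(instructions, ["r1", "r2", "r3"], num_physical=6)
print(renamed)  # [('p3', ('p0',)), ('p4', ('p2',))]
print(table)    # {'r1': 'p4', 'r2': 'p3', 'r3': 'p2'}
```

After renaming, the first instruction reads p0 while the second writes p4, so the two no longer touch the same storage and can be executed in either order. The name r1 now points at p4, which is the sense in which the register has been renamed.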
Conclusion
Registers are the highest tier in the memory hierarchy. They are the storage the CPU works on directly and have effectively no access latency. Registers are used to store the instructions and data actively being processed by the CPU. They are also used to store other values such as the program counter, which keeps track of which instruction is the next one to be executed. Only a very limited number of registers is available, with the x86-64 architecture having 16 general purpose registers and either 16 or 32 vector registers, which also hold floating point values, depending on whether AVX-512 is supported.