Any CPU is designed from the ground up to support a particular instruction set. An instruction set is a set of hard-coded operations that the CPU can perform. These operations can be, for example, adding two numbers together, jumping to a different part of the program, or comparing two values. Each operation the CPU can perform is identified by an opcode, a numeric code that is unique within that instruction set.
Opcodes
When executing a program, the CPU uses a program counter to track which instruction needs to be performed next. When an instruction is fetched, the program counter is incremented by the length of the instruction so that it points to the start of the next instruction. Each instruction consists of an opcode and, depending on the instruction, zero or more operands. An operand may be a constant value, a reference to a CPU register, or the address of a value in system RAM.
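To make the program counter concrete, here is a minimal sketch in Python. The four opcodes, their byte values, and their lengths are a hypothetical toy instruction set invented for illustration, not any real CPU's encoding. The function walks a byte string and advances the counter by each instruction's length to find where the next instruction starts.

```python
# Hypothetical toy instruction set (an assumption for this example):
# opcode byte -> (mnemonic, total instruction length in bytes)
INSTRUCTION_LENGTHS = {
    0x00: ("NOP", 1),   # opcode only, no operands
    0x01: ("LOAD", 2),  # opcode + one constant operand
    0x02: ("ADD", 2),   # opcode + one constant operand
    0x03: ("JMP", 2),   # opcode + one address operand
}


def instruction_starts(program: bytes) -> list[tuple[int, str]]:
    """Return (address, mnemonic) for every instruction in the program."""
    pc = 0  # the program counter
    starts = []
    while pc < len(program):
        mnemonic, length = INSTRUCTION_LENGTHS[program[pc]]
        starts.append((pc, mnemonic))
        pc += length  # now points at the start of the next instruction
    return starts


program = bytes([0x01, 0x05, 0x00, 0x02, 0x03, 0x03, 0x00])
print(instruction_starts(program))
# [(0, 'LOAD'), (2, 'NOP'), (3, 'ADD'), (5, 'JMP')]
```

Note that the byte 0x03 at address 4 is treated as the operand of ADD, not as a JMP opcode. The same byte value can be an opcode or data depending on where the program counter lands. The sketch only finds instruction boundaries; it does not follow the jump.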
Once the complete instruction has been fetched, it needs to be decoded. This is the process where the CPU separates the opcode and any operands. The decoded opcode is used to enable or disable specific electrical pathways in the CPU that will result in the correct operation.
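In software terms, decoding resembles pulling bit fields out of a number. The sketch below assumes a made-up 16-bit instruction format (4 bits of opcode, 4 bits of register number, 8 bits of constant); real architectures use their own layouts, and the `Decoded` type and `decode` function are hypothetical names.

```python
from typing import NamedTuple


class Decoded(NamedTuple):
    opcode: int     # which operation to perform
    register: int   # operand: a register number
    immediate: int  # operand: a constant value


def decode(word: int) -> Decoded:
    """Split a 16-bit instruction word into its opcode and operands.

    Assumed layout (hypothetical): bits 15-12 opcode, 11-8 register, 7-0 constant.
    """
    return Decoded(
        opcode=(word >> 12) & 0xF,
        register=(word >> 8) & 0xF,
        immediate=word & 0xFF,
    )


print(decode(0x2A07))
# Decoded(opcode=2, register=10, immediate=7)
```

In hardware there is no function call: the opcode bits directly drive the control signals. The shifting and masking here only mirrors the idea of separating the fields.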
Once the instruction has been decoded, it will be executed. The exact behavior of the CPU will depend on the operation. An addition operation will add two values together. A jump operation will calculate where in the program to jump to. A compare operation will compare two values. A NOP operation does nothing except move on to the next instruction, as NOP stands for No Operation.
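Most instructions will then output the result of the operation. This output can go to the processor registers and, if necessary, to system RAM. In a simplified model, each of these stages (fetch, decode, execute, and write-back) takes a single clock cycle to complete. In real CPUs, some stages can take many cycles, for example when an instruction has to wait for data from system RAM.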
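The whole cycle can be sketched as a small interpreter loop. This is a toy model under stated assumptions: the six opcodes, their byte values, the four registers, and the `zero_flag` are all invented for illustration and do not correspond to a real CPU.

```python
# Hypothetical toy opcodes and instruction lengths (assumptions for this sketch).
NOP, LOAD, ADD, CMP, JNE, HALT = 0x00, 0x01, 0x02, 0x03, 0x04, 0xFF
LENGTHS = {NOP: 1, LOAD: 3, ADD: 3, CMP: 3, JNE: 2, HALT: 1}


def run(program: bytes) -> list[int]:
    """Fetch, decode, and execute instructions until HALT; return the registers."""
    registers = [0, 0, 0, 0]
    zero_flag = False  # set by CMP when the two values are equal
    pc = 0
    while True:
        # Fetch: read the opcode and its operands, then advance the program counter.
        opcode = program[pc]
        length = LENGTHS[opcode]
        operands = program[pc + 1 : pc + length]
        pc += length
        # Decode and execute: pick the behavior that matches the opcode.
        if opcode == NOP:
            pass  # no operation
        elif opcode == LOAD:
            reg, value = operands
            registers[reg] = value  # write the result back to a register
        elif opcode == ADD:
            dst, src = operands
            registers[dst] = (registers[dst] + registers[src]) & 0xFF
        elif opcode == CMP:
            a, b = operands
            zero_flag = registers[a] == registers[b]
        elif opcode == JNE:
            (target,) = operands
            if not zero_flag:
                pc = target  # jump: overwrite the program counter
        elif opcode == HALT:
            return registers


program = bytes([
    LOAD, 0, 0,   # address 0:  r0 = 0
    LOAD, 1, 1,   # address 3:  r1 = 1
    LOAD, 2, 3,   # address 6:  r2 = 3
    ADD, 0, 1,    # address 9:  r0 = r0 + r1
    CMP, 0, 2,    # address 12: is r0 equal to r2?
    JNE, 9,       # address 15: if not, jump back to address 9
    NOP,          # address 17
    HALT,         # address 18
])
print(run(program))
# [3, 1, 3, 0]
```

The loop adds 1 to the first register until it equals 3, then falls through to NOP and HALT. A jump is nothing more than a write to the program counter, which is why the fetch step advances it before the instruction executes.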
Illegal Opcodes
Each CPU architecture has its specific list of opcodes published by the manufacturer. The values of these opcodes aren’t necessarily the same across architectures, which is one reason why software needs to be compiled separately for each architecture. In some cases, the CPU also responds to opcodes that the manufacturer hasn’t documented. These are referred to as “illegal opcodes.” Illegal opcodes, while undocumented, will generally perform the same function every time they’re called on a given CPU. As undocumented and non-standard features, though, updates to the CPU architecture can simply remove them.
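As a small illustration of opcode values differing between architectures, the sketch below looks up two byte values in two tiny tables. The four encodings are real single-byte opcodes from x86 and the MOS 6502, but the tables are deliberately incomplete, and the `MEANING` and `describe` names are made up for this example.

```python
# Two real byte values and what they mean as an opcode on two architectures.
# Simplified: on x86, 0xEA is a far jump in 16-bit and 32-bit modes only.
MEANING = {
    "x86": {0x90: "NOP", 0xEA: "JMP (far jump)"},
    "6502": {0x90: "BCC (branch if carry clear)", 0xEA: "NOP"},
}


def describe(byte: int) -> dict[str, str]:
    """Return what the given opcode byte means on each architecture in the table."""
    return {
        arch: table.get(byte, "not in this table")
        for arch, table in MEANING.items()
    }


for byte in (0x90, 0xEA):
    print(f"0x{byte:02X}: {describe(byte)}")
# 0x90: {'x86': 'NOP', '6502': 'BCC (branch if carry clear)'}
# 0xEA: {'x86': 'JMP (far jump)', '6502': 'NOP'}
```

The byte that means "do nothing" on one CPU is a branch on the other, so machine code built for one architecture is meaningless on the other.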
Some early computer games on the Apple II relied on specific illegal opcodes. They then suffered performance and stability issues on the later Apple IIc, as its revised CPU removed the illegal opcodes the games required. Illegal opcodes were also used in copy protection schemes as a form of security through obscurity, to make it harder for pirates to crack the protected content. Some illegal opcodes are simply meant as debugging tools and error handlers.
The x86 instruction set contains a large number of undocumented opcodes. Interestingly, some of these are shared between Intel and AMD CPUs, suggesting that both companies are aware of them and their purpose even though they remain undocumented.
Compilers and Assembly
Most programs are written in high-level languages. These are relatively easy to read, often using English words or shorthand to minimize learning curves. For a computer to execute these programs, they need to be compiled. A compiler is basically a translator. It takes the high-level code and converts it into machine code, the instructions that the CPU can understand.
It’s also possible, in some languages, to run uncompiled code through a previously compiled program, such as an interpreter or just-in-time compiler, that executes it or generates machine code on the fly. Assembly is a low-level programming language that uses shorthand names, called mnemonics, in place of numeric opcodes. This gives developers direct visibility and control over the operations performed. NOP is an example of an assembly mnemonic.
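The relationship between assembly shorthand and opcodes can be sketched with a very small assembler. The four encodings below are real MOS 6502 opcodes, but everything else is an assumption made for illustration: the `assemble` function is hypothetical, it supports only these four instructions, and it accepts only one-byte constants written as `#$hh`.

```python
# A few real MOS 6502 encodings: mnemonic -> (opcode byte, number of operand bytes).
OPCODES = {
    "NOP": (0xEA, 0),  # no operation
    "CLC": (0x18, 0),  # clear the carry flag
    "LDA": (0xA9, 1),  # load the accumulator with a constant
    "ADC": (0x69, 1),  # add a constant (plus carry) to the accumulator
}


def assemble(source: str) -> bytes:
    """Translate assembly text into machine code, one line at a time."""
    machine_code = bytearray()
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        mnemonic, *args = line.split()
        opcode, operand_count = OPCODES[mnemonic.upper()]
        if len(args) != operand_count:
            raise ValueError(f"{mnemonic} expects {operand_count} operand(s)")
        machine_code.append(opcode)
        for arg in args:
            # "#$05" means the constant hexadecimal 05
            machine_code.append(int(arg.lstrip("#$"), 16))
    return bytes(machine_code)


source = """
    CLC         ; clear the carry flag
    LDA #$05    ; accumulator = 5
    ADC #$03    ; accumulator = 5 + 3
    NOP         ; do nothing
"""
print(assemble(source).hex(" "))
# 18 a9 05 69 03 ea
```

Each line of assembly maps to one instruction, which is what gives the developer direct control. A compiler for a high-level language performs a far more involved translation, but its output is the same kind of byte sequence.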
Conclusion
An instruction set is the list of official operations that a CPU architecture can perform. These operations are hard-coded into the CPU and called using their respective opcodes.
Software generally uses a compiler to translate from human-readable high-level code to the machine code that the CPU can read. Sometimes, a CPU architecture can have undocumented opcodes, called illegal opcodes. Illegal opcodes are technically part of the instruction set. However, they may not be reliably available in future platform iterations.