Introduction
As computers have evolved, they have come to rely on a diverse range of memory types, each optimized for a specific function, and the memory hierarchy is designed to balance speed, cost, and capacity. Around six types of memory play a crucial role in a modern computer: 1. processor memory, 2. main memory, 3. storage memory, 4. graphics memory, 5. virtual and extended memory, and 6. specialized non-volatile memory. Processor memory has two parts: registers and cache memory. In this blog we will cover registers in detail. Whether you’re a student, an enthusiast, or a professional hardware designer, understanding CPU registers is key to grasping how computers truly function at the lowest level.
Processor Memory
There are two main types of processor memory: register memory and cache memory. Registers are small, ultra-fast storage locations built directly into the CPU. An Intel CPU, for example, is typically described as a Von Neumann architecture machine, whose three main components are the CPU (which contains the registers), memory, and I/O devices. Registers hold the data the processor is currently working with, such as numbers and addresses, so that operations on them are quick.
As an example, we will take the Intel Core i9 to understand registers better. The registers in an Intel Core i9 processor are divided by purpose into general purpose registers, segment registers, floating point registers, SIMD registers, the instruction pointer, the flag register, control registers, and debug registers.
General Purpose Register
These registers can be used by most instructions, but each one also has a conventional purpose, and some instructions use a particular register implicitly. There are 16 general purpose registers in the Intel Core i9 (in 64-bit mode):
RAX – This register is 64 bits (8 bytes) in size and is used as the accumulator for arithmetic operations, calculations, and storing results. It is commonly used for function return values.
RBX – This register is 64 bits (8 bytes) in size and is used as the base register. It stores a pointer or base address.
RCX – This register is 64 bits (8 bytes) in size and is used as the counter register, for example to count loop iterations.
RDX – This register is 64 bits (8 bytes) in size and is used as the data register. It is used in I/O operations, and it typically holds the upper half of the operand or result in multiplication and division.
RSI – This register is 64 bits (8 bytes) in size and is used as the source index. It is used for string operations and as the source pointer in memory operations.
RDI – This register is 64 bits (8 bytes) in size and is used as the destination index. It is used for string operations and as the destination pointer in memory operations.
RBP – This register is 64 bits (8 bytes) in size and is used as the base pointer. It marks the base of the current stack frame and is used to reference function arguments and local variables.
RSP – This register is 64 bits (8 bytes) in size and is used as the stack pointer. It points to the top of the stack and is updated by every push and pop.
R8 to R15 – These registers are 64 bits (8 bytes) each and provide additional storage for calculations and for passing function arguments.
So in total \(16 \times 8\ \text{bytes} = 128\ \text{bytes}\) of general purpose registers.
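To make the role of RSP concrete, here is a minimal Python sketch of how push and pop move the stack pointer. It is only a toy model: the class name `ToyStack`, the starting address, and the dictionary used as memory are assumptions made for illustration, not how a real CPU is implemented. The one fact it relies on is that in 64-bit mode a push lowers RSP by 8 bytes before storing the value, and a pop reads the value and then raises RSP by 8 bytes.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


class ToyStack:
    """Toy model of RSP-based push/pop (illustrative, not a CPU emulator)."""

    def __init__(self, top=0x7FFF_FFFF_F000):
        self.rsp = top      # hypothetical initial top-of-stack address
        self.memory = {}    # sparse memory: address -> 64-bit value

    def push(self, value):
        # The stack grows downward: decrement RSP first, then store.
        self.rsp = (self.rsp - 8) & MASK64
        self.memory[self.rsp] = value & MASK64

    def pop(self):
        # Load from the current top, then move RSP back up.
        value = self.memory.pop(self.rsp)
        self.rsp = (self.rsp + 8) & MASK64
        return value


if __name__ == "__main__":
    stack = ToyStack(top=0x1000)
    stack.push(0x11)
    stack.push(0x22)
    print(hex(stack.rsp))        # RSP after two pushes
    print(hex(stack.pop()))      # last value pushed comes out first
```

Because the last value pushed is the first one popped, nested function calls can store their return addresses and local variables on the same stack without overwriting each other.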
Segment Register
Segment registers see minimal use today, but they are still needed for legacy programs, operating system management, and stack management.
CS – This register is 16 bits (2 bytes) in size. It is the code segment register and points to the segment where the executable code is held.
DS – This register is 16 bits (2 bytes) in size. It points to the data segment in memory and is generally used for global variables.
ES – This register is 16 bits (2 bytes) in size. It is the extra segment register, used for string operations and additional data storage.
FS – This register is 16 bits (2 bytes) in size. It is used for operating system (OS) specific purposes.
GS – This register is 16 bits (2 bytes) in size. It is similar to FS and is used for operating system specific data.
SS – This register is 16 bits (2 bytes) in size. It points to the stack segment, which stores function return addresses and local variables.
In total \(6 \times 2\ \text{bytes} = 12\ \text{bytes}\) of segment registers.
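For the legacy case, the way a segment register takes part in addressing can be sketched in a few lines of Python. The sketch assumes 8086-style real mode, where the physical address is the 16-bit segment value shifted left by 4 bits plus a 16-bit offset, wrapping at 1 MB as on the original 8086. The function name `real_mode_address` is made up for this example, and the sketch does not describe 64-bit mode, where most segment bases are treated as zero.

```python
def real_mode_address(segment, offset):
    """Physical address for 8086-style real mode: segment * 16 + offset."""
    segment &= 0xFFFF   # segment registers are 16 bits wide
    offset &= 0xFFFF    # real-mode offsets are 16 bits wide
    # Assumption: wrap to 20 bits, as on the original 8086 address bus.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(hex(real_mode_address(0x1234, 0x5678)))
```

Many different segment and offset pairs map to the same physical address, which is one reason this scheme was replaced by flat addressing in modern operating systems.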
Floating Point Register
FPU registers are there for extended precision floating point arithmetic. Extended precision on x86 uses an 80-bit format, which is consistent with the extended formats that the IEEE 754 standard allows. We have explained single precision in this blog, and extended precision is similar and easy to understand once you know it. The registers ST(0) to ST(7) are dedicated to the FPU, and each is 80 bits wide. They are mainly used for trigonometric calculations, scientific computation, etc. Another set of registers comes combined with the FPU: the MMX registers MM0 to MM7, each 64 bits wide, which are used for integer vector processing in multimedia applications. The MMX registers are aliased onto the low 64 bits of the FPU registers, so the two sets share the same physical storage. Both types have largely been replaced by the SIMD registers, but they are still present in the processor. Counting each set by name, 8 FPU registers of 80 bits (10 bytes) each and 8 MMX registers of 64 bits (8 bytes) each come to \(8 \times 10 + 8 \times 8 = 144\) bytes.
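To see what the 80 bits of an FPU register contain, the following Python sketch decodes an 80-bit extended precision value into an ordinary Python float. It assumes the usual x87 layout: 1 sign bit, a 15-bit exponent with a bias of 16383, and a 64-bit significand whose top bit is an explicit integer bit. The function name `decode_x87_extended` is made up for this example, and the sketch handles only zero and normal numbers, so infinities, NaNs, and denormals are rejected rather than decoded.

```python
import math


def decode_x87_extended(bits80):
    """Decode an 80-bit x87 extended-precision pattern (normal numbers only)."""
    sign = (bits80 >> 79) & 1
    exponent = (bits80 >> 64) & 0x7FFF          # 15-bit biased exponent
    significand = bits80 & ((1 << 64) - 1)      # 64 bits, explicit integer bit

    if exponent == 0x7FFF:
        raise ValueError("infinity/NaN are not handled in this sketch")
    if exponent == 0:
        if significand == 0:
            return -0.0 if sign else 0.0
        raise ValueError("denormals are not handled in this sketch")

    # value = significand / 2**63 * 2**(exponent - 16383)
    # Converting to a Python float keeps only 53 significant bits, and
    # math.ldexp raises OverflowError for values beyond the double range.
    magnitude = math.ldexp(float(significand), exponent - 16383 - 63)
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    print(decode_x87_extended(0x3FFF8000000000000000))  # bit pattern of 1.0
```

The extra exponent and significand bits are the reason the FPU can keep intermediate results more precisely than a 64-bit double.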
SIMD Register
SIMD stands for single instruction, multiple data. These registers are used to process several data elements in parallel with a single instruction. They are useful for 3D rendering, cryptography, machine learning, etc.
XMM0 to XMM15 are called the SSE registers and are used for vector processing of 4 single precision floats or 2 double precision floats. Each is 128 bits in size.
YMM0 to YMM15 are called the AVX registers and are used for vector processing of 8 single precision floats or 4 double precision floats. Each is 256 bits in size.
ZMM0 to ZMM31 are called the AVX-512 registers (present on Core i9 models that support AVX-512) and are used for vector processing of 16 single precision floats or 8 double precision floats. Each is 512 bits in size.
XMM0 to XMM15 and YMM0 to YMM15 are not separate registers. Each XMM register is the low 128 bits of the corresponding YMM register, which in turn is the low 256 bits of the corresponding ZMM register, so they all reside in the full-width registers ZMM0 to ZMM31. The total size is therefore \(32 \times 64\ \text{bytes} = 2048\ \text{bytes}\).
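The overlap between the XMM, YMM, and ZMM names can be pictured with a small Python sketch. Here one 64-byte buffer stands for a full-width register, and the narrower names are simply views of its low bytes. The class `VectorRegister` and the helper `lanes` are hypothetical names for this illustration. The write method is also a simplification: it leaves the upper bytes untouched, whereas on real hardware the upper bits are either preserved or zeroed depending on how the instruction is encoded.

```python
import struct


def lanes(register_bits, element_bits):
    """How many elements of a given width fit in one vector register."""
    return register_bits // element_bits


class VectorRegister:
    """Toy model: one 512-bit register with 256-bit and 128-bit views."""

    def __init__(self):
        self.zmm = bytearray(64)        # 512 bits of storage

    @property
    def ymm(self):
        return bytes(self.zmm[:32])     # low 256 bits of the same storage

    @property
    def xmm(self):
        return bytes(self.zmm[:16])     # low 128 bits of the same storage

    def write_xmm_floats(self, values):
        # Pack exactly 4 single-precision floats into the low 16 bytes.
        self.zmm[:16] = struct.pack("<4f", *values)

    def read_floats(self, width_bytes):
        # Interpret the low width_bytes as little-endian 32-bit floats.
        count = width_bytes // 4
        return struct.unpack(f"<{count}f", bytes(self.zmm[:width_bytes]))


if __name__ == "__main__":
    reg = VectorRegister()
    reg.write_xmm_floats([1.0, 2.0, 3.0, 4.0])
    print(reg.read_floats(32))          # the YMM view sees the same data
    print(lanes(512, 32), lanes(512, 64))
```

Writing through the XMM name changes what the YMM and ZMM names see, which is why only the 32 full-width registers are counted in the total.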
Instruction Pointer
The register named RIP is used as the instruction pointer. Its size is 64 bits (8 bytes). It stores the memory address of the next instruction to be executed. The value of RIP is updated to point to the next instruction as each instruction is fetched. It holds a 64-bit-wide address in the virtual memory space.
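The way RIP advances can be sketched with a toy fetch loop in Python. The instruction set below is entirely hypothetical (the opcodes, names, and lengths are invented for this example and are not real x86 encodings). What it illustrates is that instructions have different lengths, so the instruction pointer moves forward by the length of each instruction it fetches.

```python
# Hypothetical toy instruction set: opcode -> (name, total length in bytes).
TOY_ISA = {
    0x01: ("NOP", 1),
    0x02: ("LOAD_IMM8", 2),    # opcode + 1-byte immediate
    0x03: ("LOAD_IMM32", 5),   # opcode + 4-byte immediate
    0xFF: ("HALT", 1),
}


def trace_rip(program, start=0):
    """Return (address, name, next_rip) for each instruction until HALT."""
    rip = start
    trace = []
    while True:
        opcode = program[rip]              # fetch the opcode at RIP
        name, length = TOY_ISA[opcode]     # decode it to find its length
        next_rip = rip + length            # RIP now points past this instruction
        trace.append((rip, name, next_rip))
        if name == "HALT":
            break
        rip = next_rip
    return trace


if __name__ == "__main__":
    program = bytes([0x01, 0x02, 0x2A, 0x03, 0x00, 0x00, 0x00, 0x01, 0xFF])
    for address, name, next_rip in trace_rip(program):
        print(address, name, next_rip)
```

A jump or call instruction would simply load a different address into RIP instead of the next sequential one, which this sketch leaves out.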
Flag Register
The flag register is also called RFLAGS. It contains status flags that indicate the results of operations, along with flags that control some aspects of program execution. It is made up of multiple individual flags and is 64 bits (8 bytes) in total in an Intel Core i9 processor, although the upper 32 bits are reserved.
CF stands for carry flag. It is set (set means 1 or ON) when an arithmetic operation causes a carry out of the most significant bit (MSB), which happens when an unsigned addition overflows. CF is also set when a subtraction requires a borrow. Otherwise, when an addition produces no carry and a subtraction requires no borrow, CF is clear (clear means 0 or OFF).
PF stands for parity flag. It is set when the number of set bits in the least significant byte of the result is even.
AF stands for auxiliary carry flag. It is used in binary coded decimal arithmetic and is set when there is a carry from bit 3 to bit 4.
ZF stands for zero flag. It is set when the result of an operation is zero.
SF stands for sign flag. It is set if the result is negative, i.e. the MSB is set.
TF stands for trap flag. It makes the CPU stop after each instruction so that a debugger can take control.
IF stands for interrupt flag. It is set when maskable interrupts are enabled and cleared when they are disabled.
DF stands for direction flag. It determines the direction of string operations, i.e. whether the pointer increments or decrements.
OF stands for overflow flag. It is set if a signed arithmetic operation overflows, i.e. the result does not fit in the signed range of the destination.
ID stands for identification flag. If software is able to set and clear this flag, the processor supports the CPUID instruction, which provides processor information.
IOPL stands for I/O privilege level. It is a 2-bit field that specifies the privilege level required for I/O operations. There are 4 privilege levels, 0 to 3. Level 0 is called kernel mode, in which a program has full control over the system hardware. It is used by operating system kernels and device drivers, e.g. the Linux kernel, the Windows kernel, etc.
Level 1 is called privileged mode, where a program has limited access to the system hardware. Level 2 is called restricted mode, where a program has even less access to the system hardware. Level 3 is called user mode, where programs such as web browsers and games have no direct access to the system hardware. Levels 1 and 2 are rarely used. Modern computers also have another level, informally called level -1, which allows multiple operating systems to run on one machine without interfering with each other. It is used by hypervisors for virtualization.
NT stands for nested task flag. It is used for task switching and indicates that the current task is nested within another task.
RF stands for resume flag. It is used in debugging and allows normal execution to resume after an instruction breakpoint is hit.
VM stands for virtual 8086 mode flag. It allows the system to run 16-bit real-mode programs in a protected environment.
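The status flags are easier to follow with a worked example. The Python sketch below adds two numbers the way a CPU would at a chosen register width and reports the six arithmetic flags described above. The function name `add_flags` and the default width of 8 bits are choices made for this illustration (a small width keeps the numbers readable), and the sketch covers addition only.

```python
def add_flags(a, b, width=8):
    """Add two unsigned values of the given bit width and compute status flags."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    a &= mask
    b &= mask
    full = a + b                 # exact sum, may need one extra bit
    result = full & mask         # what actually fits in the register

    flags = {
        # Carry out of the most significant bit (unsigned overflow).
        "CF": int(full > mask),
        # Even number of set bits in the least significant byte.
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),
        # Carry from bit 3 into bit 4.
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),
        # Result is zero.
        "ZF": int(result == 0),
        # Most significant bit of the result is set.
        "SF": int(bool(result & sign_bit)),
        # Signed overflow: operands share a sign, result has the other sign.
        "OF": int(bool(~(a ^ b) & (a ^ result) & sign_bit)),
    }
    return result, flags


if __name__ == "__main__":
    print(add_flags(0xFF, 0x01))   # unsigned wrap-around
    print(add_flags(0x7F, 0x01))   # signed overflow
```

The two sample calls show the difference between CF and OF: 0xFF + 0x01 wraps around as an unsigned value, while 0x7F + 0x01 crosses the signed limit of an 8-bit number.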
Control And Debug Registers
The control registers are CR0, CR2, CR3, CR4, and CR8, each 64 bits, so the total is \(5 \times 8\ \text{bytes} = 40\ \text{bytes}\).
CR0 is used to enable or disable protected mode, control caching, etc.
CR2 stores the address that caused the last page fault.
CR3 holds the page table base address for virtual memory translation.
CR4 controls various CPU features.
CR8 is used to control task priority.
The debug registers are DR0, DR1, DR2, DR3, DR6, and DR7, each 64 bits, so the total is \(6 \times 8\ \text{bytes} = 48\ \text{bytes}\).
DR0 to DR3 hold hardware breakpoint addresses.
DR6 is used as a status register indicating which breakpoint was hit.
DR7 is used to control which breakpoints are active and their types.
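As an illustration of how a control register packs several switches into one value, the Python sketch below decodes a few well-known CR0 bits: PE (protected mode enable, bit 0), WP (write protect, bit 16), NW (not write-through, bit 29), CD (cache disable, bit 30), and PG (paging, bit 31). The function name `decode_cr0` and the sample value are assumptions for this example. Reading the real CR0 requires kernel privilege, so the sketch works on a number you pass in.

```python
# Bit positions of a few CR0 fields (only a subset is listed here).
CR0_BITS = {
    "PE": 0,    # protected mode enable
    "WP": 16,   # write protect
    "NW": 29,   # not write-through (cache control)
    "CD": 30,   # cache disable
    "PG": 31,   # paging enable
}


def decode_cr0(value):
    """Return a dict mapping each listed CR0 field to True/False."""
    return {name: bool((value >> bit) & 1) for name, bit in CR0_BITS.items()}


if __name__ == "__main__":
    sample = 0x80050033   # hypothetical sample value, not read from hardware
    for name, enabled in decode_cr0(sample).items():
        print(name, enabled)
```

In the sample value, protected mode and paging are on and the cache is not disabled, which is the kind of configuration an operating system sets up early during boot.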
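How DR7 describes the four breakpoints can also be sketched in Python. The sketch assumes the standard DR7 layout: for breakpoint n, the local and global enable bits sit at bits 2n and 2n+1, a 2-bit type field sits at bit 16 + 4n, and a 2-bit length field sits at bit 18 + 4n. The function name `decode_dr7` and the sample value are made up for this example, and the value is passed in as a number because the real register is only accessible to privileged code.

```python
# Meaning of the 2-bit type and length fields for each breakpoint.
RW_TYPES = {0b00: "execute", 0b01: "write", 0b10: "io", 0b11: "read/write"}
LEN_BYTES = {0b00: 1, 0b01: 2, 0b10: 8, 0b11: 4}


def decode_dr7(dr7):
    """Describe breakpoints 0 to 3 as configured by a DR7 value."""
    breakpoints = []
    for n in range(4):
        breakpoints.append({
            "index": n,
            "local": bool((dr7 >> (2 * n)) & 1),        # Ln enable bit
            "global": bool((dr7 >> (2 * n + 1)) & 1),   # Gn enable bit
            "type": RW_TYPES[(dr7 >> (16 + 4 * n)) & 0b11],
            "length": LEN_BYTES[(dr7 >> (18 + 4 * n)) & 0b11],
        })
    return breakpoints


if __name__ == "__main__":
    # Hypothetical value: breakpoint 0 locally enabled, on 4-byte writes.
    for entry in decode_dr7(0x000D0001):
        print(entry)
```

A debugger that sets a hardware watchpoint writes the watched address into one of DR0 to DR3 and then fills in the matching fields of DR7 in this way.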
So in the Intel Core i9 processor we have general purpose registers 128 bytes + segment registers 12 bytes + SIMD registers 2048 bytes + floating point and MMX registers 144 bytes (\(8 \times 10 + 8 \times 8\)) + control and debug registers 88 bytes + instruction pointer 8 bytes + flag register 8 bytes = 2436 bytes, nearly 2.4 KB.
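The grand total can be double-checked with a short Python tally. The table below simply restates the register counts and widths given in each section. The names `REGISTER_GROUPS` and `total_bytes` are made up for this example. Note that the floating point and MMX group works out to 8 × 10 + 8 × 8 = 144 bytes when each set is counted by name, and that the XMM and YMM registers are not listed separately because they live inside the ZMM registers.

```python
# (group name, number of registers, width of each register in bits)
REGISTER_GROUPS = [
    ("general purpose", 16, 64),
    ("segment", 6, 16),
    ("x87 FPU", 8, 80),
    ("MMX", 8, 64),
    ("ZMM (contains XMM and YMM)", 32, 512),
    ("instruction pointer", 1, 64),
    ("RFLAGS", 1, 64),
    ("control", 5, 64),
    ("debug", 6, 64),
]


def total_bytes(groups):
    """Sum count * width over all groups, converted from bits to bytes."""
    return sum(count * bits // 8 for _, count, bits in groups)


if __name__ == "__main__":
    for name, count, bits in REGISTER_GROUPS:
        print(f"{name}: {count * bits // 8} bytes")
    print("total:", total_bytes(REGISTER_GROUPS), "bytes")
```

Even with the wide SIMD registers included, the whole set comes to only a few kilobytes, which is tiny next to the cache and main memory.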
Conclusion
Registers are the fastest memory inside a CPU, acting as the first point of data access for executing instructions. Unlike cache or RAM, registers are built directly into the execution core of the processor, ensuring near-instantaneous data retrieval. They operate at CPU clock speed and hold crucial program control values. With SIMD vector registers, modern CPUs can perform multiple operations with a single instruction, boosting performance. If you find it helpful, please write to us.