Considered the most significant architectural shift by Intel in a generation, the 12th generation Alder Lake CPUs have been designed to provide massive boosts in system performance and efficiency compared to their predecessors.
Intel worked together with Microsoft to design Alder Lake and optimize it for Windows 11. They made several critical changes to create a hybrid architecture for increased performance and efficiency. One of the main components of this hybrid architecture is the P-Core (Performance Core). But what is a P-Core? And how does it work?
What is an Intel P-Core?
An Intel Performance Core (P-Core) is a high-performance CPU core based on the Golden Cove microarchitecture. It is built to run heavy, latency-sensitive single-threaded workloads as fast as possible.
The P-Core is designed for speed, pushing latency as low as possible in single-threaded applications. This lets it handle the high-priority foreground tasks that the operating system scheduler assigns to it, such as gaming, streaming, content creation, and AI applications. E-Cores can also take on some of these foreground tasks when needed.
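As a rough illustration of how software can tell the two core types apart, the sketch below reads the CPU lists that recent Linux kernels expose for hybrid Intel processors. It assumes the files `/sys/devices/cpu_core/cpus` (P-Cores) and `/sys/devices/cpu_atom/cpus` (E-Cores) are present, which depends on the kernel version and hardware. The helper names `parse_cpu_list` and `read_core_group` are made up for this example.

```python
from pathlib import Path
from typing import List, Optional


def parse_cpu_list(text: str) -> List[int]:
    """Expand a kernel-style CPU list such as "0-3,8,10-11" into CPU ids."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or trailing comma
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_core_group(pmu_name: str, root: str = "/sys/devices") -> Optional[List[int]]:
    """Return the logical CPUs listed for a PMU directory, or None if absent.

    Assumption: on hybrid Intel systems, Linux lists P-Core CPUs under
    "cpu_core" and E-Core CPUs under "cpu_atom". On other systems the
    files do not exist and this function returns None.
    """
    path = Path(root) / pmu_name / "cpus"
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("P-Core logical CPUs:", read_core_group("cpu_core"))
    print("E-Core logical CPUs:", read_core_group("cpu_atom"))
```

On a machine without a hybrid CPU, or on another operating system, both calls simply return `None`.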
How do Intel P-Cores Work?
The Golden Cove P-Cores were designed to work alongside the efficient cores (E-Cores), which are based on the Gracemont microarchitecture. Together they form a hybrid architecture that aims for maximum performance and efficiency in both single-threaded and multitasking workloads, with the P-Cores supplying the single-threaded performance. In fact, a single P-Core is reported to deliver about 50% more single-threaded performance than an E-Core.
Because of this hybrid design, the P-Cores that handle foreground tasks work in tandem with the E-Cores that handle background tasks, which helps preserve battery life while keeping system performance high.
To make this division of work as efficient as possible, Intel developed Thread Director, a hardware-based monitoring technology built into the processor. It tracks how work is running on each core and gives runtime feedback to the operating system (OS). The OS scheduler then places tasks based on that information.
Intel Thread Director helps ensure that the P-Cores are handling the most critical tasks at any given time, which improves performance and reduces latency. When every P-Core is busy and a more critical task arrives, the least critical task running on a P-Core can be moved to an E-Core to make room for the higher-priority task.
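The reassignment rule described above can be sketched as a toy model. This is not Intel's or any operating system's actual scheduling algorithm. The `Task` and `HybridScheduler` names, the integer priority scale, and the unlimited E-Core queue are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    name: str
    priority: int  # higher means more critical (illustrative scale)


@dataclass
class HybridScheduler:
    """Toy model: a fixed number of P-Core slots plus an unbounded E-Core list."""
    p_core_count: int
    p_cores: List[Task] = field(default_factory=list)
    e_cores: List[Task] = field(default_factory=list)

    def submit(self, task: Task) -> Optional[Task]:
        """Place a task and return the task demoted to an E-Core, if any."""
        # A free P-Core slot is used first.
        if len(self.p_cores) < self.p_core_count:
            self.p_cores.append(task)
            return None

        # All P-Cores are busy: find the least critical task running there.
        least = min(self.p_cores, key=lambda t: t.priority)
        if task.priority > least.priority:
            # The new task is more critical, so the least critical one moves.
            self.p_cores.remove(least)
            self.e_cores.append(least)
            self.p_cores.append(task)
            return least

        # Otherwise the new task itself runs on an E-Core.
        self.e_cores.append(task)
        return None


if __name__ == "__main__":
    sched = HybridScheduler(p_core_count=2)
    sched.submit(Task("game", 9))
    sched.submit(Task("browser", 5))
    demoted = sched.submit(Task("video_call", 7))
    print("Moved to an E-Core:", demoted.name if demoted else None)
```

In this example the two P-Core slots are taken by `game` and `browser`. When `video_call` arrives with a higher priority than `browser`, `browser` is the task that moves to an E-Core.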
Intel P-Core Architecture
Golden Cove is part of a major architectural shift for Intel's x86 processors, and Intel made several modifications to the Performance Cores to improve their efficiency and performance.
Intel also adopted an approach similar to Arm's well-known big.LITTLE concept, which pairs “big” P-Cores responsible for peak performance with “little” E-Cores responsible for maximum efficiency.
In the Golden Cove architecture, each performance core has been made wider so that it can process more instructions per cycle (IPC). Intel increased the number of parallel decoders from the four in Rocket Lake's cores to six, so the core can now decode up to six instructions per cycle. Intel has also increased the number of micro-ops supplied per cycle from 6 to 8.
Moreover, the micro-op cache grew from 2.25K to 4K entries, which raises its hit rate and increases front-end bandwidth. Intel also added a smarter code prefetch mechanism and improved branch prediction accuracy.
The out-of-order engine tracks dependencies between micro-operations and executes them as soon as their inputs are ready, making use of cycles that would otherwise be wasted. Intel widened its allocation stage from 5 to 6 micro-ops per cycle and increased the number of execution ports from 10 to 12.
Those micro-operations are then sent to the improved execution units. The Integer Execution Units (IEU) gain an additional integer ALU for a total of five, while the Vector Execution Units (VEU) add a new Fast Adder (FADD) and FMA units that support the FP16 data type, improving the P-Core's power efficiency and latency.
Other notable improvements around the L1 cache include larger load and store buffers. The L2 cache also grew from 512 KB to 1.25 MB per P-Core for client products. Intel claims that all these changes to the Golden Cove P-Cores deliver significantly better performance while using less power than previous generations.
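To put the figures quoted in this section side by side, the short sketch below computes the relative increase for each one. The numbers come from the text, except the previous integer ALU count of four, which is inferred from "an additional integer ALU for a total of 5". The `CHANGES` table and function names are illustrative only. These are structural widths and sizes, not measured performance gains.

```python
from typing import Dict, Tuple

# (previous value, Golden Cove value), taken from the figures in the text.
CHANGES: Dict[str, Tuple[float, float]] = {
    "decoders": (4, 6),
    "micro-ops per cycle": (6, 8),
    "micro-op cache entries": (2.25 * 1024, 4 * 1024),
    "allocation width": (5, 6),
    "execution ports": (10, 12),
    "integer ALUs": (4, 5),  # previous count inferred from "additional ... total of 5"
    "L2 cache per P-Core (KB)": (512, 1.25 * 1024),
}


def relative_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


def summarize(changes: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Map each structure to its percentage increase, rounded to one decimal."""
    return {
        name: round(relative_increase(old, new), 1)
        for name, (old, new) in changes.items()
    }


if __name__ == "__main__":
    for name, pct in summarize(CHANGES).items():
        print(f"{name:28s} +{pct}%")
```

By this arithmetic the decode width grows by 50%, the allocation width and execution port count by 20% each, and the L2 cache per P-Core by 150%.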
Intel has designed and optimized the 12th generation Alder Lake processors to deliver strong performance and efficiency by using the P-Cores and E-Cores in tandem. The power consumption of these CPUs is still somewhat high, which is surprising given that a hybrid architecture should be more power-efficient. Nevertheless, the 12th gen Alder Lake CPUs have proven to be a clear improvement in performance over previous generations.