What is an Intel E-Core (Efficient Core)?

With its 12th-generation Alder Lake CPUs, Intel made one of its most significant architectural shifts in recent years, if not the most significant. The chips are designed to handle both single-threaded and multi-threaded workloads efficiently and with low latency.

To achieve this, Intel introduced a hybrid architecture that pairs Efficient Cores (E-Cores) with Performance Cores (P-Cores). But what is an E-Core? And how does it work?

What is an Intel E-Core?

An Intel Efficient Core (E-Core) is a power-saving CPU core based on the Gracemont microarchitecture that mainly handles lower-priority, multi-threaded workloads in a system. An E-Core is optimized for throughput per watt and per unit of die area, which lets Intel scale up the core count while keeping power consumption low.

Although this architectural upgrade has considerably increased the single-threaded performance of the E-Cores compared to previous generations, the operating system still prioritizes the Performance Cores (P-Cores) when assigning critical and heavy single-threaded workloads. That said, E-Cores are very capable of taking on higher-priority tasks if need be.

Examples of the lower-priority background tasks handled by the Efficient Cores include email synchronization, calendar notifications, and date and time updates.

How do the Intel E-Cores Work?

Intel Efficient Cores (E-Cores) are designed for efficiency and take on the lower-priority background tasks. They were developed to excel in area-efficient multi-threaded performance, which increases power efficiency, especially in mobile devices such as laptops.

However, the E-Cores have also been given a boost when handling single-threaded tasks. It is therefore not uncommon on Alder Lake to see the operating system scheduler, guided by Intel Thread Director, move a task from a P-Core to an E-Core. An E-Core is fast enough for this to work: a single E-Core delivers roughly half the single-threaded performance of a single P-Core while using far less die area and power.

In Alder Lake, the system depends on the P-Cores and E-Cores working seamlessly together. It is this cooperation that lets the system handle both foreground and background tasks efficiently. The cooperation relies on a hardware component known as Intel Thread Director, which monitors the behavior of the threads running on each core and relays that runtime information to the operating system.

The OS scheduler then places tasks based on the information Thread Director provides. Usually, high-priority tasks are assigned to the P-Cores, while the E-Cores handle low-priority tasks. This optimizes system performance and preserves battery power.
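The priority-based split described above can be sketched with a toy scheduler. This is only an illustration: the `Core` class, the `assign` function, the task names, and the two-level "high"/"low" priority are all hypothetical, and a real OS scheduler working with Thread Director uses much richer feedback than a simple priority label.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    name: str
    kind: str                       # "P" for Performance, "E" for Efficient
    tasks: list = field(default_factory=list)


def assign(tasks, cores):
    """Place each (task_name, priority) pair on the least-loaded core of the preferred kind."""
    placement = {}
    for task_name, priority in tasks:
        preferred = "P" if priority == "high" else "E"
        # Fall back to any core if no core of the preferred kind exists.
        candidates = [c for c in cores if c.kind == preferred] or cores
        target = min(candidates, key=lambda c: len(c.tasks))
        target.tasks.append(task_name)
        placement[task_name] = target.name
    return placement


# Hypothetical machine with two P-Cores and two E-Cores.
cores = [Core("P0", "P"), Core("P1", "P"), Core("E0", "E"), Core("E1", "E")]
tasks = [
    ("game_render", "high"),
    ("email_sync", "low"),
    ("video_export", "high"),
    ("calendar_notify", "low"),
]
placement = assign(tasks, cores)
print(placement)
# {'game_render': 'P0', 'email_sync': 'E0', 'video_export': 'P1', 'calendar_notify': 'E1'}
```

The fallback to any available core mirrors the point made earlier: E-Cores can take on higher-priority work when needed, and the preference for P-Cores is a default rather than a hard rule.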

Intel E-Core Architecture

Intel unveiled a set of charts that offer some insight into what the E-Core x86 architecture looks like. According to the charts, a cluster of four E-Cores together with its shared L2 cache of up to 4MB occupies roughly the same die area as a single P-Core.
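Combining this area figure with the earlier estimate that one E-Core delivers about half the single-threaded performance of a P-Core gives a rough sense of why E-Cores are attractive for multi-threaded work. The calculation below is a back-of-the-envelope sketch only: it assumes perfect scaling across cores and ignores Hyper-Threading on the P-Core, shared-cache contention, and clock speed differences. The constant names are made up for illustration.

```python
# Figures taken from the article, treated as idealized assumptions.
E_CORES_PER_P_CORE_AREA = 4      # four E-Cores fit in the area of one P-Core
E_CORE_RELATIVE_PERF = 0.5       # one E-Core ~ half of one P-Core, single-threaded
P_CORE_RELATIVE_PERF = 1.0       # baseline

# Multi-threaded throughput available from the same die area.
cluster_throughput = E_CORES_PER_P_CORE_AREA * E_CORE_RELATIVE_PERF
advantage = cluster_throughput / P_CORE_RELATIVE_PERF

print(f"E-Core cluster throughput: {cluster_throughput:.1f}x one P-Core")
print(f"Throughput per area advantage: {advantage:.1f}x")
# E-Core cluster throughput: 2.0x one P-Core
# Throughput per area advantage: 2.0x
```

Under these assumptions, the same silicon area yields about twice the multi-threaded throughput when filled with E-Cores, while a single P-Core remains about twice as fast for any one thread.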

While the Gracemont Efficient Cores (E-Cores) retain the dual three-wide out-of-order decoders of the Tremont architecture, for up to six instructions decoded per cycle, the L1 instruction cache has grown to 64KB from Tremont's 32KB. Gracemont also adds an on-demand instruction length decoder. The first time a piece of code is seen, this decoder works out where each instruction begins and ends, and that information is stored alongside the instruction cache. When the same code runs again, the core reuses the stored information instead of repeating the length decode, which saves time (lowering latency) and power.
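The idea of doing the expensive decode work once and reusing the result can be illustrated with a small memoization sketch. This is a toy model, not a description of the actual hardware: the `PredecodeCache` class, the `toy_length` function, and the one-byte length encoding are hypothetical, and real x86 length decoding is far more involved.

```python
class PredecodeCache:
    """Toy model: remember instruction boundaries per code address."""

    def __init__(self):
        self.boundaries = {}       # address -> tuple of instruction lengths
        self.length_decodes = 0    # how many times the slow path ran

    def fetch(self, address, code_bytes, length_of):
        if address not in self.boundaries:
            # Slow path: walk the bytes to find where each instruction ends.
            self.length_decodes += 1
            lengths, offset = [], 0
            while offset < len(code_bytes):
                n = length_of(code_bytes, offset)
                lengths.append(n)
                offset += n
            self.boundaries[address] = tuple(lengths)
        # Fast path: boundaries are already known for this address.
        return self.boundaries[address]


def toy_length(code_bytes, offset):
    # Hypothetical encoding: the first byte of an instruction states its total length.
    return code_bytes[offset]


cache = PredecodeCache()
code = bytes([1, 3, 0, 0, 2, 0])   # three instructions of length 1, 3 and 2
first = cache.fetch(0x1000, code, toy_length)
second = cache.fetch(0x1000, code, toy_length)
print(first, second, cache.length_decodes)
# (1, 3, 2) (1, 3, 2) 1
```

The second fetch returns the same boundaries without running the length decode again, which is the source of the time and power savings on repeated code.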

Another upgrade in the Gracemont E-Core architecture is a larger out-of-order window: the reorder buffer grows from 208 to 256 entries, which helps the core find more independent instructions to execute in parallel. Gracemont also has 17 execution ports for more parallel execution, although it can allocate only five instructions per cycle and retire eight per cycle. There are six types of data execution.
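These widths suggest a simple rule of thumb: a pipeline cannot sustain more instructions per cycle than its narrowest stage allows. The sketch below applies that rule to the figures quoted in this article (six at decode, five at allocation, eight at retirement). It is a simplified model with hypothetical function names, and it ignores stalls, cache misses, and branch mispredictions.

```python
# Per-cycle widths quoted in the article for a Gracemont E-Core.
STAGE_WIDTHS = {"decode": 6, "allocate": 5, "retire": 8}


def bottleneck(stage_widths):
    """Return the narrowest stage and its width."""
    stage = min(stage_widths, key=stage_widths.get)
    return stage, stage_widths[stage]


def min_cycles(instructions, stage_widths):
    """Lower bound on cycles needed to push the instructions through every stage."""
    _, width = bottleneck(stage_widths)
    return -(-instructions // width)   # ceiling division


stage, width = bottleneck(STAGE_WIDTHS)
print(stage, width)                    # allocate 5
print(min_cycles(100, STAGE_WIDTHS))   # 20
```

In this model the five-wide allocation stage caps sustained throughput, while the wider retirement stage lets the core drain completed instructions quickly after a stall.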

Intel also published graphics detailing both the latency and throughput performance increase of an E-Core compared to the older Skylake core.

Final Thoughts

The Gracemont E-Cores were optimized to deliver maximum efficiency while mainly handling lower-priority, multi-threaded background tasks, though they were also given a boost in single-threaded performance. It is still up to Intel Thread Director and the operating system scheduler to assign the right tasks to the right cores at the right moment. So far, these Alder Lake CPUs have delivered a significant improvement in system performance, though their power consumption is also quite high.
