For many users, it felt like Microsoft had taken a step back when Vista was released. Only recently have engineers begun discussing Vista's problems with multiple cores. As virtualization took off, Vista was suddenly tasked with handling more than 64 cores.

Server admins have been dealing with this problem for some time. However, as quad-core processors made their way into more and more desktop and laptop computers, managing the extra cores demanded more overhead, slowing down system processes that were never designed for multi-core operation.

Microsoft's architects had not anticipated that the operating system would be managing so many cores, so design trade-offs were made for Vista. Efficiencies that could have been achieved through more complex methods were traded for simplicity.

Windows 7: Scalability We Can Believe In?

One of the things Microsoft wanted from Windows 7 was a small, fast, battery-efficient operating system. So the company made a huge effort, from start to finish, to make sure that Windows 7 was fast and nimble even though it provided more features. Windows 7 is the first version that has a smaller memory footprint than the previous release of Windows.

To overcome the Vista burden, Windows 7 had to present scalability that everyday users could see and appreciate.

Picking the Locks

In older versions of the Windows kernel, the component responsible for managing task scheduling was the dispatcher, and it was protected by a single global lock. That lock covered things like thread priorities, queues, and any object that might be required to wait on an event, I/O completion, timer, or asynchronous procedure call.

The new Windows 7 and Windows Server 2008 R2 kernel has no dispatcher lock. It is completely gone. In its place is fine-grained locking, which uses 11 types of locks for the new task scheduler, along with rules for how the locks may be used so that deadlock is avoided.
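To make the idea concrete, here is a minimal Python sketch of fine-grained locking with an ordering rule. It is only an illustration, not the Windows kernel's actual design: the `ReadyQueue` class, the `migrate` function, and the "lowest LP number first" rule are all assumptions made up for this example. Each queue carries its own lock instead of sharing one global lock, and a move between two queues takes only the two locks involved, always in the same order.

```python
import threading


class ReadyQueue:
    """Hypothetical per-processor queue guarded by its own lock."""

    def __init__(self, lp_id, threads=()):
        self.lp_id = lp_id
        self.lock = threading.Lock()
        self.threads = list(threads)


def migrate(src, dst, name):
    """Move one entry between queues, locking only those two queues.

    Assumed ordering rule: always take the lock of the lower lp_id first,
    so two opposite migrations can never wait on each other forever.
    """
    if src is dst:
        return
    first, second = sorted((src, dst), key=lambda q: q.lp_id)
    with first.lock, second.lock:
        src.threads.remove(name)
        dst.threads.append(name)


def run_demo(n=50):
    """Run two workers that migrate entries in opposite directions."""
    q0 = ReadyQueue(0, [f"a{i}" for i in range(n)])
    q1 = ReadyQueue(1, [f"b{i}" for i in range(n)])

    def move_all(src, dst, names):
        for name in names:
            migrate(src, dst, name)

    workers = [
        threading.Thread(target=move_all, args=(q0, q1, list(q0.threads))),
        threading.Thread(target=move_all, args=(q1, q0, list(q1.threads))),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return q0, q1
```

Without the ordering rule, one worker could hold the lock on queue 0 while waiting for queue 1, and the other could hold queue 1 while waiting for queue 0. A fixed acquisition order is one common way to rule that out.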

Speeding Processes Up by Putting Processors to Sleep

What Microsoft has found is that multi-core processors work better when their logical processors (LPs) can be put to sleep. Keeping processors busy reduces latency, and active threads can be shipped to another LP. Keeping LPs awake puts more of a load on the scheduler, which in turn increases the load on core 0, where the scheduling activity used to take place.
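As a rough illustration of why shipping threads to another LP lets some processors sleep, the toy Python sketch below packs thread loads onto as few LPs as it can. It is not how Windows decides which processors to put to sleep: the `consolidate` function, the percentage loads, and the first-fit-decreasing packing are assumptions chosen only to show the idea.

```python
def consolidate(loads, capacity=100):
    """Pack per-thread loads (percent of one LP) onto as few LPs as possible.

    Uses first-fit decreasing: take the heaviest load first and place it on
    the first LP that still has room. Returns one list of loads per busy LP.
    """
    lps = []
    for load in sorted(loads, reverse=True):
        for lp in lps:
            if sum(lp) + load <= capacity:
                lp.append(load)
                break
        else:
            # No busy LP has room, so wake up one more.
            lps.append([load])
    return lps


busy = consolidate([50, 20, 30, 40, 10])
total_lps = 4
print(busy)                   # [[50, 40, 10], [30, 20]]
print(total_lps - len(busy))  # 2 LPs left idle and free to sleep
```

In this made-up case, five threads that might otherwise be spread across four LPs fit on two, leaving the other two with nothing to run.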