8-core Mac Pro vs 4-core Mac Pro: compute-intensive tasks
See the discussion below on why the 8-core Mac Pro is often no
faster than the 4-core Mac Pro (or even slightly slower).
Rob at barefeats.com ran diglloydTools test-compute-speed,
a CPU-intensive benchmark that assesses the scalability of multiple processor cores. The results show that the 8-core
Mac Pro can be twice as fast as the 4-core Mac Pro with compute-intensive tasks:
4-core Mac Pro @ 3.0GHz:

Test size = 128MB...
Chunk Size K    MB/sec
1               138.8
2               144.4
4               147.5
8               149.1
16              149.9
32              149.9
64              149.1
128             149.1
256             149.2
512             149.2
1024            149.2
Best chunk size: 16K
Testing using 4 threads...
thread 0: 147.6...150.2
thread 1: 148.5...150.1
thread 2: 148.1...150.2
thread 3: 147.6...150.2
Aggregate rate: 601MB/sec

8-core Mac Pro @ 3.0GHz:

Test size = 128MB...
Chunk Size K    MB/sec
1               138.9
2               144.9
4               148.1
8               149.7
16              150.5
32              150.3
64              148.9
128             149.0
256             149.1
512             149.0
1024            149.1
Best chunk size: 16K
Testing using 8 threads...
thread 0: 150.2...150.5
thread 1: 150.3...150.5
thread 2: 150.1...150.6
thread 3: 150.3...150.6
thread 4: 150.3...150.4
thread 5: 150.1...150.4
thread 6: 150.1...150.4
thread 7: 150.1...150.4
Aggregate rate: 1204MB/sec
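To make the scaling concrete, the two aggregate rates in the output above can be turned into a speedup and a per-thread rate. The Python sketch below uses only those two reported figures; the function names are illustrative and are not part of diglloydTools.

```python
# Aggregate rates copied from the benchmark output above (MB/sec).
RATE_4_CORE = 601.0
RATE_8_CORE = 1204.0


def speedup(baseline_rate, new_rate):
    """How many times faster the new configuration is than the baseline."""
    return new_rate / baseline_rate


def per_thread_rate(aggregate_rate, threads):
    """Average throughput contributed by each thread."""
    return aggregate_rate / threads


if __name__ == "__main__":
    print(f"8-core vs 4-core speedup: {speedup(RATE_4_CORE, RATE_8_CORE):.2f}x")
    print(f"4-core per-thread rate: {per_thread_rate(RATE_4_CORE, 4):.1f} MB/sec")
    print(f"8-core per-thread rate: {per_thread_rate(RATE_8_CORE, 8):.1f} MB/sec")
```

Each thread sustains roughly 150 MB/sec on both machines, so for this particular benchmark doubling the cores doubles the aggregate rate.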
If you’re a scientist performing intensive calculations which don’t involve much memory or disk access, you’re
likely to see a huge benefit from the 8-core Mac Pro. But when working with images or video or sound, large amounts
of memory are accessed, and thus performance gains are likely to be modest, or even degraded in some cases.
With the 8-core Mac Pro priced at a 20% premium over the 3.0 GHz quad-core Mac Pro, and a 48% premium over the 2.66 GHz quad-core
Mac Pro, most users will still find the 2.66 GHz model to be the “sweet spot”.
8-core Mac Pro vs 4-core Mac Pro
I predicted that an 8-core Mac Pro would be roughly
equivalent to a 6-core Mac Pro as compared with the 4-core model. First test
results at barefeats.com confirm this prediction, showing that the 8-core Mac Pro doesn’t scale in performance as
might be assumed by a layman, but can be about 50% faster with certain tasks.
Many of the tests show no improvement in performance at all, even though all
8 cores are used. Why? As noted in the April 5 blog entry, unless the task is compute-intensive,
memory bandwidth will limit the performance to about that of a 4-core Mac Pro, because all 8 cores will be trying to
access memory at the same time.
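A minimal sketch of this ceiling effect, assuming a deliberately simplified model in which every core demands a fixed amount of memory traffic and all cores share a single memory bus. The bandwidth and per-core figures are hypothetical placeholders, not measurements of any Mac Pro; only the 150 MB/sec compute-bound rate echoes the benchmark above.

```python
def aggregate_throughput(cores, per_core_rate, shared_bandwidth):
    """Throughput grows with core count until the shared bus saturates,
    after which adding cores gains nothing (simplified ceiling model)."""
    return min(cores * per_core_rate, shared_bandwidth)


if __name__ == "__main__":
    SHARED_BANDWIDTH = 4000.0  # MB/sec, hypothetical bus limit

    # Hypothetical workloads: (label, MB/sec of memory traffic per core).
    workloads = [
        ("compute-bound", 150.0),   # little memory traffic per core
        ("memory-bound", 1000.0),   # four cores already saturate the bus
    ]
    for label, per_core in workloads:
        four = aggregate_throughput(4, per_core, SHARED_BANDWIDTH)
        eight = aggregate_throughput(8, per_core, SHARED_BANDWIDTH)
        print(f"{label}: 4 cores = {four:.0f}, 8 cores = {eight:.0f}, "
              f"gain = {eight / four:.2f}x")
```

Under these assumptions the compute-bound workload doubles its throughput on 8 cores, while the memory-bound one stays at the bus limit, which is the pattern described above.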
It’s not just memory bandwidth either—access to the hard disk and system services in effect
require tasks to queue up single-file to wait for the needed resource (memory access, disk, exclusive lock, etc). This
is called contention. Like a rush-hour freeway, doubling the number of cars can cut speed by more than
half as cars (tasks) must jostle with one another for the same lane space (system resource).
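The single-file queueing can be approximated with an Amdahl's-law-style model: the portion of the work that needs an exclusive resource runs one task at a time, while the rest divides evenly across cores. This is a sketch under that assumption, not a measurement; the serialized fractions tried below are hypothetical.

```python
def run_time(cores, parallel_work, serialized_work):
    """Serialized work queues single-file; parallel work splits across cores."""
    return serialized_work + parallel_work / cores


def speedup_8_over_4(serialized_fraction):
    """Speedup of 8 cores relative to 4 for a job of total size 1.0."""
    parallel = 1.0 - serialized_fraction
    time_4 = run_time(4, parallel, serialized_fraction)
    time_8 = run_time(8, parallel, serialized_fraction)
    return time_4 / time_8


if __name__ == "__main__":
    # Hypothetical fractions of the job that must wait for a shared resource.
    for fraction in (0.0, 1.0 / 9.0, 0.25, 0.5):
        print(f"serialized {fraction:.0%}: "
              f"8-core speedup over 4-core = {speedup_8_over_4(fraction):.2f}x")
```

In this model a serialized share of about one ninth of the job is enough to reduce the 8-core advantage to 1.5x, the "6-core equivalent" figure. The model can only flatten the gain; it cannot reproduce outright slowdowns, which would need an additional overhead term that grows with the number of cores.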
Contention and overhead: Operating systems (e.g., Mac OS X) have to juggle
the outstanding tasks/applications/threads among the available processor cores as well as the operating system itself.
As the number of processor cores increases, this overhead also increases.
In the real world, multi-core systems do not scale linearly and the performance
gains of 8 cores over 4 cores are often modest, or even degraded.
Poor scheduling of tasks across processor cores is also responsible for performance degradation.
In particular, Mac OS X is fond of switching a task (thread) between cores for no apparent reason. I’ve observed
this myself on many occasions: with several idle cores, Mac OS X switches a task from one core to another at apparently
arbitrary intervals. Each time this happens, the large on-chip processor cache of the “old” core must be flushed to
main memory, which is very slow (relatively speaking). As the task resumes running, the on-chip cache must be populated
from main memory, again a slow operation in relative terms.
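A back-of-the-envelope estimate of what one such switch might cost, assuming the worst case described above in which the whole cache is written back and then refilled. The cache size, memory bandwidth, and migration rates are hypothetical values chosen for illustration, not Mac Pro specifications.

```python
def migration_cost_seconds(cache_bytes, bandwidth_bytes_per_sec):
    """Worst-case cost of one core switch: flush the old core's cache to
    main memory, then repopulate the new core's cache from main memory."""
    flush = cache_bytes / bandwidth_bytes_per_sec
    refill = cache_bytes / bandwidth_bytes_per_sec
    return flush + refill


def time_lost_fraction(migrations_per_sec, cost_seconds):
    """Fraction of each second of run time spent on migration overhead."""
    return migrations_per_sec * cost_seconds


if __name__ == "__main__":
    CACHE_BYTES = 4_000_000        # hypothetical on-chip cache size
    BANDWIDTH = 4_000_000_000      # hypothetical memory bandwidth, bytes/sec

    cost = migration_cost_seconds(CACHE_BYTES, BANDWIDTH)
    print(f"cost per migration: {cost * 1000:.1f} ms")
    for rate in (1, 10, 100):      # hypothetical migrations per second
        lost = time_lost_fraction(rate, cost)
        print(f"{rate} migrations/sec: about {lost:.1%} of run time lost")
```

With these placeholder numbers an occasional switch is nearly free, but frequent switching adds up quickly, and the cost grows further when other threads are competing for the same memory bus.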
The core-swapping behavior can be seen below; two runs of diglloydTools run-stress-test were
done. The first run shows that the single-threaded portion of the test ran on the 4th core for its full duration. The
second run shows that the single thread ran on the 1st core for about half its run, then was swapped to the 2nd core—in
spite of the 3 other cores being idle! This had little effect under the circumstances, but can have a large effect if
there is contention for the CPU by multiple threads.

Mac OS X core swapping
It seems clear that Mac OS X 10.4 has not been adequately optimized for even 4 cores; perhaps
Mac OS X 10.5 will improve matters.