
Accelerate Your Mac! - the source for performance news and reviews

XLR8 G4/350 Upgrade Tests: 1MB vs. 2MB Cache Sizes
By Mike
Published: 12/17/99
Does the extra 1MB of Backside Cache provide any benefit?

The question above has been one of the more common G4 inquiries I've had since it was announced that the G4 supported up to 2MB of backside cache. The previous G3 CPU design supported a maximum of 1MB, and many Mac owners wondered what the larger cache would offer in increased performance. As I'll show below, larger caches show little benefit for most applications, with a few exceptions. In some cases the default 'write-through' cache mode used for these tests may have leveled the playing field a bit. More on that later.

Although there are no definite plans for any retail 2MB cache G4 CPU upgrades, XLR8 provided a 'Technology Demo' unit for testing and comparison to the common 1MB backside cache model. PowerLogix had mentioned a 2MB G4 model early on, but I was later told they were not planning any retail models due to a lack of any real benefit for the added cost.

XLR8 provided a 2MB sample to allow me to publicly show the results of independent tests to see what (if any) benefit the larger cache size would provide. I thank them for allowing me to put this issue to rest in all our minds.

First let me define the test system:

  • Apple PowerMac 9600/350
  • 320MB interleaved RAM
  • OEM Rage128 Graphics card
  • Initio Miles Ultra2 PCI SCSI card
  • Seagate Cheetah U2 SCSI 4.5 GB disk (60% full)
  • 4GB OEM SCSI drive (onboard SCSI driven) not used for these tests
  • Dual 3dfx 12MB Voodoo2 gaming cards in SLI mode (for Unreal tests) using 3dfx Beta 4 drivers.
  • OEM SCSI ZIP drive
  • OEM 24X SCSI CDROM drive
  • OS 9 (w/Altivec extensions enabled)
  • All OS 9 USB extensions were disabled (no USB in the 9600)
  • Virtual Memory and AppleTalk OFF
  • Default OS 9 disk cache setting
  • Quicktime 4.03/QD3D 1.6

I'll be comparing the following configurations:

  • The base 9600/350 system with original 350MHz 604e with 1MB L2 inline cache
  • The same system with an XLR8 G4/350/1MB G4 upgrade
  • The same system with an XLR8 G4/350/2MB G4 upgrade (Technology Demo Unit)

The 9600/350 has ROMs that would allow running Speculative Processing enabled on the G4 (I personally ran the XLR8 G4/350/1MB with it enabled for several days with no ill effects). However all tests here were run at the XLR8 software defaults which have it disabled. Since most older Macs don't have clean ROMs and will have to run with Speculative Processing disabled I decided to use the default settings. Otherwise I'd be reporting results that would not be possible (or safe) with most of the installed base of older Macs out there.

Although it's beyond the scope and focus of this article, remember that with most G4 upgrades on the market (all but the Newer Technology models), if you have a pre-G3 Mac with a CPU card slot you need to install the vendor's G4 software before installing the upgrade card, to ensure that speculative access is disabled when first booting from the G4 upgrade. The 8600/250, 8600/300, 9600/300 and 9600/350 Macs are said to have ROMs that allow speculative access with G3/G4 upgrades, but it's still a procedure you should follow.

Application and game tests include Photoshop 5.5 with latest Altivec extensions, Lightwave 3D 5.6D, After Effects 3.1, Premiere 4.2.1, Infini-D 4.01 and Unreal 224b7.

Benchmarks used were MacBench 5.0 and two memory bandwidth tests.

I'll start with real world application tests since they are the most important. Benchmark results follow farther down the page.

Photoshop 5.5 Tests:
I ran a complete PS5Bench (21 filter test) series using Photoshop v5.5 with the Altivec extensions (active for G4 CPUs only). PSBench settings are 1024x768, millions colors, VM off, Interpolation set to bicubic (better) and Photoshop should be allocated enough RAM to avoid any swap file activity from the 10MB test image filter actions. (I allocated 140MB to Photoshop 5.5 for this review.)

As noted in my more recent PSBench tests, to eliminate scratch disk activity seen even with a 10MB image file and 140MB of RAM allocated to Photoshop 5.5, the 'History' settings were changed from the default 20 to 1, and I unchecked the 'automatically create snapshot' option. This dramatically lowered many filter times and removed all signs of disk activity during the filter tests (each filter is run 3 times).

Altivec Extensions Note: All G4 CPU tests had the 4 OS 9 Altivec extensions active, as well as Adobe's current release of the Altivec Core and Lighting Effects Filter, both recently updated and publicly released.

The '2MB Cache Gain' column notes the performance benefit from the 2MB vs 1MB cache. The maximum gain seen was about 13%; often there was none. Tests with image sizes other than the 10MB test sample may show different results.

The 'Altivec Gain' column shows how much faster a G4/350/1MB upgrade was than the stock 9600/350 604ex CPU card.

The total time for the 21 filter tests shows much smaller gains than some specific filters since many filters are not Altivec enhanced, including those that took by far the longest times to complete.

| Filter | 2MB Cache Gain (benefit of extra 1MB cache size) | XLR8 G4/350/1MB (in 9600/350) | XLR8 G4/350/2MB (in 9600/350) | 604ex 350MHz, 1MB Cache | Altivec Gain (% gain of G4/1MB vs 604ex) |
|---|---|---|---|---|---|
| Rotate 90° CW | None | 0.6 | 0.6 | 2.1 | 250% |
| Rotate 9° CW | 9% | 3.8 | 3.5 | 4.6 | 21% |
| Rotate .9° CW | 9% | 3.5 | 3.2 | 4.2 | 20% |
| 1 pix Gaus. Blur | 8% | 1.3 | 1.2 | 2.4 | 85% |
| 3.7 pix Gaus. Blur | 3% | 3.3 | 3.2 | 5.6 | 70% |
| 85 pix Gaus. Blur | None | 3.7 | 3.7 | 7.6 | 105% |
| Unsharp Mask 50%/1pix/0 level | 13% | 1.7 | 1.5 | 3 | 76% |
| Unsharp Mask 50%/3.7pix/0 level | None | 3.7 | 3.7 | 6.3 | 70% |
| Unsharp Mask 50%/10pix/5 level | None | 3.6 | 3.6 | 6.5 | 81% |
| Despeckle | 7% | 1.5 | 1.4 | 3.7 | 147% |
| RGB-CMYK | None | 6 | 6 | 7 | 17% |
| Reduce 60% | 13% | 0.9 | 0.8 | 1.9 | 111% |
| Lens Flare | 7% | 5.8 | 5.4 | 7.6 | 31% |
| Color Halftone | 8% | 5.3 | 4.9 | 7.3 | 38% |
| NTSC Colors | 2% | 5.2 | 5.1 | 6.7 | 29% |
| Accented Edges | 2% | 13.4 | 13.2 | 15.5 | 16% |
| Pointillize | 2% | 19.1 | 18.8 | 22.2 | 16% |
| Water Colors | 2% | 27.8 | 27.3 | 32.3 | 16% |
| Polar Coordinates | 6% | 5.2 | 4.9 | 9.2 | 77% |
| Radial Blur | 5% | 42.9 | 41 | 48.3 | 13% |
| Lighting Effects | 3% | 3 | 2.9 | 12.4 | 313% |
| Total Time | (3%) | 161.3 | 155.9 | 216.4 | (34%) |

The benefit in Photoshop 5.5 from the 1MB larger cache was very little (a maximum of 13%) with the 10MB test image size. Overall it improved the 21 filter test times only 3%. I actually expected more, but again perhaps the 'write-through' cache mode was a factor, as it basically negates any L2 cache write buffering regardless of cache size. If time permits, I'd like to rerun this test series with Write-Back mode enabled to see what the results are.

The good news is that even the 1MB G4 upgrade showed some impressive gains in Altivec enhanced filter performance over the stock 604ex CPU card of the same speed. Improvements of up to 313% were seen, although the total time improvement was only 34% because the most time consuming filters in the test were not Altivec enabled.
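As a rough check on the percentages quoted in this section, the sketch below recomputes them from times listed in the filter table. The formula (slower time divided by faster time, minus one) is an assumption inferred from the published numbers, not something stated in the review, and the function name is illustrative only.

```python
def percent_gain(slower_time, faster_time):
    """Return how much faster the second result is, as a percentage.

    Assumed formula: (slower / faster - 1) * 100.
    """
    return (slower_time / faster_time - 1.0) * 100.0


# Total times for the 21-filter run, copied from the table.
total_604ex = 216.4
total_g4_1mb = 161.3
total_g4_2mb = 155.9

cache_gain = percent_gain(total_g4_1mb, total_g4_2mb)   # 2MB vs 1MB cache
altivec_gain = percent_gain(total_604ex, total_g4_1mb)  # G4/1MB vs 604ex
lighting_gain = percent_gain(12.4, 3.0)                 # Lighting Effects row

print(f"2MB cache gain (total): {cache_gain:.0f}%")
print(f"Altivec gain (total): {altivec_gain:.0f}%")
print(f"Altivec gain (Lighting Effects): {lighting_gain:.0f}%")
```

Under this assumed formula the totals round to the 3% and 34% figures in the table, and the Lighting Effects row rounds to 313%.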

When looking at the results above, consider which filters you use most often when determining how much a G4/Altivec upgrade will benefit you. Overall a 350MHz G4 is significantly faster in many filter operations than the same speed 604e CPU, despite the fact the G4 was running in write-through cache mode and the Mac was a 50MHz bus speed model.

The overall time is of less importance if the series above includes many filters you rarely use. Weigh the gains from those filters that you use most often when making an upgrade decision.

Virtual PC 3.0 Tests
The following chart shows the results of tests with Connectix's Virtual PC 3.0 using Norton Utilities v4.0's System Info benchmarks. Virtual PC was allocated 128MB of RAM.

VPC 3.0 scores - Norton SI

The extra 1MB of backside cache did increase the system performance according to Norton's benchmark. However it's still far below even a 166MHz Pentium system.

VPC 3 and Norton's disk scores

The gains seen here puzzle me enough that I'm going to retest. With Write-Through mode I'm surprised the larger cache made a difference on disk writes in VPC.

Lightwave 3D 5.6D Tests
The following chart shows the time to render a 640x480 raytracing (one frame) in Lightwave 3D v5.6D.

The extra 1MB of backside cache, at least in the default 'write-through' mode, didn't make any significant improvement (a few percent). However a same-speed G4 showed impressive gains over the stock 604ex CPU.

Infini-D 4.01 Tests

Infini-D 4.01 does not use Altivec extensions, but is a common application I have used for comparisons of CPU/FPU performance. I used the same 'Chapter 7 completed' tutorial scene file from my past reviews. Rendering quality was set to Ray Trace, medium anti-aliasing, shadows on, patch detail low. I didn't change the default QT movie output file options.

The graph below shows times to complete the 150 frame movie (10.6MB) rendering with the stock 9600/350 and the two G4 upgrades.

Again the extra 1MB of cache didn't provide any significant gains (less than 5%).

Adobe Premiere 4.2.1 Tests
The following graph shows the time to produce a Quicktime movie from the "Sample Project" file (duration set to the full length of the project file). Output file settings were: Video codec (max quality), 320x240, 15 FPS, keyframe every 5 frames, 22KHz/16-Bit stereo audio.

Note: Unlike the other applications tests here that have their own timing function (eliminating human error), this test required a stopwatch to record times. There could be a 1/2 second or so reaction time variation for starting and stopping the stopwatch between runs.
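To get a feel for how much a half-second reaction error matters, here is a minimal sketch. The 60 second run time is purely hypothetical (it is not a measured result from this review), and the function name is illustrative.

```python
def stopwatch_error_percent(run_seconds, reaction_error_seconds=0.5):
    """Worst-case timing error of a single hand-timed run, as a percentage."""
    return reaction_error_seconds / run_seconds * 100.0


hypothetical_run = 60.0  # seconds; NOT a measured result from this review
error = stopwatch_error_percent(hypothetical_run)
print(f"{error:.2f}% error on a {hypothetical_run:.0f} second run")
```

The longer the timed run, the less a fixed reaction-time error matters, so hand timing is most trustworthy on renders that take a minute or more.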

The extra 1MB of cache provided slightly better than 10% increase in performance in this test.

After Effects 3.1 Tests
This graph shows the time to render a special effects movie (appx. 8MB file size) in After Effects 3.1. Resolution was set to 1024x768, thousands colors as was common on all but the Photoshop 5.5 tests.

The extra 1MB of backside cache only saved 7 seconds, virtually no gain in this test.

Unreal 3D Game Performance Tests:

Unreal v224b7 (latest) with the high quality texture setting was used for this test. To prevent the fill rate limited OEM Rage128 card from being a bottleneck, I installed two 3dfx Voodoo2 cards running in SLI (Scan Line Interleaving) mode, and Unreal was set to use Glide rendering mode. (See my Mac Game/Video Card Framerate entry page to download my high quality game settings 3dfx Unreal.ini file that was used in the tests.)

The extra 1MB of cache added about 15% in performance for this test. The 1MB cache G4/350 upgrade provided about 42% higher performance than the stock 350MHz 604ex CPU in this test.

Applications Test Results Summary:

Other than Virtual PC and Unreal, most applications showed little gain from the extra 1MB of backside cache. Given what I'm told the larger cache would cost, it's unlikely to be worth it for most users.

Benchmark Tests

Benchmark tests were run with MacBench 5.0 as it is the accepted Mac standard. I've also included results with two memory bandwidth benchmarks which show somewhat disappointing results.

Remember the most important results are the real world applications performance above. Benchmarks have their place, but actual applications performance is what really matters.

In MacBench 5.0, a score of 1000 is the baseline, based on the performance of an Apple Beige G3/300 running millions colors at 1152x870, so consider this when evaluating any scores at lower resolutions and color depths.
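For readers unfamiliar with baseline-normalized scores, the sketch below shows the general idea. It assumes a score scales linearly with measured speed relative to the baseline machine, which is how normalized benchmarks commonly work but is not something this review documents for MacBench. The rates used are hypothetical and the function name is illustrative.

```python
def normalized_score(measured_rate, baseline_rate, baseline_score=1000):
    """Scale a raw result so that the baseline machine scores baseline_score.

    Assumes the score is linear in measured speed (an assumption, see lead-in).
    """
    return baseline_score * measured_rate / baseline_rate


# Hypothetical raw rates in arbitrary work units per second.
baseline_rate = 100.0  # stands in for the baseline machine
test_rate = 150.0      # stands in for a machine 1.5x as fast

score = normalized_score(test_rate, baseline_rate)
print(f"Normalized score: {score:.0f}")
```

Read this way, a score of 1500 would mean roughly 1.5 times the baseline machine's speed on that test at the baseline's settings.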

The base system ran OS 9, 320MB of Interleaved RAM, VM Off, and OpenGL 1.1.2 with its ATI driver update. All G4 tests had the 4 OS 9 Altivec extensions active and Photoshop 5.5 used Adobe's latest Altivec core extension and Lighting Effects Plugin.

MacBench 5.0 Results


Notes: Some explanation of the MacBench graph and test components.

  • CPU/FPU Scores: MacBench CPU scores for the G4 upgrades were lower than I expected and less than a 350MHz G3. I suspect the combination of 'write-through' cache mode and disabled speculative access is partly responsible, although every other G4 I've tested scores a bit less than the same speed G3 (noted in my past G4 reviews). These G4 upgrades however seemed to have lower than normal FPU scores, again perhaps due to the 'write-through' cache mode. MacBench does not use Altivec extensions, so much of the power of the G4 is untapped.

  • Disk Scores: The 9600/350 ran an ATTO PCI Ultra2 SCSI controller card driving a Seagate Cheetah U2 SCSI 4.5GB drive. Apple Drive Setup drivers were used from the OS 9 CD. The Cheetah disk was over 60% full and not optimized.

  • Graphics Scores: The 9600/350 had an OEM Rage128 rev 1 graphics card from a rev 1 B&W G3 which has a slower clock speed than the retail Rage128 or rev 2 B&W G3 versions. ATI's Universal drivers version 4.01 with the OpenGL 1.1.2 update were used for all tests. See my video cards page for reviews of faster alternative cards that would have provided higher scores.

Memory Bandwidth Tests:

I want to caution readers to not be too concerned with the results below. Rely more on the real world application performance rather than the results of pure benchmarks like these.

I ran tests of memory performance with Memory Bench and Newer Tech's new Gauge Pro. All tests with the G4 upgrades used XLR8's v1.4.3b0 Control and Extension which defaults to write-through cache mode (slower than write-back or copy-back mode as you'll see below). XLR8 had changed the default mode to address some compatibility issues with systems like the PowerMac 9500. Since write-through is the default mode, I used it for most tests here but do show the differences provided by write-back mode for illustration purposes.

For more memory bandwidth test results see my recent G4 CPU upgrade reviews, which include B&W G3 (with and without G4 upgrades) and Apple (Sawtooth) G4/AGP system memory bandwidth performance.

Memory Bench Results:

XLR8's later G4 control software defaults to the slower "Write-Through" backside cache mode, which basically disables write caching and, as you'll see below, dramatically affects pure memory test results like this. Far less difference is seen in actual applications performance. I'm told XLR8 set "Write-Through" as the default mode due to some compatibility issues, especially with Macs like the 9500 that have motherboard (soldered in/non-removable) cache. Previously I'd run an earlier control panel/extension version with the G4/350/1MB card in the 9600/350 without problems in all my usual applications. However I felt it best to use the default mode for applications tests here since some owners would have to run that mode. Consider this a worst-case scenario and remember G4 control software can change in the future, so these figures are likely to improve with future (non-beta) releases and also with later G4 CPU revisions that eliminate the errata present in versions prior to 2.8.

In the above comparison the 2MB cache allows higher read memory speed between 1MB (1024KB) and 2MB (2048KB). With write-through cache mode the benefits of the L2 cache are negated for writes, so size doesn't really matter.

Now look at the L2 cache write performance with Write-Back/Copy-Back mode enabled. Note the dramatic improvements in write performance up to the size of the backside cache. Basically Write-Through is disabling the backside cache for writes (but not reads).
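The difference between the two modes can be illustrated with a toy model that counts how many stores reach main memory. This is a simplified sketch, not a model of the G4's actual backside cache: the 32-byte line size, the LRU replacement policy and the function name are all assumptions made for illustration, and read traffic is ignored.

```python
from collections import OrderedDict

LINE_SIZE = 32  # bytes per cache line (assumed for this sketch)


def count_memory_writes(working_set_bytes, cache_bytes, write_back, passes=4):
    """Count line writes that reach main memory in a toy LRU cache.

    The loop stores to every line of the working set `passes` times.
    """
    n_lines = working_set_bytes // LINE_SIZE
    capacity = cache_bytes // LINE_SIZE
    cache = OrderedDict()  # line number -> dirty flag, oldest entry first
    memory_writes = 0

    for _ in range(passes):
        for line in range(n_lines):
            if not write_back:
                # Write-through: every store is also sent to main memory.
                memory_writes += 1
                continue
            if line in cache:
                cache.move_to_end(line)  # mark as most recently used
            elif len(cache) >= capacity:
                # Evict the least recently used line; write it out if dirty.
                _, dirty = cache.popitem(last=False)
                if dirty:
                    memory_writes += 1
            cache[line] = True  # the store leaves the line dirty in cache
    return memory_writes


KB = 1024
for ws in (512 * KB, 2048 * KB):
    wt = count_memory_writes(ws, 1024 * KB, write_back=False)
    wb = count_memory_writes(ws, 1024 * KB, write_back=True)
    print(f"{ws // KB}KB working set, 1MB cache: "
          f"write-through={wt} write-back={wb}")
```

In this model write-through sends every store to memory regardless of cache size, while write-back sends none as long as the working set fits in the cache. That is consistent with the pattern described above, where write performance improves only up to the size of the backside cache.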

Shown below are the results with the stock 9600/350 CPU card. Note it actually had higher L1 cache memory speed than the G4 in this Mac. As shown in my past reviews, when G4 CPUs are used in B&W G3s or G4 systems, the L1 cache speeds recorded are more than 2x higher than those in older Macs with lower bus speeds. A G4/400 for instance delivers about 2GB/sec L1 cache speeds.

I'll be testing both Newer Tech and PowerLogix G4 CPU upgrades in this same 9600/350 and will note their results in future reviews. A quick test with Powerlogix's G3/G4 Cache Profiler v1.3 showed similar results to XLR8's in the same cache mode.

Stream results are not shown here, but sample tests showed very low performance in general with G4 upgrades in the 9600, as low as sub-40MB/sec rates in some tests (that's 1/3 the B&W G3 results with a G4 upgrade). The final test used the built-in memory bandwidth test of Newer Tech's GaugePro utility. The results with the Sawtooth don't track the other memory bandwidth benchmarks however, as GaugePro reports the Newer G4/400 has more than 20% higher memory speeds. I'm not sure this makes sense given the results from other benchmarks.

Newer Tech's Gauge Pro Results:

Newer Tech's latest Gauge Pro utility also includes a memory bandwidth test feature. (Gauge Pro should ship with their new G4 cards.) Although the results with GaugePro show the G4, even with write-through cache mode, was faster than the stock 9600/350 CPU card, the scores are still disappointing, about 1/3 the rates of a B&W G3 system.

Note G4 CPU Stepping (CPU Version) is 2.2

Now the same results with the stock 604ex 350MHz CPU card:

Note: The version of GaugePro I have (unreleased as of this date), like some other utilities, does not properly recognize the Kansas Apple motherboard, which has no on-board L2 cache. (It reports 256K of disabled cache as shown above.) The Apple 9600/300, 9600/350, 8600/250 and 8600/300 systems have no cache on the motherboard and use a 1MB inline L2 cache on the CPU card. As noted in the FAQ and my 9600/Mach5 page, these Apple CPU cards cannot be used in other Mac models.

Benchmarks Summary:
Memory bandwidth test results were disappointing in this system. They do illustrate well the effects of the different cache modes. Thankfully many applications are primarily CPU/FPU bound and, with the same speed/type of CPU, will perform about as well in an upgraded older Mac as in a new system with a faster memory bus. There are exceptions, such as Photoshop 5.5 and 3D games, that do show a benefit from a faster bus speed Mac. I'm sure there are other applications that move a lot of data over the bus that also benefit from more modern Mac models with faster memory bus speeds and RAM.

MacBench scores were also a bit disappointing, as they usually are with any G4 upgrade. However I suspect the combination of write-through cache mode and disabled speculative processing contributed to the lower scores. Disk performance did show a slight improvement from the larger cache, but CPU and Video were basically unchanged.

So is a 2MB cache G4 CPU upgrade worth paying more? Not really; not for most applications at least. Perhaps with "Write-Back" mode enabled there would have been slightly different results, but I doubt it would have made a large difference with both the 1MB and 2MB models using the same mode. Another factor may be that current software is not written to take advantage of larger caches.

If the price difference were say $50 (doubtful), I'd go for the 2MB model on general principle but I doubt the price delta would be that small at retail. I don't currently have any estimates of what the additional cost would be, but I suspect it would add more to the cost of the upgrade than most buyers would be willing to pay (at least educated buyers).

Thanks to XLR8 for providing the test sample, which helped answer one of the most common G4 upgrade questions since the 2MB cache support of the G4 CPU was made public.


Copyright © Mike, 1999.
All Rights Reserved.
All brand or product names mentioned here are properties of their respective companies.
