Interleaved RAM: What Applications Benefit & How Much?
Published 6/10/99 Updated for PCPro tests
After I received the first shipping ZIF socket based CPU slot card (the XLR8 CarrierZIF), there were reports that certain Motorola ZIF modules might not work reliably with RAM interleaved when used in the card. Although my sample was a first build version and contained an XLR8 G3/400Z ZIF module, I was curious to see if this was a common problem. Based on my tests in two different Mac motherboard designs that both support interleaved RAM, I'm confident this is not an issue with most retail ZIF modules or OEM Apple modules.
During the tests with and without interleaved RAM, I revisited the issue of how much gain is seen in actual use from interleaved RAM. Almost two years ago I saw zero gain in MacBench, but since most of its tests fit in the backside cache (except for the graphics and disk tests), it really can't be a good indication of real world benefit. There have been numbers thrown around on the web over the years (rates like 3 to 4%, or 10%), but I was curious to see just how much the gain is and whether, as I suspect, it varies depending on how much data the application moves over the memory bus.
Owners of Apple's G3 systems (Beige and Blue and White G3 or later Macs) can skip this article - they use SDRAM based motherboards which do not interleave RAM by slot banks (RAM is interleaved on the DIMM itself, however). Some older Mac models such as the PowerCenter/Pro also do not interleave RAM, so this article is only a curiosity for those owners. Note too that the "Mach5" (aka "Kansas") motherboard based Macs (8600/250, 8600/300, 9600/300, 9600/350) have lower than expected memory performance, as noted on my previous Mach5 page.
What is Interleaving?
Many Macintosh motherboards (e.g. 7300-7600, 85/8600, 95/9600, PowerTower Pros, S900s, Genesis) have a memory controller that supports interleaved RAM. The RAM slots are identified with a letter/number ID - for example A1, B1, A2, B2, and so on. By installing identical RAM modules in the matching An/Bn slots (e.g. A1/B1, A2/B2...), you enable interleaved access to RAM as two banks (A and B). Basically the benefit is reduced wait states and contention when accessing main memory, which provides potentially higher bandwidth (throughput) on the memory bus. An analogy might be a box with two openings (one on each side) versus one with only a single opening. If someone were putting items into or removing them from one opening, another person could still have access via the second opening. This is a crude simplification but you get the general idea. Apple also has info on interleaved RAM in their Power Macintosh Memory FAQ - here's a clip from the section on Interleaved Memory:
4) Question: What is memory interleaving and what advantage does it provide?
Answer: Even though the system data bus is 64 bits wide, the memory controller in Power Macintosh 7300, 7500, 7600, 8500, 8600, 9500, and 9600 computers can support 128 bit data read and write operations by interleaving data between corresponding DIMMS.
Memory interleaving provides higher bandwidth (MBytes per second) between the PowerPC microprocessor and main memory. It also provides a significant performance boost, increasing the execution speed of memory-intensive programs. How much faster depends on the program's software architecture and whether an L2 cache is present.
5) Question: How is memory interleaving enabled?
Answer: Memory interleaving is a function of the memory controller used in Power Macintosh 7300, 7500, 7600, 8500, 8600, 9500, and 9600 computers. Memory interleaving is enabled by the power-up software when it detects two DIMMs in corresponding expansion slots (such as, A1 and B1, A2 and B2, and so on) that are the same density, have the same memory bank configuration, and have the same DRAM addressing modes.
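The pairing rule in Apple's answer can be sketched as a small check. This is only an illustrative Python model, not Apple's power-up code: the Dimm fields, the "mode-x" label, and the function name can_interleave are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dimm:
    """Hypothetical description of one DIMM (field names are assumptions)."""
    density_mb: int        # module size, e.g. 64
    banks: int             # memory bank configuration
    addressing_mode: str   # DRAM addressing mode, treated as an opaque label


def can_interleave(a: Optional[Dimm], b: Optional[Dimm]) -> bool:
    """True if slots An and Bn hold a matched pair per the FAQ's three criteria."""
    if a is None or b is None:   # an empty slot on either side means no pair
        return False
    return (a.density_mb == b.density_mb
            and a.banks == b.banks
            and a.addressing_mode == b.addressing_mode)


# Example slot contents keyed by slot name; None marks an empty slot.
slots = {
    "A1": Dimm(64, 2, "mode-x"), "B1": Dimm(64, 2, "mode-x"),  # matched pair
    "A2": Dimm(32, 1, "mode-x"), "B2": Dimm(64, 2, "mode-x"),  # mismatched
    "A3": Dimm(64, 2, "mode-x"), "B3": None,                   # lone DIMM
}
for n in (1, 2, 3):
    ok = can_interleave(slots[f"A{n}"], slots[f"B{n}"])
    print(f"A{n}/B{n}: {'interleaved' if ok else 'not interleaved'}")
```

In this model only the A1/B1 pair would qualify; a lone DIMM or a mismatched pair falls back to non-interleaved access.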
- 9600/350 base (Mach 5/Kansas motherboard)
- XLR8 CarrierZIF with XLR8 G3/400Z ZIF Module
Set to 412MHz CPU, 206MHz Cache & 55MHz Bus speed
- 320MB RAM (2x 32MB Original EDO DIMMs, 4x 64MB FPM DIMMs)
- OEM 4GB SCSI Hard Drive (75% full), OEM SCSI ZIP drive
- Kenwood TrueX 52x IDE CDROM
- OS 8.1, Standard Extension set (not trimmed), no Libmoto/Speed Dblr
- Virtual Memory Off, 4MB Disk Cache
- PCI Cards Installed:
- Radius Thunder 3D (Primary Graphics card)
- ATTO ExpressPCI SCSI (no drives attached)
- Microconversions 12MB Game Wizard Voodoo2
- Promax TurboMax PCI IDE Controller (for IDE CDROM)
Tests were run with the exact same configuration except for moving DIMMs to enable/disable interleaving. Applications were given enough RAM (Photoshop 80MB, Unreal 91MB, Quake1 55MB, etc.) so that disk access was minimized.
Applications That Gained from Interleaving:
As I suspected, applications that move a large amount of data like Photoshop and Unreal showed some benefit from interleaving RAM. In fact I was a bit surprised at Unreal's boost - I reran the tests 3 times to verify the readings.
I expect that the gain from Photoshop 5 would have been higher if the test file size had been larger than 10MB (the standard PSBench 5 test size, used to ensure no disk activity/swap file use affects the scores). Results are in seconds, so lower numbers are better. Photoshop v5.02 was used, with 80MB RAM allocated to Photoshop. As per instructions, screen mode was set to 1024x768, millions of colors, with interpolation set to bicubic (better).
Photoshop 5 Test Results (times in seconds)
PSBench Filter        Interleaved RAM   Non-Interleaved RAM
Rotate 90                         1.0                   1.2
Rotate 9                          3.9                   4.6
Rotate .9                         3.6                   4.2
Gaussian Blur 1                   1.8                   2.2
Gaussian Blur 3.7                 4.7                   5.9
Gaussian Blur 85                  6.3                   8.1
Unsharp 50/1/0                    2.3                   2.7
Unsharp 50/3.7/0                  5.3                   6.5
Unsharp 50/10/5                   5.4                   6.7
Despeckle                         2.7                   2.9
RGB-CMYK                          5.4                   5.5
Reduce Size 60%                   1.6                   1.8
Lens Flare                        5.5                   6.3
Color Halftone                    6.4                   7.0
NTSC Colors                       4.4                   4.6
Accented Edges                   11.8                  12.3
Pointillize                      18.4                  18.9
Water Color                      24.7                  25.5
Polar Coordinates                 6.9                   7.8
Radial Blur                      36.7                  38.5
Lighting Effects                 10.1                  10.6
Total Time (sec.)               168.9                 183.8
Overall, the 21 filter test series with the 10MB file size showed about a 9% gain from interleaving. Again, I suspect larger file sizes would show more gain.
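The overall figure follows from the two total times (168.9 seconds interleaved, 183.8 seconds non-interleaved). A minimal sketch of the arithmetic, showing the two common ways to state it; the function names are just labels for this example:

```python
def percent_faster(slow_seconds: float, fast_seconds: float) -> float:
    """Speedup of the faster run over the slower one, as a percentage."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


def percent_time_saved(slow_seconds: float, fast_seconds: float) -> float:
    """Reduction in elapsed time relative to the slower run, as a percentage."""
    return (slow_seconds - fast_seconds) / slow_seconds * 100.0


INTERLEAVED_TOTAL = 168.9       # seconds, from the results above
NON_INTERLEAVED_TOTAL = 183.8   # seconds, from the results above

faster = percent_faster(NON_INTERLEAVED_TOTAL, INTERLEAVED_TOTAL)
saved = percent_time_saved(NON_INTERLEAVED_TOTAL, INTERLEAVED_TOTAL)
print(f"Interleaved is {faster:.1f}% faster")    # 8.8% faster
print(f"Interleaved takes {saved:.1f}% less time")  # 8.1% less time
```

Either way the result rounds to roughly 8-9%, in line with the "about 9%" quoted here.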
The most dramatic improvement (approx. 16% at 640x480, going by the frame rates below) was shown by the popular game Unreal:
Unreal 1.02B3 Castle Flyby Tests
Resolution    Interleaved    Non-Interleaved
640x480       38.01 fps      32.74 fps
800x600       31.13 fps      27.95 fps
Note that at 800x600 the delta was smaller, indicating that the graphics card's fill rate was becoming a constraint. (As resolutions rise, the graphics card becomes the bottleneck to performance.)
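For reference, the relative gain at each resolution can be computed directly from the Unreal frame rates listed above. A short sketch; the dictionary layout and function name are just for this example:

```python
def fps_gain_percent(interleaved_fps: float, non_interleaved_fps: float) -> float:
    """Frame rate gain from interleaving, relative to the non-interleaved run."""
    return (interleaved_fps / non_interleaved_fps - 1.0) * 100.0


# (interleaved fps, non-interleaved fps) from the Castle Flyby results
unreal = {
    "640x480": (38.01, 32.74),
    "800x600": (31.13, 27.95),
}

for resolution, (inter, non_inter) in unreal.items():
    gain = fps_gain_percent(inter, non_inter)
    print(f"{resolution}: {gain:.1f}% gain")
# 640x480: 16.1% gain
# 800x600: 11.4% gain
```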
What Applications Didn't Benefit?
I didn't expect applications that deal with small files or that do not move a lot of data to benefit, but I did find it interesting that 3Dfx Quake v1.09 showed zero gain from interleaved RAM, even in 800x600 mode, where fill-rate demands are much higher (typically 1.6x) than in standard 640x480 mode. I used the fastest game card currently available for the 9600, a Game Wizard 12MB 3Dfx Voodoo2 card, since the 9600's graphics card (an old Radius Thunder 3D) would not run RAVE games properly. (NOTE - these tests were run before more demanding 3D games like Quake3, etc. were released. Later Quake versions would have shown different results.)
3Dfx Quake 1.09 Timedemo 1 Tests
Resolution    Interleaved    Non-Interleaved
640x480       47.7 fps       47.7 fps
800x600       31.0 fps       31.0 fps
Although 3Dfx Quake showed zero gain, as noted above the more demanding (in system resources, texture use, etc.) Unreal showed significant gains from interleaved RAM. The same may be true of future games based on the Unreal engine and of other more demanding games as well (Quake II, Descent III, Half-Life).
What about Benchmarks?
I saw very little gain (1% to 3%, almost within the run-to-run variation) in most of MacBench 5.0's benchmarks. The graphics test (1024x768, thousands of colors) showed the largest benefit at 3%. The CPU/FPU tests are really too small to show any benefit, as the I/O rate provided by non-interleaved RAM is not a limiting factor in those tests. For the disk tests, as usual the drive itself is the bottleneck.
Since I didn't really expect MacBench to show any difference in its CPU related tests, I used a memory bandwidth benchmark to show what effect interleaving would have on maximum read and write speeds to main memory. Note that while test data sample sizes are smaller than the backside cache (1MB in this case), you won't see any significant difference.
Note the dramatic improvement in bandwidth shown with RAM interleaved (left) after the data sizes exceed what can be held in the backside cache.
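The shape of such a test (throughput versus data size, with a drop once the data no longer fits in cache) can be reproduced with a simple copy loop. The following is a rough Python sketch of the idea only, not the benchmark used for these graphs. The function name, the buffer sizes, and the timing window are assumptions, and interpreter overhead will distort the smallest sizes on a modern machine.

```python
import time


def copy_bandwidth_mb_s(size_bytes: int, min_seconds: float = 0.05) -> float:
    """Repeatedly copy a buffer of size_bytes and return throughput in MB/s."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    view = memoryview(src)
    copies = 0
    start = time.perf_counter()
    while True:
        dst[:] = view                 # one full read of src and write of dst
        copies += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:    # keep going until the timing window is filled
            break
    return size_bytes * copies / elapsed / 1e6


if __name__ == "__main__":
    # Sweep from well under to well over a typical cache size.
    for size_kb in (4, 16, 64, 256, 1024, 4096, 16384):
        rate = copy_bandwidth_mb_s(size_kb * 1024)
        print(f"{size_kb:>6} KB: {rate:10.0f} MB/s")
```

On a system with a 1MB backside cache, the interesting rows would be the ones above 1024 KB, where every copy has to go out to main memory.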
PowerCenter Pro Results: The PowerCenter Pro motherboard does not support memory interleaving. Even with a 60MHz bus (fast by older Mac standards), note the low rates once the data sample exceeds the size of the backside cache:
Compared to the 9600's interleaved RAM rates, the PowerCenter Pro's rates are almost 50% lower once the backside cache size is exceeded (despite a 5MHz faster bus and a faster G3 CPU card). In my opinion this is one of the main reasons 3D game framerates are lower in the PowerCenter Pro than in other Macs that support memory interleaving. (See my Searchable FPS Database for examples and comparisons by game title/system/video card)
Apple's B&W G3 Systems: The effect of the 100MHz system bus and internal (to the DIMM) interleaving of the SDRAM in the new Apple G3s is shown in the following graph:
As noted on the B&W G3 Performance page and G3 Apps Tests page, many real world applications don't mirror this advantage, since most of the time the CPU is working with data sizes that fit in the backside cache. Exceptions are 3D games and apps like Photoshop that move a lot of data over the bus.
The Bottom Line:
What I learned from all this is that you should interleave your RAM if at all possible when you're dealing with applications that move a lot of data or work with large files. For owners of Apple's G3 Macs this is not an option, as the motherboard does not interleave RAM, but it more than makes up for that with the faster memory bus (which provides higher bandwidth than older Macs even with interleaving).
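To put the bus speeds mentioned in this article in perspective, here is a back-of-the-envelope sketch of the theoretical peak transfer rate of each system bus. It assumes a 64-bit data bus on all three machines (the Apple FAQ quoted earlier states this only for the 7300-9600 class, so it is an assumption for the PowerCenter Pro and B&W G3) and ignores wait states entirely, so real sustained rates are far lower.

```python
BUS_WIDTH_BITS = 64  # assumed for all three systems; see note above


def peak_mb_per_s(bus_mhz: float, width_bits: int = BUS_WIDTH_BITS) -> float:
    """Theoretical ceiling: one full-width transfer per bus clock, no wait states."""
    return bus_mhz * width_bits / 8


systems = {
    "9600 test system (55MHz bus)": 55,
    "PowerCenter Pro (60MHz bus)": 60,
    "B&W G3 (100MHz bus)": 100,
}

for name, mhz in systems.items():
    print(f"{name}: {peak_mb_per_s(mhz):.0f} MB/s peak")
# 9600 test system (55MHz bus): 440 MB/s peak
# PowerCenter Pro (60MHz bus): 480 MB/s peak
# B&W G3 (100MHz bus): 800 MB/s peak
```

The ceiling alone does not explain the results: the PowerCenter Pro has the higher ceiling of the two older machines yet the lower measured rates, which is consistent with wait states (and therefore interleaving) mattering more than raw bus clock.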
Remember that interleaving may not be reliable if you've got a mixed batch of DIMMs in your Mac. It requires matched (same size) pairs at least.
All my computer life I've understood the importance of quality RAM and of buying it in pairs at least (more if possible). This can be critical when interleaving RAM with faster bus speeds and G3 CPU upgrades. Because I've always bought known good quality RAM, in pairs, from dealers I trust, I've never had to deinterleave RAM to get a CPU card to work. Back in 1997 I did toss out some OEM 8MB DIMMs from my 8500, as they would not coexist at all with a new pair of 64MB DIMMs, but I never deinterleaved RAM. If you do have problems, don't be shy about removing DIMMs, especially older ones or mixed manufacturers' brands. Often they are 4K refresh parts, which in my experience don't mix well with the more standard/preferred 2K refresh of most aftermarket Mac DIMMs. Although the Mac memory bus was designed for 70ns (nanosecond) DIMMs, I personally have never bought them (thankfully they have been rare for a long time, since 60ns or faster chips have been the low end for many years).
Copyright © 1999.
No part of this site's content is to be reproduced in any form without permission.
All brand or product names mentioned here are properties of their respective companies.
Disclaimer: Users must read and are bound by the Site Terms & Conditions of Use.