Effect of RAM timing changes on the performance of an Athlon 64 2800+ on a VIA K8T800 platform
04.06.2019
Today we will do a little research into how the performance of the AMD Athlon 64 2800+ depends on RAM timings on a platform with the VIA K8T800 chipset. The task is not difficult, but before testing and analyzing the results I still suggest recalling some theory.
Compared with the NVIDIA nForce 3 150 chipset, which is gradually being replaced on the market by the more advanced NVIDIA nForce 3 250, the VIA K8T800 has the advantage of a proprietary technology called «Hyper8», which essentially means support for a 16-bit / 800 MHz HyperTransport link between the processor and the chipset in both directions. On the NVIDIA nForce 3 150, by contrast, traffic between the processor and the chipset goes over an 8-bit / 600 MHz link in one direction and a 16-bit / 600 MHz link in the other. That said, this has no effect on performance at all, positive or negative, and I will come back to it at the end of the article.
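To put the link widths and clocks above into numbers, here is a minimal sketch that estimates peak HyperTransport bandwidth per direction. It assumes the usual double-data-rate signalling of HyperTransport (two transfers per clock); the function name `ht_link_gbps` is made up for this illustration, and the figures are theoretical peaks, not measurements.

```python
def ht_link_gbps(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak one-way HyperTransport bandwidth in GB/s (decimal)."""
    transfers_per_second = clock_mhz * 1e6 * 2  # assumption: double data rate
    return width_bits / 8 * transfers_per_second / 1e9


# (one direction, other direction), taken from the figures in the text
k8t800 = (ht_link_gbps(16, 800), ht_link_gbps(16, 800))
nforce3_150 = (ht_link_gbps(8, 600), ht_link_gbps(16, 600))

for name, (a, b) in (("VIA K8T800", k8t800), ("NVIDIA nForce 3 150", nforce3_150)):
    print(f"{name}: {a:.1f} + {b:.1f} = {a + b:.1f} GB/s")
```

Under these assumptions the K8T800 link peaks at 3.2 GB/s each way and the nForce 3 150 link at 1.2 and 2.4 GB/s.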
In the improved version of VIA's chipset for the AMD Athlon 64, the VIA K8T800 Pro, the HyperTransport bus speed has been raised to 1000 MHz. It also adds the ability to lock the PCI / AGP frequency, which is a far more useful innovation for overclocking than a "faster bus".
The main technical characteristics of the VIA K8T800 chipset include:
- support for AMD Athlon 64, Athlon 64 FX and Opteron (all series and all sockets);
- AGP 8x;
- bidirectional HyperTransport bus to the processor at 800 MHz, 16 bits wide in each direction;
- V-Link 8x bus (533 MB/s) to the south bridge;
- two Parallel ATA channels (ATA133) for four devices;
- support for two Serial ATA devices (SATA150);
- support for two more Serial ATA devices through a PHY controller (SATAlite interface);
- V-RAID for building RAID arrays from SATA devices (JBOD, 0, 1, 0+1; the last mode is, of course, possible only with four SATA drives connected);
- 8 USB 2.0 ports;
- 6 PCI devices;
- Fast Ethernet MAC-controller (up to 100 Mbit / s);
- AC'97 interface for audio codecs (6 channels);
- MC'97 interface for modem codec;
- LPC-bus for legacy peripherals.
(Since the VIA chipset is a two-chip design, I should add that the south bridge functionality is given for the VT8237.)
So, the following timings were varied during testing:
- CAS# Latency (tCL);
- RAS# to CAS# Delay (tRCD);
- Row Precharge (tRP);
- Cycle Time (Tras).
(The timings are listed in this same order in the charts.)
Let me briefly recall what these RAM timings mean*:
CAS# Latency (tCL): the delay, in clock cycles, between the memory receiving a read command and starting to execute it. It also determines the number of cycles needed to complete the first part of a burst. The lower the latency, the faster the transaction. Possible values: 2, 2.5 and 3.
RAS# to CAS# Delay (tRCD): the delay between the RAS (Row Address Strobe) and CAS (Column Address Strobe) signals. Simply put, it is the delay incurred whenever memory is written, refreshed or read. Naturally, lowering this parameter improves performance and raising it reduces performance. Possible values: 2, 3 and 4.
Row Precharge (tRP): the precharge time. This option sets the number of cycles RAS needs to accumulate its charge before an SDRAM refresh. As a rule, reducing the precharge time improves SDRAM performance. Possible values: 2, 3 and 4.
Cycle Time (Tras): a setting that changes the minimum number of memory cycles required for Tras and Trc. Tras is the SDRAM Row Active Time, i.e. the period during which a row is open for data transfer; it is also known as Minimum RAS Pulse Width. Trc, on the other hand, is the SDRAM Row Cycle Time, i.e. the period needed to complete a full cycle of opening and refreshing a row. Most BIOSes of motherboards based on the VIA K8T800 chipset offer a wide range of values, from 5 to 15.
* a source.
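The timings above are expressed in memory clock cycles, so it can help to see them in nanoseconds. The sketch below assumes the PC3200 (DDR400) module used in this test runs its command clock at 200 MHz, i.e. 5 ns per cycle. The `row_miss_cycles` helper is a deliberately rough model (precharge, then activate, then read) and is my own simplification, not something taken from the article's source.

```python
MEMORY_CLOCK_MHZ = 200.0               # assumption: DDR400 command clock
CYCLE_NS = 1000.0 / MEMORY_CLOCK_MHZ   # 5 ns per clock cycle


def timings_to_ns(tcl: float, trcd: float, trp: float, tras: float) -> dict:
    """Convert a tCL-tRCD-tRP-Tras set from clock cycles to nanoseconds."""
    return {
        "tCL": tcl * CYCLE_NS,
        "tRCD": trcd * CYCLE_NS,
        "tRP": trp * CYCLE_NS,
        "Tras": tras * CYCLE_NS,
    }


def row_miss_cycles(tcl: float, trcd: float, trp: float) -> float:
    """Rough cost of a read that must close one row and open another."""
    return trp + trcd + tcl


for label, (tcl, trcd, trp, tras) in (("3-4-4-11", (3, 4, 4, 11)),
                                      ("2-2-2-9", (2, 2, 2, 9))):
    miss_ns = row_miss_cycles(tcl, trcd, trp) * CYCLE_NS
    print(label, timings_to_ns(tcl, trcd, trp, tras), f"row miss ~{miss_ns:.0f} ns")
```

In this simplified model the worst-case row miss drops from 11 cycles to 6 when going from 3-4-4-11 to 2-2-2-9; real gains are much smaller because most accesses do not pay the full penalty.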
Testing was conducted on the following system configuration:
- Motherboard: Micro-Star K8T Neo-FSR (MS-6702), VIA K8T800;
- Processor: AMD Athlon 64 2800+, 1800 MHz, 512 KB, 1.5 V (Newcastle);
- Memory: 1 x 512 MB PC3200 400 MHz, 2.6 V (Patriot);
- Video card: ATI Radeon 9800SE 128 MB @ 9800Pro, 430 / 730 MHz;
- Hard disk: 164.7 GB SATA150 Hitachi, 7200 rpm, 8 MB;
- Drive: DVD±R/RW & CD-RW NEC ND-2510A;
- Case: INWIN-S508 + 420 W power supply (Thermaltake-W0009) + two 80 mm Zalman case fans (~1700 rpm, 7 V).
Operating system: Windows XP Home SP1. Chipset driver: VIA Hyperion v.4.53; DirectX version 9.0c; Catalyst 4.11. All unnecessary services were disabled. No additional software was installed (a "clean" system tray). The system was configured for maximum performance.
Cool'n'Quiet technology during testing has been disabled in the BIOS Setup of the motherboard.
The BIOS settings of the Micro-Star K8T Neo-FSR (MS-6702) motherboard related to memory and the chipset are not particularly rich.
Despite the latest available BIOS version being installed, the «Bank Interleaving» parameter can only be set to «Auto» or «Disabled».
The following synthetic benchmarks, applications and games were chosen for testing:
- SiSoft Sandra 2004.10.9.133;
- Everest v.1.52.215;
- CrystalMark v.0.9.106.215;
- PCMark’04 build 1.2.0;
- Super PI;
- WinRar v.3.4;
- 7-Zip v.4.09;
- Lame v.3.96;
- CINEBENCH 2003;
- Unreal Tournament 2004 build 2225;
- Far Cry v.1.3 build 1337;
- DOOM 3;
- 3DMark’03 build 3.5.0.
All tests were run at least twice. If a result "dropped out" (that is, differed significantly from the previous one), the test was run at least once more.
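The repeat rule described above can be sketched as a small helper. The article does not say what counts as a "significant" difference, so the 2% tolerance and the cap of five runs below are assumptions, and `run_once` stands in for whatever launches a benchmark and returns its score.

```python
from typing import Callable, List


def run_benchmark(run_once: Callable[[], float],
                  tolerance: float = 0.02,   # assumed threshold, not from the article
                  max_runs: int = 5) -> List[float]:
    """Run at least twice; repeat while the last two results disagree."""
    results = [run_once(), run_once()]
    while (len(results) < max_runs
           and abs(results[-1] - results[-2]) > tolerance * abs(results[-2])):
        results.append(run_once())
    return results


# Usage with made-up scores: the second run deviates by 10%, so a third is taken.
fake_scores = iter([100.0, 110.0, 109.5])
print(run_benchmark(lambda: next(fake_scores)))
```

Which of the collected results to report (last, best, or mean) is a separate choice that the article does not specify.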
Unexpectedly, the popular synthetic benchmark SiSoft Sandra turned out to be practically indifferent to RAM timing changes. Going from the loosest timings to the 2-2-2-9 combination gains less than 1%.
Everest's results do not show significant performance gains either. Only the memory write operation stands out, with a total increase of 6.6%. Let's look at the effect of tighter timings in the «Latency» benchmark (lower is better):
Not bad! The gain is especially noticeable when CAS# Latency (tCL) is changed from 2.5 to 2 and when RAS# to CAS# Delay (tRCD) is lowered from 3 to 2. The difference between the results at the loosest and the tightest timings is 19.2% and, I will tell you right away, this is the largest percentage gain of all the benchmarks.
Neither of the last two synthetic benchmarks in this article revealed any noticeable performance gain from tighter memory timings. Let's see how real applications and games behave.
In the Super PI 2M calculation, the tightest timings beat the loosest by 4 seconds, or 3.4%. Not much, of course, but presumably the gap would be larger over a "longer distance".
In WinRAR the gain is especially noticeable when RAS# to CAS# Delay (tRCD) is lowered, and the total difference between the results at the tightest and the loosest timings is 18%!
In 7-Zip the difference is visible only when compressing files. When unpacking, 7-Zip is indifferent to RAM timings.
Before moving on to the game tests, I should say that, to minimize the influence of the video card, they were run at 640x480 with the Catalyst driver set for maximum speed (AA off, AF off).
Unreal Tournament 2004 was tested with a BotMatch on «Rankin».
From the results in this game it is hard to tell which of the timings affects performance most. Each parameter, when lowered, makes a small contribution to the common cause, which adds up to +7.6% in total.
As it turned out while testing the Leadtek GeForce 6600GT, the 3DNews demo on the Far Cry «Research» level is very CPU-dependent, which is just what we need. The benchmark was run twice.
Once again, the largest performance gain comes from lowering RAS# to CAS# Delay (tRCD): 2.3 FPS (from 61.2 FPS to 63.5 FPS). The total performance gain from tighter memory timings in Far Cry is 7.2%.
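As a quick check of how much of the Far Cry gain comes from tRCD alone, the step quoted above can be converted to a percentage. This is a minimal sketch; `gain_percent` is a hypothetical helper and assumes a higher-is-better metric measured relative to the slower result.

```python
def gain_percent(before: float, after: float) -> float:
    """Relative gain of a higher-is-better metric, in percent of `before`."""
    return (after - before) / before * 100.0


# Far Cry, tRCD step quoted in the text: 61.2 FPS -> 63.5 FPS
trcd_step = gain_percent(61.2, 63.5)
print(f"tRCD step: +{trcd_step:.1f}%")
```

By this measure the tRCD step is worth a little under 4%, roughly half of the 7.2% total reported for the game.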
DOOM 3 was tested on the standard demo1 in two passes.
As in Far Cry, RAS# to CAS# Delay (tRCD) is the most "influential" timing in DOOM 3.
And finally, here are the results of the 3DMark'03 CPU Benchmark.
The 3DMark'03 CPU Benchmark confirmed what Far Cry and DOOM 3 showed about RAS# to CAS# Delay (tRCD): it is the most significant memory timing, with the greatest influence on performance, for a platform with the VIA K8T800 chipset.
Surprised by the state of affairs? Oops, sorry 🙂 I meant: surprised that the results for MP3 encoding with the Lame codec and for rendering in CINEBENCH 2003, both present in the list of benchmarks, are missing? I have not forgotten them; I simply did not want to overload the article with extra charts. The fact is that changing the timings has no effect on either of these applications (+0.4% in CINEBENCH 2003 does not count).
Here are the final performance gains from tightening the timings from 3-4-4-11 to 2-2-2-9, collected in one table:
| Test | Metric | Gain from changing timings from 3-4-4-11 to 2-2-2-9 |
|---|---|---|
| Sandra 2004.10.9.133 Memory | Int Buff | +0.9% |
| Sandra 2004.10.9.133 Memory | Float Buff | +0.9% |
| Everest v.1.52.215 | Read | +1.7% |
| CrystalMark v.0.9.106.225 | Memory Score | +6.0% |
| PCMark'04 v.1.2.0 | Memory Score | +1.0% |
| Super PI, 2M | Time | +3.4% |
| WinRAR v.3.4 | KB/sec | +18.0% |
| 7-Zip v.4.09 (MIPS) | Compression | +4.4% |
| Lame 3.96 | 320 kbit/s | 0.0% |
| CINEBENCH 2003 | Rendering, 1 CPU | +0.4% |
| UT2004 | 640 × 480 | +7.6% |
| Far Cry | 640 × 480 | +7.2% |
| DOOM 3 | 640 × 480 | +7.8% |
| 3DMark 2003 | CPU Score | +4.8% |
| Average growth: | | +5.3% |
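The average in the last row can be rechecked from the individual gains. The article does not say how it was computed; the sketch below is only an observation that the +5.3% figure is reproduced when the two tests the author says do not react to timings (Lame and CINEBENCH 2003) are left out, while averaging all fourteen rows gives a lower number. The dictionary keys are shortened labels of my own.

```python
gains = {
    "Sandra Int Buff": 0.9, "Sandra Float Buff": 0.9, "Everest Read": 1.7,
    "CrystalMark Memory": 6.0, "PCMark'04 Memory": 1.0, "Super PI 2M": 3.4,
    "WinRAR": 18.0, "7-Zip": 4.4, "Lame": 0.0, "CINEBENCH 2003": 0.4,
    "UT2004": 7.6, "Far Cry": 7.2, "DOOM 3": 7.8, "3DMark 2003 CPU": 4.8,
}

# Assumption: these two rows were excluded from the published average.
excluded = {"Lame", "CINEBENCH 2003"}

avg_all = sum(gains.values()) / len(gains)
kept = [value for name, value in gains.items() if name not in excluded]
avg_kept = sum(kept) / len(kept)

print(f"all {len(gains)} rows: +{avg_all:.1f}%")
print(f"without Lame and CINEBENCH: +{avg_kept:.1f}%")
```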
Attentive readers may also wonder why there are no tests with Cycle Time (Tras) lowered further, from 9 to 5. Tests were in fact run over the whole available range from 11 to 5, and you can see this for yourself in the complete results table (10.4 KB). But since lowering Cycle Time (Tras) below 9 brings no performance gain, I see no need to present those results here.
The system responds in the same way (i.e. not at all 🙂) to changes in DRAM Burst Length, the burst size, which specifies the number of data units sent in one transfer cycle. Ideally, one transfer fills one L2 cache line of a modern processor. That is, it should equal 64 bytes, or eight data packets. Both possible values, 4 and 8, were tested.
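The burst-length arithmetic above can be spelled out in a few lines. The 8-byte packet size follows from the text (64 bytes in eight packets) and corresponds to a 64-bit memory bus; the function name is made up for this sketch.

```python
CACHE_LINE_BYTES = 64   # L2 cache line size given in the text
PACKET_BYTES = 8        # 64 bytes / 8 packets, i.e. a 64-bit memory bus


def bursts_per_cache_line(burst_length: int) -> int:
    """Number of bursts needed to fill one cache line (rounded up)."""
    bytes_per_burst = burst_length * PACKET_BYTES
    return -(-CACHE_LINE_BYTES // bytes_per_burst)  # ceiling division


for burst_length in (4, 8):
    print(f"Burst Length {burst_length}: "
          f"{burst_length * PACKET_BYTES} bytes per burst, "
          f"{bursts_per_cache_line(burst_length)} burst(s) per cache line")
```

With a burst length of 8 a single burst fills the line, while a burst length of 4 needs two, which is why 8 is the "ideal" value even though the tests showed no measurable difference.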
As I mentioned earlier in this article, I additionally studied how performance is affected by the VIA K8T800 chipset parameters LDT to AGP Lokar (Upstream), 8 bit or 16 bit, and LDT to AGP Width (Downstream), 8 bit or 16 bit. Tests were also run with the HyperTransport bus frequency lowered from the default 800 MHz to 600 MHz. These results are included in the table referenced above. The change: 0.0.
The next step is to study the performance gain from dual-channel memory mode in an Athlon 64 system. But that will be another article and, of course, another platform.
Good luck to you!
Please discuss your comments and suggestions on this article in the specially created forum thread.
Sergey Lepilov aka Jordan