SpoonMuffin
http://www.chip-architect.com/news/2003_04_20_Looking_at_Intels_Prescott_part2.html
http://en.wikipedia.org/wiki/EM64T#Differences_between_AMD64_and_Intel_64
http://www.chip-architect.com/news/...tation may cost less then 2 % extra die space
Also Intel has 36 bits physical address size whereas AMD has 40 bits
So if I understand this correctly, Intel's "64bit" truly is the hackjob that many AMD fans have been saying it is from the start.
Weird how they got all that $ for dev and they can't even make a true 64bit core... explains why AMD wins when running in 64bit mode when compared to Intel.
Now we need g2 and K8L, damn it!!!
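For a sense of scale, here is what those two physical address widths work out to. This is just a quick arithmetic sketch; the helper name `addressable_bytes` is made up for illustration and is not from either article.

```python
def addressable_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 1 << bits

GIB = 1 << 30  # bytes in one GiB

# Physical address widths as quoted in the post above.
for name, bits in [("Intel, 36 bit physical", 36), ("AMD, 40 bit physical", 40)]:
    print(f"{name}: {addressable_bytes(bits) // GIB} GiB")
```

So the four extra address bits give a 16x larger physical address space.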
Second integer core for 64 bit processing (not for multithreading)
It is as good as certain that the second 32 bit core is used exclusively for 64 bit processing, in a way similar to the good old bit slices. There was the 4-bit AMD 2901 that could be used to build 16, 32 or 64 bit processors. What makes this possible is that the core is limited mainly to additive and logic functions. A 64 bit staggered addition will take a total of four 1/2 cycles, but you can start two of them back to back on 1/2 cycle intervals. The latency to access the cache also does not need to be increased because of the extension to 64 (48) bit addresses. The higher part of the address is only used several cycles later to check the address tags with the TLB entries and not to access the data cache itself. What will increase by one cycle is the latency from an ALU instruction to a normal speed integer instruction. This delay will increase from 2 to 3 cycles. One extra pipeline stage is needed as well, resulting in a minor increase in the branch misprediction penalty.
The reason that we can be so sure that the second core is not used to boost the 32 bit Hyper-Threading capabilities is the scheduler. This unit is by far the biggest entity on the Pentium 4 die. It is larger than all the Floating Point, MMX and SSE hardware together. It is not only big, it also consists mostly of very timing-critical optimized macro cells laid out by hand. It takes a lot of time and effort to change the scheduler. We've looked at it in detail and concluded that it has mainly remained unchanged on Prescott's die. This means that the maximum uOp throughput remains six per cycle using the same dispatch ports as the Pentium 4.
http://en.wikipedia.org/wiki/EM64T#Differences_between_AMD64_and_Intel_64
There are a small number of differences between the two instruction sets. Compilers generally produce binaries that target both AMD64 and Intel 64, making the differences mainly of interest to compiler developers and operating system developers.
Currently
* Intel 64's BSF and BSR instructions act differently when the source is 0 and the operand size is 32 bits. The processor sets the zero flag and leaves the upper 32 bits of the destination undefined.
* AMD64 supports 3DNow! instructions. This includes prefetch with the prefix 0F followed by opcode 0D and PREFETCHW, which are useful for hiding memory latency.
* Intel 64 lacks the ability to save and restore a reduced (and thus faster) version of the floating-point state (involving the FXSAVE and FXRSTOR instructions).
* Intel 64 lacks some model-specific registers that are considered architectural to AMD64. These include SYSCFG, TOP_MEM, and TOP_MEM2.
* Intel 64 supports microcode update as in 32-bit mode, whereas AMD64 processors use a different microcode update format and control MSRs.
* Intel 64's CPUID instruction is very vendor-specific, as is normal for x86-style processors.
* Intel 64 supports the MONITOR and MWAIT instructions, used by operating systems to better deal with Hyper-threading.
* AMD64 systems allow the use of the AGP aperture as an IOMMU. Operating systems can take advantage of this to let normal PCI devices DMA to memory above 4 GiB. Intel 64 systems require the use of bounce buffers, which are slower.
* SYSCALL and SYSRET are also only supported in IA-32e mode (not in compatibility mode) on Intel 64. SYSENTER and SYSEXIT are supported in both modes.
* Near branches with the 66H (operand size) prefix behave differently. One type of CPU clears only the top 32 bits, while the other type clears the top 48 bits.
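To make the BSF/BSR item in the list above a bit more concrete, here is a rough Python model of what those two instructions compute. It is only a sketch: the function names and the use of `None` for an "undefined" destination are my own assumptions for illustration, and it does not try to model the vendor-specific handling of the upper 32 bits.

```python
from typing import Optional, Tuple

def bsf(src: int, width: int = 32) -> Tuple[bool, Optional[int]]:
    """Bit scan forward: (zero_flag, index of the lowest set bit).

    When the source is 0 the zero flag is set and the destination is
    undefined; None stands in for "undefined" here (an assumption).
    """
    src &= (1 << width) - 1
    if src == 0:
        return True, None
    # src & -src isolates the lowest set bit.
    return False, (src & -src).bit_length() - 1

def bsr(src: int, width: int = 32) -> Tuple[bool, Optional[int]]:
    """Bit scan reverse: (zero_flag, index of the highest set bit)."""
    src &= (1 << width) - 1
    if src == 0:
        return True, None
    return False, src.bit_length() - 1
```

The zero-source case is the one that matters for the list item: code that reads the destination register after scanning a zero value is relying on behaviour that is not guaranteed to match across the two implementations.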
http://www.chip-architect.com/news/...tation may cost less then 2 % extra die space
Basic 64 bit integer operations
A 64 bit extension by itself does not imply that the Integer Execution Unit and the Integer Register File have to be extended to 64 bit. A minimal implementation would simply use the 32 bit integer pipeline for 64 bit integer operations. The Floating Point/MMX/SSE pipelines are already 64 bit. No need for changes here.
The dual 'Rapid Execution' Units and the 32 bit register file run at twice the frequency and are together able to handle two 64 bit operations per cycle. (The Hammer is able to do 3 per cycle but its 64 bit additions might have twice the latency.) The mechanisms to decode an operation into 2 sub-operations are already available in the pipeline. The 128 bit XMM/SSE operations for example are handled in two 64 bit pieces.
It would be advantageous if the basic functional timing of the rapid execution engines can remain the same. The current ones handle 32 bit additions as two skewed 16 bit ones. The 2nd addition starts 1/2 a cycle after the first, when the carry bit is available. The newer integer ALUs seem to be full 32 bit ALUs. The same trick may thus be used to handle a 64 bit addition as two skewed 32 bit ones. Hardware for a full 32 bit addition takes about 15-20% longer than that for a 16 bit addition. It seems that Intel's circuit designers have closed this gap with novel design techniques like 'forward body biasing' et cetera.
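The "two skewed additions" idea from the quote can be sketched in a few lines of Python. This only models the arithmetic (low half first, then the high half once the carry is known), not the half-cycle timing, and the function name `split_add` is made up for illustration.

```python
def split_add(a: int, b: int, width: int = 64) -> int:
    """Add two `width`-bit values as two half-width additions linked by a carry."""
    half = width // 2
    mask = (1 << half) - 1

    # First sub-operation: add the low halves; this produces the carry bit.
    low = (a & mask) + (b & mask)
    carry = low >> half

    # Second sub-operation: add the high halves plus the carry from the first.
    high = ((a >> half) & mask) + ((b >> half) & mask) + carry

    # Recombine; a carry out of the top half is dropped, so the result wraps.
    return ((high & mask) << half) | (low & mask)
```

With `width=32` this is the 32 bit addition done as two 16 bit pieces; with the default `width=64` it is the 64 bit addition done as two 32 bit pieces. The second step cannot start until the carry from the first is known, which is why the quote describes the two halves as skewed rather than simultaneous.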